Log anonymization

One of the features of Collabora Online is a way to ensure that log files (of reasonable verbosity) don’t include any personally identifying information. This allows customers and partners to share logs with confidence when debugging unusual problems. The anonymization is done with a one-way hash, which can be salted with a random salt to defeat dictionary attacks.
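To make the idea of a salted one-way hash concrete, here is a minimal Python sketch using 64-bit FNV-1a. It is an illustration only, not the Collabora Online code: the function names, the way the salt is mixed in (prepended to the input bytes) and the `#...#` token format are all assumptions made for this example.

```python
import secrets

# Standard 64-bit FNV-1a parameters.
FNV64_OFFSET_BASIS = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3
MASK64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data: bytes) -> int:
    """Plain 64-bit FNV-1a: XOR each byte in, then multiply by the prime."""
    h = FNV64_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV64_PRIME) & MASK64
    return h


def anonymize_fnv(text: str, salt: bytes) -> str:
    """Hypothetical helper: hash salt + text and wrap the result in '#'.

    Prepending the salt is an assumption of this sketch; it means the same
    name maps to a different token on every installation.
    """
    return "#" + format(fnv1a_64(salt + text.encode("utf-8")), "016x") + "#"


if __name__ == "__main__":
    salt = secrets.token_bytes(16)  # random per-installation salt
    print(anonymize_fnv("alice@example.com", salt))
```

The same input and salt always give the same token, so a given user or file can still be followed through a log, but without the salt an attacker cannot simply hash a list of likely names and compare.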

Since there is typically not a lot of personal data that goes through this process, we already store the anonymized versions in a hash-map so that repeated values are not hashed again. Previously we used the fast and fairly good FNV-1a algorithm, but with this optional higher-strength anonymizer commit we can now use a much stronger PBKDF2-HMAC-SHA512 hash for this data, at a smallish CPU cost when logging.
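The combination of a strong hash and a cache can be sketched roughly as follows in Python, using the standard library's `hashlib.pbkdf2_hmac`. The class name, the iteration count, the 16-byte output length and the `#...#` token format are assumptions chosen for illustration, not values taken from the Collabora Online commit.

```python
import hashlib
import secrets


class Anonymizer:
    """Hypothetical anonymizer: PBKDF2-HMAC-SHA512 plus an in-memory cache."""

    def __init__(self, salt: bytes, iterations: int = 10_000):
        self._salt = salt
        self._iterations = iterations  # illustrative value, tune for your CPU budget
        self._cache = {}  # plain text -> anonymized token

    def anonymize(self, text: str) -> str:
        # Each distinct value pays the PBKDF2 cost only once.
        cached = self._cache.get(text)
        if cached is not None:
            return cached
        digest = hashlib.pbkdf2_hmac(
            "sha512",
            text.encode("utf-8"),
            self._salt,
            self._iterations,
            dklen=16,  # truncated so tokens stay short in log lines (assumption)
        )
        token = "#" + digest.hex() + "#"
        self._cache[text] = token
        return token


if __name__ == "__main__":
    anonymizer = Anonymizer(secrets.token_bytes(16))
    print(anonymizer.anonymize("alice@example.com"))  # computed
    print(anonymizer.anonymize("alice@example.com"))  # served from the cache
```

Because the set of distinct names and paths in a log is usually small, the deliberately slow key derivation runs only a handful of times, and most log lines hit the cache.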

This means customers can be confident that when sharing their logs they are not sharing personally identifying information.
