Alex

Reputation: 1367

Simple way to detect encryption

Is there a simple and quick way to detect encrypted files? I have heard about entropy calculation, but if I calculate it for every file on a drive, it will take days to detect encryption.

Is it possible to, say, calculate some value for the first 100 or 1024 bytes and then decide? Does anyone have sources for that?

Upvotes: 4

Views: 16838

Answers (3)

Craig

Reputation: 31

I would use a cross-entropy calculation. Calculate the cross-entropy value for X bytes of known encrypted data (it should be near 1, regardless of the type of encryption, etc.). You may want to avoid file headers and footers, as these may contain unencrypted file metadata.

Calculate the entropy for a file; if it's close to 1, then it's either encrypted or /dev/random. If it's quite far away from 1, then it's likely not encrypted. I'm sure you could apply significance tests to this to get a baseline.
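As a rough illustration of the entropy check described above, here is a minimal Python sketch (not the Perl version mentioned in this answer). It computes the Shannon entropy of the byte distribution and divides by 8 so that the result falls between 0 and 1. The function names and the default `offset` and `size` values are assumptions chosen for illustration, not tested thresholds.

```python
import math
from collections import Counter

def normalized_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, scaled to the range 0..1."""
    if not data:
        return 0.0
    total = len(data)
    bits = -sum((n / total) * math.log2(n / total)
                for n in Counter(data).values())
    return bits / 8.0  # 8 bits per byte is the maximum possible

def sample_entropy(path: str, offset: int = 4096, size: int = 65536) -> float:
    """Entropy of one sample read past the header.

    offset and size are illustrative guesses; a file shorter than
    offset yields an empty sample and therefore 0.0.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        return normalized_entropy(f.read(size))
```

Note that a very small sample cannot reach 1: with n bytes the value is capped at log2(n)/8, so 100 bytes can never score above roughly 0.83. Even 1024 random bytes will typically land a little below 1, which is something to keep in mind when picking a cutoff.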

It's about 10 lines of Perl; I can't remember which library is used, although this may be useful: http://dingo.sbs.arizona.edu/~hammond/ling696f-sp03/addonecross.txt

Upvotes: 3

schnaader

Reputation: 49729

One of the advantages of good encryption is that you can design it so that it can't be detected - see the Wikipedia article on deniable encryption for example.

Every statistical approach to detecting encryption will give you "false alarms", such as compressed data or random-looking data in general.
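To make the false-alarm point concrete, here is a small Python sketch that compares the byte entropy of some plain text with the same text after zlib compression. The vocabulary and the generated text are made up purely for the demonstration; `normalized_entropy` is a hypothetical helper, scaled so that 1 means 8 bits per byte.

```python
import math
import random
import zlib
from collections import Counter

def normalized_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, scaled to the range 0..1."""
    if not data:
        return 0.0
    total = len(data)
    bits = -sum((n / total) * math.log2(n / total)
                for n in Counter(data).values())
    return bits / 8.0

# Hypothetical stand-in for a plain text file: random picks from a tiny vocabulary.
vocab = ["file", "drive", "data", "header", "random", "value", "bytes", "text"]
rng = random.Random(0)
text = " ".join(rng.choice(vocab) for _ in range(50000)).encode("ascii")

# Nothing here is encrypted, the data is only compressed.
packed = zlib.compress(text, 9)

print("plain text:", round(normalized_entropy(text), 3))
print("compressed:", round(normalized_entropy(packed), 3))
```

The compressed output should score far higher than the plain text even though no encryption is involved, so an entropy threshold alone would likely flag it.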

Imagine I wrote a program that outputs two files: file1 contains 1024 bits of π and file2 is an encrypted version of file1. If you don't know anything about the contents of file1 or file2, there's no way to distinguish them. In fact, it's quite likely that π contains the contents of file2 somewhere!

EDIT:

By the way, it doesn't even work the other way round (detecting unencrypted files). You could write a program that transforms encrypted data into readable English text by assigning words or whole sentences to bits/bytes of it.
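A toy Python sketch of that idea: each 4-bit half of a byte is mapped to one of 16 short words, so arbitrary ciphertext turns into text made only of ordinary words, and the mapping can be reversed. The word list and function names are invented for illustration, and `os.urandom` merely stands in for real ciphertext.

```python
import os

# Hypothetical 16-word table: one word per 4-bit value.
WORDS = ["the", "cat", "dog", "ran", "and", "sat", "big", "red",
         "sun", "was", "hot", "you", "see", "her", "his", "one"]
INDEX = {word: i for i, word in enumerate(WORDS)}

def to_words(data: bytes) -> str:
    """Encode each byte as two words (high nibble first)."""
    out = []
    for b in data:
        out.append(WORDS[b >> 4])
        out.append(WORDS[b & 0x0F])
    return " ".join(out)

def from_words(text: str) -> bytes:
    """Reverse to_words: pair up the words and rebuild each byte."""
    nibbles = [INDEX[w] for w in text.split()]
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))

ciphertext = os.urandom(16)  # stand-in for encrypted data
disguised = to_words(ciphertext)
print(disguised)
print(from_words(disguised) == ciphertext)
```

The output is not grammatical English, but it uses only lowercase letters and spaces, so a byte-entropy check would see it as ordinary text.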

Upvotes: 0

Thomas M. DuBuisson

Reputation: 64740

You could just make a system that recognizes particular common forms of encrypted files (e.g. recognize encrypted zip, rar, vim, gpg, ssl, ecryptfs, and truecrypt). Any attempt to determine encryption based on the raw data will quickly run into a steganography discussion.
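A minimal Python sketch of the signature approach, covering only a few markers I am reasonably sure of: the `VimCrypt~` prefix that Vim writes, the `Salted__` prefix produced by `openssl enc` with a salt, the ASCII-armor line for PGP messages, and the encryption bit in a ZIP local file header. The function name and return labels are made up, and the other formats listed above are left out rather than guessed at.

```python
import struct
from typing import Optional

def detect_known_encryption(head: bytes) -> Optional[str]:
    """Guess a known encrypted format from the first bytes of a file, else None."""
    if head.startswith(b"VimCrypt~"):
        return "vim"
    if head.startswith(b"Salted__"):
        return "openssl enc (salted)"
    if head.startswith(b"-----BEGIN PGP MESSAGE-----"):
        # Armored PGP messages are often, but not always, encrypted.
        return "pgp message (ascii armor)"
    if head.startswith(b"PK\x03\x04") and len(head) >= 8:
        # General purpose bit flag sits at offset 6; bit 0 marks an encrypted entry.
        (flags,) = struct.unpack_from("<H", head, 6)
        if flags & 0x0001:
            return "zip (encrypted entry)"
    return None

def check_file(path: str) -> Optional[str]:
    """Read only the first 64 bytes, so a whole drive can be scanned quickly."""
    with open(path, "rb") as f:
        return detect_known_encryption(f.read(64))
```

This only inspects the first ZIP entry and will miss anything without a fixed header. TrueCrypt volumes, for example, are designed to have no recognizable signature, so they would need a different heuristic.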

Upvotes: 2
