Reputation: 1340
Is there a standard algorithm for finding human-readable text strings in an arbitrary binary file?
For example, processing an executable file should return a list of function names from the import table, plus any string constants.
Presumably it would have to use a set of language-specific dictionaries and be based on statistical theory.
Upvotes: 0
Views: 102
Reputation: 5040
You could use a hidden Markov model (HMM). Build one model for binary data and one for text, each describing how likely a byte is given the preceding byte or the few preceding bytes. Given also the probability of switching from one model to the other, the Viterbi algorithm can find the most likely underlying alternation of binary and text regions.
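To make the idea concrete, here is a minimal two-state Viterbi sketch in Python. It is a simplification, not a tuned implementation: the emission models are order-0 (each byte is scored on its own, rather than conditioned on preceding bytes as described above), the text model simply favours printable ASCII, and the names `segment`, `extract_strings`, `p_switch` and `min_len`, along with all the probabilities, are illustrative assumptions. In practice you would estimate the emission and switching probabilities from sample data.

```python
import math

# Assumed emission models (order-0, for brevity):
#   binary state: every byte value equally likely
#   text state:   99% of the mass on printable ASCII plus tab/LF/CR
LOG_UNIFORM = math.log(1 / 256)

def text_log_prob(b):
    """Log-probability of byte b under the assumed text model."""
    if 32 <= b < 127 or b in (9, 10, 13):   # 98 "texty" byte values
        return math.log(0.99 / 98)
    return math.log(0.01 / 158)             # the remaining 158 values

EMIT = [
    [LOG_UNIFORM] * 256,                    # state 0: binary
    [text_log_prob(b) for b in range(256)], # state 1: text
]

def segment(data, p_switch=0.01):
    """Viterbi decoding: return one label per byte (0 = binary, 1 = text)."""
    if not data:
        return []
    log_stay = math.log(1 - p_switch)
    log_switch = math.log(p_switch)
    # Best log-score of any path ending in each state, starting 50/50.
    score = [math.log(0.5) + EMIT[s][data[0]] for s in (0, 1)]
    back = []  # back[i][s] = best predecessor of state s at position i + 1
    for b in data[1:]:
        new_score, ptr = [], []
        for s in (0, 1):
            stay = score[s] + log_stay
            switch = score[1 - s] + log_switch
            if stay >= switch:
                best, prev = stay, s
            else:
                best, prev = switch, 1 - s
            new_score.append(best + EMIT[s][b])
            ptr.append(prev)
        score = new_score
        back.append(ptr)
    # Trace the best path backwards from the better final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

def extract_strings(data, min_len=4, p_switch=0.01):
    """Return the byte runs that the decoder labels as text."""
    path = segment(data, p_switch)
    out, start = [], None
    for i, s in enumerate(path):
        if s == 1 and start is None:
            start = i
        elif s == 0 and start is not None:
            if i - start >= min_len:
                out.append(data[start:i])
            start = None
    if start is not None and len(data) - start >= min_len:
        out.append(data[start:])
    return out
```

You would call it as `extract_strings(open(path, "rb").read())`. Note the trade-off hidden in `p_switch`: with the numbers assumed here, a printable run surrounded by binary data has to be roughly ten bytes long before switching into the text state and back is worth the cost, so shorter strings are missed. Raising `p_switch`, or using a sharper text model (for instance byte-pair statistics trained on real text), lowers that threshold.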
Upvotes: 1