Reputation: 847
Given two strings, I want to find all common substrings of a specified length, but allowing one character to be different.
For example, if s1 is 'ATCAGC', s2 is 'ATAATCGAC', and the specified length is 3, then I'd want output along these lines:
ATC from s1 matches ATA, ATC from s2
TCA from s1 matches TAA, TCG from s2
Upvotes: 3
Views: 563
Reputation: 4261
The first Google result for "perl hamming distance" is a PerlMonks thread that mentions Text::LevenshteinXS, several typical implementations, and a cute XOR trick:
sub hd{ length( $_[ 0 ] ) - ( ( $_[ 0 ] ^ $_[ 1 ] ) =~ tr[\0][\0] ) }
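If Levenshtein distance or Hamming distance isn't familiar, skim the Wikipedia article on string metrics. Since your substrings all have the same length and you only allow substitutions, Hamming distance is the one that applies here.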
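To connect that XOR trick back to the question, here is a rough sketch of a brute-force approach: compare every length-k substring of s1 against every length-k substring of s2 and keep the pairs with at most one mismatch. The helper name near_matches and its parameters are my own, not from the PerlMonks thread, and it assumes plain byte strings (such as DNA letters) and a Perl where ^ on two strings is the bitwise string operator, i.e. the 'bitwise' feature is not enabled.

```perl
use strict;
use warnings;

# Hamming distance between two equal-length strings via the XOR trick:
# XOR gives a NUL byte wherever the characters agree, so
# length minus the NUL count is the number of mismatches.
sub hd {
    my ($x, $y) = @_;
    my $xor = $x ^ $y;
    return length($x) - ($xor =~ tr/\0//);
}

# Hypothetical helper: for each length-$k substring of $s1, collect the
# length-$k substrings of $s2 that differ in at most $max_diff positions.
# Returns a list of array refs: [ substring_of_s1, hit1, hit2, ... ].
sub near_matches {
    my ($s1, $s2, $k, $max_diff) = @_;
    my @result;
    for my $i (0 .. length($s1) - $k) {
        my $sub1 = substr($s1, $i, $k);
        my @hits;
        for my $j (0 .. length($s2) - $k) {
            my $sub2 = substr($s2, $j, $k);
            push @hits, $sub2 if hd($sub1, $sub2) <= $max_diff;
        }
        push @result, [$sub1, @hits] if @hits;
    }
    return @result;
}

# Inputs taken from the question.
for my $row (near_matches('ATCAGC', 'ATAATCGAC', 3, 1)) {
    my ($sub1, @hits) = @$row;
    print "$sub1 from s1 matches ", join(', ', @hits), " from s2\n";
}
```

With the question's inputs this should print the two lines from the question, plus a third one (AGC matches ATC), since those differ in only one position as well. It does O(n*m) comparisons, which is fine for short strings; for long sequences you would probably want something smarter.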
Upvotes: 3