smonsays

Reputation: 420

Calculate Hamming distance between strings of variable length in Matlab

I would like to calculate the Hamming distance between two strings of variable length in Matlab. For fixed length strings the following syntax solves my problem:

str1 = 'abcde';
str2 = 'abedc';

sum(str1 ~= str2)

ans = 2

How can I do this efficiently for variable length strings?

Thank you!

EDIT: Because this is a legitimate question: for every character by which one string is longer than the other, the Hamming distance should be incremented by one. So, for example, for

str1 = 'abcdef';
str2 = 'abc';

The answer should be 3.

Upvotes: 1

Views: 3434

Answers (2)

user2999345

Reputation: 4195

Although @LuisMendo's answer works for the given example (which might be good enough for you), it does not find the smallest distance when the shorter string matches the longer one at an offset, as in this case:

str1 = 'abcdef';
str2 = 'bcd';
clear t
t(1,:) = str1+1; % +1 to make sure there are no zeros
t(2,1:numel(str2)) = str2+1; % if needed, this right-pads with zero or causes t to grow
result = sum(t(1,:)~=t(2,:)) % result = 6

To handle the case where the shorter string appears in the middle of the longer one, you should check every possible offset and take the minimum. One way to do that is:

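str1 = 'bcd';      % shorter string
str2 = 'abcdef';   % longer string (this code assumes len1 <= len2)
len1 = length(str1);
len2 = length(str2);
n = len2 - len1;   % number of possible shifts of str1 along str2, minus one
str1rep_temp = repmat(str1,[1,n+1]);
str1rep = -ones(n+1,len2);   % -1 never equals a character code, so padding always counts as a mismatch
str1rows = repmat(1:n+1,[len1,1]);
str1cols = bsxfun(@plus,(1:len1)',0:n);
str1idxs = sub2ind(size(str1rep),str1rows(:),str1cols(:));
str1rep(str1idxs) = str1rep_temp;   % row k holds str1 shifted right by k-1 positions
str2rep = double(repmat(str2,[n+1, 1]));
res = min(sum(str1rep ~= str2rep,2)); % res = 3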
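
If readability matters more than vectorization, the same "try every offset" idea can be sketched with a plain loop. This is only an illustration, not a tuned implementation: the names shortStr, longStr and best are mine, and the swap at the top is an added assumption so that the inputs can be given in either order.

```matlab
str1 = 'bcd';
str2 = 'abcdef';

% Assumption: order the inputs so that shortStr is never longer than longStr.
if numel(str1) <= numel(str2)
    shortStr = str1; longStr = str2;
else
    shortStr = str2; longStr = str1;
end

n = numel(longStr) - numel(shortStr);   % characters of longStr left uncovered at any offset
best = inf;
for k = 0:n
    window = longStr(k+1:k+numel(shortStr));   % part of longStr aligned with shortStr
    d = sum(shortStr ~= window) + n;           % mismatches in the window plus uncovered characters
    best = min(best, d);
end
res = best   % res = 3
```

The loop runs n+1 times and each pass compares numel(shortStr) characters, so it does the same amount of work as the matrix version without building the (n+1)-by-len2 arrays.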

Upvotes: 1

Luis Mendo

Reputation: 112659

Here's a way to do it:

str1 = 'abcdef';
str2 = 'abc';
clear t
t(1,:) = str1+1; % +1 to make sure there are no zeros
t(2,1:numel(str2)) = str2+1; % if needed, this right-pads with zero or causes t to grow
result = sum(t(1,:)~=t(2,:));
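
As a sketch of an equivalent formulation that avoids building the padded matrix: compare only the overlapping prefix, then add the length difference, since each unmatched trailing character counts as one mismatch under the definition in the question. The variable names n, mismatches and extra are illustrative.

```matlab
str1 = 'abcdef';
str2 = 'abc';

n = min(numel(str1), numel(str2));          % length of the overlapping prefix
mismatches = sum(str1(1:n) ~= str2(1:n));   % positions that differ within the overlap
extra = abs(numel(str1) - numel(str2));     % unmatched trailing characters of the longer string
result = mismatches + extra                 % result = 3
```

This should give the same result as the padded comparison for strings given in either order, because the zero padding there always differs from the shifted character codes.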

Upvotes: 2
