Reputation:
I have one string and a cell array of strings.
str = 'actaz';
dic = {'aaccttzz', 'ac', 'zt', 'ctu', 'bdu', 'zac', 'zaz', 'aac'};
I want to obtain the indices of the strings in dic that can be formed from the characters of str:
idx = [2, 3, 6, 8];
I have written a very long piece of code that is essentially brute force and runs very slowly. Is there a simple way to do this quickly?
NB: I have just edited the question to make clear that characters can be repeated n times if they appear n times in str. Thanks Shai for pointing it out.
Upvotes: 2
Views: 1819
Reputation: 9864
You can sort the characters of each string and then match them with a regular expression. For your example, the pattern will be ^a{0,2}c{0,1}t{0,1}z{0,1}$:
u = unique(str);
t = ['^' sprintf('%c{0,%d}', [u; histc(str,u)]) '$'];
s = cellfun(@sort, dic, 'uni', 0);
idx = find(~cellfun('isempty', regexp(s, t)));
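To make the steps easier to follow, here is a minimal sketch of the same idea with the intermediate values spelled out. It counts occurrences with arrayfun instead of histc, and it passes each character through regexptranslate('escape', ...) on the assumption that str might contain regexp metacharacters such as '.' or '*'. The names n, parts, pattern and sorted are only illustrative.

```matlab
str = 'actaz';
dic = {'aaccttzz', 'ac', 'zt', 'ctu', 'bdu', 'zac', 'zaz', 'aac'};

u = unique(str);                        % sorted distinct characters: 'actz'
n = arrayfun(@(c) sum(str == c), u);    % occurrences of each one: [2 1 1 1]

% One 'x{0,k}' group per distinct character, with metacharacters escaped.
parts = arrayfun(@(c, k) sprintf('%s{0,%d}', regexptranslate('escape', c), k), ...
                 u, n, 'UniformOutput', false);
pattern = ['^' parts{:} '$'];           % '^a{0,2}c{0,1}t{0,1}z{0,1}$'

% Sorting each word puts its characters in the same order as the groups.
sorted = cellfun(@sort, dic, 'UniformOutput', false);
idx = find(~cellfun('isempty', regexp(sorted, pattern, 'once')));
```

Sorting matters because the pattern lists the characters in sorted order; an unsorted word such as 'zac' only matches after it becomes 'acz'.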
Upvotes: 1
Reputation: 47784
I came up with this:
>> g=@(x,y) sum(x==y) <= sum(str==y);
>> h=@(t)sum(arrayfun(@(x)g(t,x),t))==length(t);
>> f=cellfun(@(x)h(x),dic);
>> find(f)
ans =
     2     3     6     8
Here str and dic are the variables defined in the question.
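The same count comparison can probably be done once per word instead of once per character. The following is only a sketch: it assumes every character code lies between 1 and 256 (plain ASCII text, for instance), and the helper names counts, avail and fits are hypothetical.

```matlab
str = 'actaz';
dic = {'aaccttzz', 'ac', 'zt', 'ctu', 'bdu', 'zac', 'zaz', 'aac'};

% Histogram of character codes; assumes all codes are in the range 1..256.
counts = @(s) accumarray(double(s(:)), 1, [256 1]);

avail = counts(str);                                  % what str can supply
fits  = cellfun(@(w) all(counts(w) <= avail), dic);   % word needs no more than that
idx   = find(fits);
```

Each word is reduced to a fixed-length count vector, so the check is a single vector comparison per word rather than a nested loop over its characters.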
Upvotes: 1