user2875617

Matlab. Find the indices of a cell array of strings with characters all contained in a given string (without repetition)

I have one string and a cell array of strings.

str = 'actaz';
dic = {'aaccttzz', 'ac', 'zt', 'ctu', 'bdu', 'zac', 'zaz', 'aac'};

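I want the indices of the elements of dic whose characters are all contained in str, where each character may be used at most as many times as it appears in str: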

idx = [2, 3, 6, 8];

I have written some very long code that:

  1. finds the elements whose length is not greater than length(str);
  2. removes the elements containing characters that do not appear in str;
  3. finally, for each remaining element, checks the characters one by one.

Essentially, it is an almost brute-force approach and runs very slowly. Is there a simple way to do this fast?
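
For reference, the three steps listed above could be sketched roughly as follows. This is only an illustration of the brute-force idea, not the actual code from the question; the variable names pool, ok and w are made up for the example.

```matlab
str = 'actaz';
dic = {'aaccttzz', 'ac', 'zt', 'ctu', 'bdu', 'zac', 'zaz', 'aac'};

idx = [];
for k = 1:numel(dic)
    w = dic{k};
    % Step 1: skip elements longer than str
    if length(w) > length(str), continue; end
    % Step 2: skip elements containing characters that are not in str
    if ~all(ismember(w, str)), continue; end
    % Step 3: use up the characters of str one by one
    pool = str;   % characters still available
    ok = true;
    for c = w
        p = find(pool == c, 1);
        if isempty(p), ok = false; break; end
        pool(p) = [];   % each character of str can be used only once
    end
    if ok, idx(end+1) = k; end
end
disp(idx)
```

For the example data this gives idx = [2 3 6 8].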

NB: I have just edited the question to make clear that a character can be repeated n times if it appears n times in str. Thanks to Shai for pointing this out.

Upvotes: 2

Views: 1819

Answers (2)

Mohsen Nosratinia

Reputation: 9864

You can sort the characters of each string and then match the sorted strings against a regular expression. For your example, the pattern is ^a{0,2}c{0,1}t{0,1}z{0,1}$:

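u = unique(str);                                        % distinct characters of str, sorted
t = ['^' sprintf('%c{0,%d}', [u; histc(str,u)]) '$'];   % one 'char{0,count}' group per character
s = cellfun(@sort, dic, 'UniformOutput', false);        % sort the characters of each element
idx = find(~cellfun('isempty', regexp(s, t)));          % elements whose sorted form matches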
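
To make the intermediate steps visible, here is a more explicit sketch of the same idea. It is a variant, not the code above: the counts are computed with a plain sum instead of histc, and each character is passed through regexptranslate so that characters with a special meaning in regular expressions (for example '.' or '*') would be matched literally. The names counts, parts, pat and sorted are just illustrative.

```matlab
str = 'actaz';
dic = {'aaccttzz', 'ac', 'zt', 'ctu', 'bdu', 'zac', 'zaz', 'aac'};

u = unique(str);                              % 'actz'
counts = arrayfun(@(c) sum(str == c), u);     % [2 1 1 1]

% Build one 'char{0,count}' group per distinct character, escaping the character
parts = arrayfun(@(c, n) sprintf('%s{0,%d}', regexptranslate('escape', c), n), ...
                 u, counts, 'UniformOutput', false);
pat = ['^' parts{:} '$'];                     % '^a{0,2}c{0,1}t{0,1}z{0,1}$'

% Sort each element so its characters appear in the same order as in the pattern
sorted = cellfun(@sort, dic, 'UniformOutput', false);
idx = find(~cellfun(@isempty, regexp(sorted, pat, 'once')));
disp(idx)
```

Sorting matters here: both unique and sort order characters by character code, so the sorted element and the pattern list the characters in the same order.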

Upvotes: 1

P0W

Reputation: 47784

I came up with this:

>> g=@(x,y) sum(x==y) <= sum(str==y); 
>> h=@(t)sum(arrayfun(@(x)g(t,x),t))==length(t);
>> f=cellfun(@(x)h(x),dic);
>> find(f)

ans =

     2     3     6     8

  • g and h: check that each letter of the dic element occurs no more often in that element than it does in str.
  • f: applies h (and therefore g) to each element of dic.
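
As a side note, the same count comparison can be written so that each distinct letter of an element is checked only once. This is a minimal sketch of a variant, with countOf and fits as made-up helper names:

```matlab
str = 'actaz';
dic = {'aaccttzz', 'ac', 'zt', 'ctu', 'bdu', 'zac', 'zaz', 'aac'};

% Hypothetical helper: number of occurrences of character c in string s
countOf = @(s, c) sum(s == c);

% An element fits if none of its distinct letters occurs more often than in str
fits = @(w) all(arrayfun(@(c) countOf(w, c) <= countOf(str, c), unique(w)));

idx = find(cellfun(fits, dic));
disp(idx)
```

Note that str is captured when fits is created, so the handle has to be redefined if str changes.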

Upvotes: 1
