Reputation: 1234
I am trying to use the MATLAB Coder toolbox to convert the following code into C:
function [idx] = list_iterator(compare, list)
idx = nan(length(list));
for j = 1:length(list)
idx(j) = strcmp(compare, list{j});
end
list is an N x 1 cell array of strings and compare is a string. The code basically compares each element of list to compare and returns 1 if the two are the same and 0 otherwise. (I'm doing this to speed up execution because N can be quite large - around 10 to 20 million elements.)
When I run codegen list_iterator in the Command Window, I get the following error:
Type of input argument 'compare' for function 'list_iterator' not specified. Use -args or preconditioning statements to specify input types.
More information
Error in ==> list_iterator Line: 1 Column: 18
Code generation failed: View Error Report
Error using codegen
I know I'm supposed to specify the types of the inputs when using codegen, but I'm not sure how to do this for a cell array of strings, the elements of which can be of different lengths. The string compare can also have different lengths depending on the function call.
Upvotes: 1
Views: 599
Reputation: 1928
You can use the function coder.typeof to specify variable-size inputs to codegen. From what I've understood of your example, something like:
>> compare = coder.typeof('a',[1,Inf])
compare =
coder.PrimitiveType
1×:inf char
>> list = coder.typeof({compare}, [Inf,1])
list =
coder.CellType
:inf×1 homogeneous cell
base: 1×:inf char
>> codegen list_iterator.m -args {compare, list}
seems appropriate.
The MATLAB Coder App also provides a graphical means of specifying these complicated inputs. From there you can export the setup to a build script to see the corresponding command-line APIs.
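For reference, a hand-written build script along these lines might look like the sketch below. It is not the App's exported output; it assumes list_iterator.m is on the path, that MATLAB Coder and a supported C compiler are installed, and the names compareType and listType are just illustrative.

```matlab
% Sketch of a command-line build script (assumes list_iterator.m is on the path).

% 1-by-N char row vector, N unbounded
compareType = coder.typeof('a', [1 Inf]);

% N-by-1 homogeneous cell array whose elements are 1-by-:inf char
listType = coder.typeof({compareType}, [Inf 1]);

% Generate a MEX function; the default output name is list_iterator_mex
codegen list_iterator -args {compareType, listType}

% Call the generated MEX with ordinary MATLAB values
idx = list_iterator_mex('abcd', {'abcd'; 'a'; 'b'});
```

Keeping the type definitions in variables makes it easy to reuse them if more functions are later moved into the generated code.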
Note that when I tried this example with codegen, the resulting MEX was not faster than MATLAB. One reason this can happen is that the body of the function is fairly simple but a large amount of data is transferred from MATLAB to the generated code and back. As a result, this data transfer overhead can dominate the execution time. Moving more of your code to generated MEX may improve this.
Thinking about performance independently of codegen, should you use idx = false(length(list),1); rather than idx = nan(length(list));? The former is an Nx1 logical vector, while the latter is an NxN double matrix of which list_iterator writes only the first column.
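To make the difference concrete, here is a small sketch comparing the two preallocations. The value N = 4 is only illustrative, and the byte counts assume the usual 8 bytes per double and 1 byte per logical.

```matlab
% Compare the two preallocations for a small, illustrative N.
N = 4;
a = nan(N);        % N-by-N double matrix
b = false(N, 1);   % N-by-1 logical column vector

sizeA = size(a);   % [4 4]
sizeB = size(b);   % [4 1]

% Storage estimate: 8 bytes per double, 1 byte per logical.
bytesA = 8 * N^2;
bytesB = N;

% The same formulas at the size mentioned in the question (10 million).
Nbig = 1e7;
bytesBigA = 8 * Nbig^2;   % 8e14 bytes, roughly 800 TB
bytesBigB = Nbig;         % 1e7 bytes, roughly 10 MB
```

At the sizes in the question, the NxN allocation would not fit in memory at all, so the vector form matters for more than speed.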
With your original code and the inputs compareIn = 'abcd'; listIn = repmat({'abcd';'a';'b'},1000,1); timeit gives:
>> timeit(@()list_iterator(compareIn, listIn))
ans =
0.0257
Modifying your code to return a vector scales that down:
function [idx] = list_iterator(compare, list)
idx = false(length(list),1);
for j = 1:length(list)
idx(j) = strcmp(compare, list{j});
end
>> timeit(@()list_iterator(compareIn, listIn))
ans =
0.0014
You can also call strcmp with a cell and char array, which makes the code faster still:
function [idx] = list_iterator(compare, list)
idx = strcmp(compare, list);
>> timeit(@()list_iterator(compareIn, listIn))
ans =
2.1695e-05
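As a quick sanity check that the vectorized call matches the loop, something like the following sketch can be run on a small list. The sample data and the variable names idxLoop, idxVec, and matches are made up for illustration.

```matlab
% Small illustrative inputs
list = {'abcd'; 'a'; 'b'; 'abcd'};
compare = 'abcd';

% Loop version: one strcmp call per element
idxLoop = false(length(list), 1);
for j = 1:length(list)
    idxLoop(j) = strcmp(compare, list{j});
end

% Vectorized version: strcmp accepts a char vector and a cell array
% and returns a logical array the same size as the cell array
idxVec = strcmp(compare, list);

sameResult = isequal(idxLoop, idxVec);   % true when both versions agree
matches = find(idxVec);                  % positions of the matching entries
```

Here idxVec is a 4x1 logical vector that is true in rows 1 and 4, the two entries equal to 'abcd'.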
Upvotes: 2