Reputation: 4953
A while ago I stumbled upon this document. It assesses the performance of several string concatenation methods in Python. Here are 4 of the 6 methods that it compares:
def method1():
    out_str = ''
    for num in xrange(loop_count):
        out_str += `num`
    return out_str

def method4():
    str_list = []
    for num in xrange(loop_count):
        str_list.append(`num`)
    return ''.join(str_list)

def method5():
    from cStringIO import StringIO
    file_str = StringIO()
    for num in xrange(loop_count):
        file_str.write(`num`)
    return file_str.getvalue()

def method6():
    return ''.join([`num` for num in xrange(loop_count)])
The document draws the following conclusion from its results:
I would use Method 6 in most real programs. It's fast and it's easy to understand. It does require that you be able to write a single expression that returns each of the values to append. Sometimes that's just not convenient to do - for example when there are several different chunks of code that are generating output. In those cases you can pick between Method 4 and Method 5.
After reading this document I realized that I was not aware of methods 5 and 6. For the most part, I now prefer to use method 5 since it allows me to write to a string in the same way as I would to a file.
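The snippets above are Python 2 (backtick repr, xrange, cStringIO), none of which exist in Python 3. For anyone trying them today, a rough Python 3 sketch of the same four methods might look like the following. Passing loop_count as a parameter, using str(num) in place of the backticks, and using io.StringIO in place of cStringIO are my substitutions, not part of the original document:

```python
from io import StringIO  # assumption: stands in for Python 2's cStringIO


def method1(loop_count):
    # Naive appending: extend the string on every iteration.
    out_str = ''
    for num in range(loop_count):
        out_str += str(num)  # str(num) replaces the Python 2 backticks
    return out_str


def method4(loop_count):
    # Collect the pieces in a list, then join once at the end.
    str_list = []
    for num in range(loop_count):
        str_list.append(str(num))
    return ''.join(str_list)


def method5(loop_count):
    # Write to an in-memory text buffer as if it were a file.
    file_str = StringIO()
    for num in range(loop_count):
        file_str.write(str(num))
    return file_str.getvalue()


def method6(loop_count):
    # Join over a list comprehension in a single expression.
    return ''.join([str(num) for num in range(loop_count)])
```

All four functions build the same string, so method1(5), method4(5), method5(5) and method6(5) each return '01234'. Only their running time differs, which is what the document measures.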
My question is: what are the different techniques for string concatenation in MATLAB? I rarely deal with strings in MATLAB, but I now have a problem that requires me to build a string. One solution I was considering was to write to a temporary file and read the file back once writing is done. Before doing that, I decided to ask whether there are better options. For now, here is a naive appending method in MATLAB:
function out_str = method1(loop_count)
    out_str = '';
    for num = 1:loop_count
        out_str = [out_str num2str(num)]; %#ok<AGROW>
    end
end
Are there methods in MATLAB similar to methods 4, 5, and 6 that we can use for an efficiency comparison?
EDIT:
Here is a method similar to method 5 in Python (writing to a file):
function out_str = method2(loop_count)
    fid = fopen('._tmpfile.tmp', 'w');
    for num = 1:loop_count
        fprintf(fid, '%d', num);
    end
    fclose(fid);
    out_str = fileread('._tmpfile.tmp');
end
And this is a simple test:
>> tic; tmp1 = method1(100000); toc
Elapsed time is 13.144053 seconds.
>> tic; tmp2 = method2(100000); toc
Elapsed time is 2.358082 seconds.
Upvotes: 4
Views: 1913
Reputation: 21563
In general, there is a fast way to grow a vector by concatenation that is not mentioned very often. Here is a clear example (it concatenates numbers, but MATLAB treats characters as numbers as well):
% What you will typically find in sample code
loop_count = 1e4;
out_str = [];
tic
for num = 1:loop_count
    out_str = [out_str num]; %#ok<AGROW>
end
toc

% What typically runs faster
out_str = [];
tic
for num = 1:loop_count
    out_str(end+1) = num;
end
toc
Will give
Elapsed time is 0.077540 seconds.
Elapsed time is 0.004776 seconds.
Of course, the game changes if you already know everything you are going to concatenate before you start. Suppose you want to concatenate the string representations of numbers in a vector:
% Vectorized code typically runs fastest
v = 1:loop_count;
M = num2str(v);
tic
M = M(M ~= ' ');  % drop the padding spaces; note that only this step is timed
toc
Will give
Elapsed time is 0.001903 seconds.
Upvotes: 2
Reputation: 732
Since MATLAB prefers vectorized operations on arrays and is inefficient when using for loops, the best-performing general solution would be to create a cell array with all your strings and combine them using [str_array{:}], or join them using strjoin or sprintf (see below), depending on your needs.
For certain operations, like creating a comma-separated string from an array of numbers, there are more efficient solutions, such as
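numeric_array = rand(1, 100000);
% note: rand yields non-integers, for which sprintf overrides %d and prints in %e notation
out_str = sprintf('%d,', numeric_array);
out_str = out_str(1:end-1);  % strip the trailing comma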
since it performs both string conversion and concatenation at once.
As a side note, out_str = sprintf('%s ', str_array{:}); out_str = out_str(1:end-1); is about ten times faster than strjoin(str_array) on my computer with MATLAB R2013a.
Upvotes: 2