jmlopez

Reputation: 4953

Efficient String Concatenation in Matlab

A while ago I stumbled upon this document. It assesses the performance of several string concatenation methods in Python. Here are 4 of the 6 methods that it compares:

Python String Concatenation Methods

Method 1: Naive appending

def method1():
    out_str = ''
    for num in xrange(loop_count):
        out_str += `num`
    return out_str

Method 4: Build a list of strings, then join it

def method4():
    str_list = []
    for num in xrange(loop_count):
        str_list.append(`num`)
    return ''.join(str_list)

Method 5: Write to a pseudo file

def method5():
    from cStringIO import StringIO
    file_str = StringIO()
    for num in xrange(loop_count):
        file_str.write(`num`)
    return file_str.getvalue()

Method 6: List comprehensions

def method6():
    return ''.join([`num` for num in xrange(loop_count)])

The conclusions from the results go as follows:

I would use Method 6 in most real programs. It's fast and it's easy to understand. It does require that you be able to write a single expression that returns each of the values to append. Sometimes that's just not convenient to do - for example when there are several different chunks of code that are generating output. In those cases you can pick between Method 4 and Method 5.

After reading this document I realized that I was not aware of methods 5 and 6. For the most part, I now prefer to use method 5 since it allows me to write to a string in the same way as I would to a file.

My question is the following: what are the different techniques for string concatenation in Matlab? I hardly deal with strings in Matlab, but I have come across a problem that requires me to build a string. One solution I was thinking of was to write to a temporary file and read the file back once this is done. But before doing this I decided to ask and see if there are better options. For now, here is a naive appending method in Matlab:

Matlab String Concatenation Methods

Method 1: Naive appending

function out_str = method1(loop_count)
    out_str = '';
    for num=1:loop_count
       out_str = [out_str num2str(num)]; %#ok<AGROW>
    end
end

Are there methods in Matlab similar to methods 4, 5, and 6 that we can use for an efficiency comparison?

EDIT:

Here is a method similar to method 5 in Python (writing to a file):

function out_str = method2(loop_count)
    fid = fopen('._tmpfile.tmp', 'w');
    for num=1:loop_count
        fprintf(fid, '%d', num);
    end
    fclose(fid);
    out_str = fileread('._tmpfile.tmp');
end

And this is a simple test:

>> tic; tmp1 = method1(100000); toc
Elapsed time is 13.144053 seconds.
>> tic; tmp2 = method2(100000); toc
Elapsed time is 2.358082 seconds.

Upvotes: 4

Views: 1913

Answers (2)

Dennis Jaheruddin

Reputation: 21563

In general, there is a fast way to grow a vector that is not mentioned very often. Here is a clear example (it concatenates numbers, but Matlab treats characters as numbers as well):

%What you will typically find in sample code
loop_count = 1e4;
out_str = [];
tic
for num=1:loop_count
    out_str = [out_str num]; %#ok<AGROW>
end
toc

% What typically runs faster
out_str = [];
tic
for num=1:loop_count
    out_str(end+1) = num; 
end
toc

Will give

Elapsed time is 0.077540 seconds.
Elapsed time is 0.004776 seconds.
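
The same end-based indexing can be applied to character data. The following is a minimal sketch, assuming the pieces come from num2str as in the question; the function name method_end_append is made up for illustration, and no timings are claimed for it:

```matlab
function out_str = method_end_append(loop_count)
    % Hypothetical helper: grows a char row vector by indexed assignment
    % instead of rebuilding it with [out_str piece] on every iteration.
    out_str = '';
    for num = 1:loop_count
        piece = num2str(num);
        % Write the new characters just past the current end of the string.
        out_str(end+1:end+numel(piece)) = piece; %#ok<AGROW>
    end
end
```

For example, method_end_append(12) returns '123456789101112'.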

Of course, the game changes if you already know everything you are going to concatenate before you start. Suppose you want to concatenate the string representations of numbers in a vector:

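% Vectorized code typically runs fastest
v = 1:loop_count;
M = num2str(v);   % one char row, numbers separated by spaces
tic               % note: only the space removal is timed
M = M(M ~= ' ');  % keep only the non-space characters
toc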

Will give

Elapsed time is 0.001903 seconds.
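
If the padding is not needed in the first place, sprintf can produce the digits directly, because it reapplies its format to every element of the input. This is a minimal sketch, assuming integer-valued input; the function name method_sprintf is made up for illustration and was not part of the timings above:

```matlab
function out_str = method_sprintf(loop_count)
    % Hypothetical helper: sprintf cycles through all elements of the
    % vector, so the digits come out back to back with no separators.
    out_str = sprintf('%d', 1:loop_count);
end
```

For example, method_sprintf(12) returns '123456789101112'.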

Upvotes: 2

erikced

Reputation: 732

Since Matlab is optimized for vectorized operations on arrays and tends to be slow when using for loops, the best performing general solution would be to create a cell array with all your strings and combine them using [str_array{:}], or join them using strjoin or sprintf (see below), depending on your needs.
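
The cell array approach could look like the sketch below. It assumes the pieces are generated one at a time in a loop; the function name method_cell and the variable parts are made up for illustration:

```matlab
function out_str = method_cell(loop_count)
    % Hypothetical helper: collect the pieces in a preallocated cell
    % array and concatenate them once at the end.
    parts = cell(1, loop_count);
    for num = 1:loop_count
        parts{num} = num2str(num);
    end
    % parts{:} expands to a comma-separated list, so this is
    % equivalent to [parts{1}, parts{2}, ...].
    out_str = [parts{:}];
end
```

This plays the role of method 4 in the question (build a list of strings, then join it). For example, method_cell(12) returns '123456789101112'.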

For certain operations, like creating a comma-separated string from an array of numbers, there are more efficient solutions such as

numeric_array = randi(100, 1, 100000);  % integer values, so that '%d' prints plain digits
out_str = sprintf('%d,', numeric_array);
out_str = out_str(1:end-1);

since it performs both string conversion and concatenation at once.

As a side note, out_str = sprintf('%s ', str_array{:}); out_str = out_str(1:end-1) is about ten times faster than strjoin(str_array, ' ') on my computer with Matlab R2013a.

Upvotes: 2
