Reputation: 59
I have a series of lines that I read from a file (over 2700) of this type:
A = '1; 23245675; -234567; 123456; ...; 0'
A is a string with ; as the delimiter for data.
To split the string I first used the strsplit function, but it was too slow. Then I used regexp like this:
regexp(A,';','split')
Is there an even faster function than regexp?
Upvotes: 1
Views: 1505
Reputation: 10440
Being a built-in function [1], textscan is probably the fastest option:
result = textscan(A,'%f','Delimiter',';'); % for a char vector A; use A{k} for the k-th line of a cell array
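To make the return value of that call concrete, here is a minimal sketch on a short sample line (the elided part of the question's string is left out, and the variable names are only illustrative). Unlike regexp and strsplit, which return pieces of text, textscan with '%f' converts the fields to numbers in the same pass.

```matlab
% Sample line in the question's format (shortened; values are illustrative)
A = '1; 23245675; -234567; 123456; 0';

% textscan returns a 1-by-1 cell; its content is a numeric column vector
result = textscan(A, '%f', 'Delimiter', ';');
values = result{1};

% For comparison, regexp returns the fields as text, leading spaces included
parts = regexp(A, ';', 'split');

disp(class(values));   % type of the parsed values
disp(numel(values));   % number of parsed fields
disp(class(parts));    % container type of the text pieces
```

If the text pieces are needed rather than numbers, the '%f' format is not the right choice and the comparison with the other functions changes accordingly.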
Here is a little benchmark to show that:
A = repmat('1; 23245675; -234567; 123456; 0',1,100000); % a long string
regexp_time = timeit(@ () regexp(A,';','split'))
strsplit_time = timeit(@ () strsplit(A,';'))
split_time = timeit(@ () split(A,';'))
textscan_time = timeit(@ () textscan(A,'%f','Delimiter',';'))
The result:
regexp_time =
0.33054
strsplit_time =
0.45939
split_time =
0.24722
textscan_time =
0.057712
textscan is the fastest, and is ~4.3 times faster than the next-fastest method (split).
It remains the fastest option regardless of the length of the string to split (note the log scale of the x-axis):
[Plot: timing of each method against string length, log-scale x-axis]
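A comparison across string lengths could be set up roughly as follows. This is only a sketch: the base string, the repeat counts, and the plot labels are assumptions, not the settings used for the measurements above.

```matlab
% Hypothetical sweep over string lengths (repeat counts chosen arbitrarily)
base = '1; 23245675; -234567; 123456; 0; ';   % trailing delimiter keeps repeated fields separate
reps = round(logspace(1, 4, 7));               % 7 repeat counts between 10 and 10000
t = zeros(numel(reps), 4);                     % columns: regexp, strsplit, split, textscan

for k = 1:numel(reps)
    A = repmat(base, 1, reps(k));
    t(k,1) = timeit(@() regexp(A, ';', 'split'));
    t(k,2) = timeit(@() strsplit(A, ';'));
    t(k,3) = timeit(@() split(A, ';'));
    t(k,4) = timeit(@() textscan(A, '%f', 'Delimiter', ';'));
end

% One curve per method, with a log-scale x-axis
semilogx(reps, t, '-o');
legend('regexp', 'strsplit', 'split', 'textscan', 'Location', 'northwest');
xlabel('Number of repetitions of the base string');
ylabel('Time (s)');
```

timeit calls each function several times and reports a typical time, which makes it less sensitive to one-off fluctuations than a single timed run.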
[1] "A built-in function is part of the MATLAB executable. MATLAB does not implement these functions in the MATLAB language. Although most built-in functions have a .m file associated with them, this file only supplies documentation for the function." (from the documentation)
Upvotes: 2
Reputation: 1556
The split functions I can think of are regexp, strsplit, and split.
I compared their performance on a large string. The results show that split is slightly faster than regexp, while strsplit is around 2 times slower than regexp.
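Before timing them, it may help to confirm that the three calls produce the same pieces. The sketch below does that on a short sample string (the sample and the variable names are only illustrative).

```matlab
A = '1; 23245675; -234567; 123456; 0';

r = regexp(A, ';', 'split');   % row cell array of char vectors
s = strsplit(A, ';');          % row cell array of char vectors
p = split(A, ';');             % column cell array of char vectors (R2016b or later)

% Same pieces from all three; only the orientation of split's output differs
same = isequal(r, s) && isequal(r, p.');
disp(same);
```

Note that strsplit collapses consecutive delimiters by default, so its output can differ from the other two when the input contains empty fields.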
Here is how I compared them:
First, create a large string A (around 16 million fields) in the format from your question.
A = '1; 23245675; -234567; 123456; 0';
for ii=1:22
A = strcat(A,A);
end
Option 1: regexp
tic
regexp(A,';','split');
toc
Elapsed time is 12.548295 seconds.
Option 2: strsplit
tic
strsplit(A,';');
toc
Elapsed time is 23.347392 seconds.
Option 3: split
tic
split(A,';');
toc
Elapsed time is 9.678433 seconds.
So split might speed things up a little, but the gain is not significant.
Upvotes: 0