phdstudent

Reputation: 1144

Saving a string array to CSV

I have a 25194081x2 matrix of strings called s1. Below is a preview of what the data looks like.

I am trying to save this matrix to a CSV file. I tried the code below, but for some reason it writes the first column of the matrix twice (side by side) instead of the two columns.

What am I doing wrong?

fileID= fopen('data.csv', 'w') ;
 fprintf(fileID, '%s,%s\n', [s1(:,1) s1(:,2)]);
 fclose(fileID)

Upvotes: 0

Views: 1960

Answers (2)

Hoki

Reputation: 11792

My version of MATLAB (R2016a) does not have the string type available yet, but your problem is one I was having regularly with cell arrays of character vectors. The trick I was using to avoid using a loop for fprintf should be applicable to you.

Let's start with sample data close to yours:

s1 = {'2F5E8693E','al1 1aj_25';
      '3F5E8693E','al1 1aj_50';
      '3F5E8693E','al1 1aj_50';}

This code usually ran much faster for me than looping over the matrix to write to the file:

% Step 1: transpose so that MATLAB's column-major order walks the data row by row
s = s1.' ;

% Step 2: build one long character vector (no space after the comma, to match the CSV format in the question)
so = sprintf('%s,%s\n', s{:}) ;

% Step 3: write to file in one go (no loop needed)
fid = fopen('data.csv', 'w') ;
fprintf(fid, '%c', so) ;
fclose(fid) ;

The only step that might differ slightly for you is step 2. I don't know whether this syntax works as well on a matrix of strings as it does on a cell array of character vectors, but I'm sure there is a way to get the same result: a single long vector of characters. Once you have that, fprintf will be uber fast at writing it to file.

Note: if the amount of data is too large for your available memory, you might not be able to generate the long char vector. In my experience, it was still faster to use this method in chunks (each of which fits in memory) than to loop over each line of the matrix.
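A minimal sketch of that chunked variant, assuming s1 is a cell array of character vectors as in the sample data above. The chunkSize value is a hypothetical placeholder to be tuned to the available memory, not something taken from the question:

```matlab
% Sample data standing in for the real matrix (cell array of char vectors)
s1 = {'2F5E8693E','al1 1aj_25';
      '3F5E8693E','al1 1aj_50';
      '3F5E8693E','al1 1aj_50'};

chunkSize = 2;                  % hypothetical value; tune to available memory
nRows = size(s1, 1);

fid = fopen('data.csv', 'w');
for first = 1:chunkSize:nRows
    last  = min(first + chunkSize - 1, nRows);
    % Transpose the block so each row's two fields are adjacent in memory
    chunk = s1(first:last, :).';
    % Build the text for this block only, then write it in a single call
    fprintf(fid, '%s', sprintf('%s,%s\n', chunk{:}));
end
fclose(fid);
```

Each pass only holds one block's worth of text in memory, so the trade-off is between fewer fprintf calls (larger chunks) and a smaller temporary char vector (smaller chunks).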

Upvotes: 0

rinkert

Reputation: 6863

Don't merge the columns into a single string array like you do now. Instead, pass them as separate arguments and loop over the rows of s1:

fileID = fopen('data.csv', 'w');
for k = 1:size(s1,1)
    % Pass the two fields of row k as separate arguments
    fprintf(fileID, '%s,%s\n', s1(k,1), s1(k,2));
end
fclose(fileID);

Or, if you're using R2019a or later, you can use writematrix:

writematrix(s1, 'data.csv');
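As for why the original call paired up values from the same column: the format specification is reused over the elements of the array in column-major order, so the whole first column is consumed before the second. A small illustration, using a cell array of character vectors as a stand-in for the string array (the labels A1, B1, and so on are made up for the example, and string arrays are assumed to be traversed in the same order):

```matlab
s1 = {'A1','B1';
      'A2','B2';
      'A3','B3'};

% Elements in storage order: A1, A2, A3, B1, B2, B3
wrong = sprintf('%s,%s\n', s1{:});
% wrong is:
%   A1,A2
%   A3,B1
%   B2,B3

% After transposing, the storage order is A1, B1, A2, B2, A3, B3
t = s1.';
right = sprintf('%s,%s\n', t{:});
% right is:
%   A1,B1
%   A2,B2
%   A3,B3
```

With real data where the first column holds similar-looking values, the first few lines of the "wrong" output can easily look like the first column written twice.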

Upvotes: 2
