Shika93
Shika93

Reputation: 659

Convert txt file to excel file with headers in Matlab

I'm struggling on splitting a txt file made of 1 column of N blocks made of 1000 rows. The first 3 rows of each block is an header, followed by 1000 rows of data. Row 1001 will be the next block starting with the three headers in this way:

#Header1
#Header2_1,Header2_2
#Header3
Data1_1-Data1_2
...
Data1000_1-Data1001_2
Header1
Header2
Header3
Data1002_1-Data1002_2
...
Data2003_1-Data2003_2
:::

Each row of data is made of two sets, so I'd like to split into two columns the data, keeping each header in this way

         Block1           |           Block2             |       Block3 ... 
Header1                   | Header1
Header2 1_1 | Header2 1_2 | Header2 2_1 | Header2 2_2
Header3                   | Header3
Data1_1 | Data1_2         | Data1002_1 | Data1002_2
...                       | ...
Data1001_1 | Data1001_2   | Data2003_1 | Data2003_2

I tried with reshape but without success:

filename = 'text.txt';
data = splitlines(filename);
NRows = 1001;
NumBlocks = length(data)/NRows
Data_Reshaped = reshape(data,NRows,NumBlocks);

but it works partially, because I obtain a 1001 x 12 cell array and each cell has

Data1_1-Data1_2

with the - as delimiter. How can I split again this cell into two columns in ordered way and get a 1001x24 cell arrays? Than the result should be exported in excel but I guess writetable function should do the job.

Upvotes: 1

Views: 73

Answers (1)

Wolfie
Wolfie

Reputation: 30146

You need to do three things to make your data flat/uniform enough to put into Excel

  1. Split the single-header rows into two headers, the 2nd of which is just an empty string but you need two columns so you can stack with others properly

  2. Split the two-header row into two headers, using , as the delimiter

  3. Split the data rows into two, using - as the delimiter

I've created an example (at the bottom of this answer) with 3 data rows for testing:

First, read in the data

% Use fileread to get the entire file contents, split on newlines
dat = fileread( 'text.txt' );
dat = splitlines( dat );

% Define the schema for this data: 3 header rows then 3 data rows
nHeaders = 3;
nDataRows = 3;
nPerBlock = nHeaders + nDataRows;

% Do an initial reshape to get each "block" into its own column of the cell
dat = reshape( dat, nPerBlock, [] );

Now we can do the numbered operations above to get a cell array with two-columns per cell:

% Split single-item headers from 'Header1' into {'Header1',''}
dat([1,3],:) = cellfun( @(x) {x,''}, dat([1,3],:), 'uni', 0 );

% Split two-item headers from 'Header2 2_1,Header2 2_2' into {'Header2 2_1','Header2 2_2'}
dat(2,:) = cellfun( @(x) strsplit( x, ',' ), dat(2,:), 'uni', 0 );

% Split the data from 'Data1_1-Data1_2' into {'Data1_1','Data1_2'}
dat(nHeaders+1:end,:) = cellfun( @(x) strsplit( x, '-' ), dat(nHeaders+1:end,:), 'uni', 0 );

Now you need to do a final bit of restructuring to expand and flatten the cell array:

dat = arrayfun( @(r) [dat{r,:}], 1:size(dat,1), 'uni', 0 );
dat = vertcat( dat{:} );

Final output:

dat output

Since your data is a cell array rather than a table at this point (despite looking a bit like a table), you probably want to use writecell() to write into Excel or a csv.


Example data text.txt:

#Header1
#Header2 1_1,Header2 1_2
#Header3
Data1_1-Data1_2
Data2_1-Data2_2
Data3_1-Data3_2
#Header1
#Header2 2_1,Header2 2_2
#Header3
Data4_1-Data4_2
Data5_1-Data5_2
Data6_1-Data6_2
#Header1
#Header2 3_1,Header2 3_2
#Header3
Data7_1-Data7_2
Data8_1-Data8_2
Data9_1-Data9_2

Upvotes: 1

Related Questions