Daniel

Reputation: 21

formatting arrays with numbers and characters

I need help turning a Decay.txt file into an array. Columns 1-3 and 5 are numbers; the third column is "time elapsed" as an integer, and the 4th column is the unit of time (milliseconds, Months, Days), spelled out in characters. I can't get this mixed array (numbers and characters) to transfer over to MATLAB.

Ideally I'd like to convert the unit of time (4th column) to a value in seconds (i.e. hour becomes 3600 seconds), then multiply it by the number in the third column, and end up with a 4-column array where the 3rd column is simply the time elapsed in seconds.

Does anyone know how to do either of these things?

I've tried

Decay = fopen('Decay.txt','r');
B = fscanf(Decay,'%f',[5 inf]);

which stops and has an error as soon as it hits the 4th column

and

Decay = fopen('Decay.txt','r');
B = fscanf(Decay,'%s',[5 inf]);

but this just creates a 5x10000 array where every single digit, decimal point, and letter sits in its own element of the array

Upvotes: 0

Views: 87

Answers (1)

Joel Filho

Reputation: 1300

Your first example

Decay = fopen('Decay.txt','r');
B = fscanf(Decay,'%f',[5 inf]);

breaks because fscanf can't scan the fourth column (a string) as a number (%f). Your second example doesn't have numbers because you're scanning everything as a string (%s).

The correct specifier for your format should be

'%f %f %f %s %f'

However, if you call fscanf with it, the documentation says:

If formatSpec contains a combination of numeric and character specifiers, then A is numeric, of class double, and fscanf converts each text characters to its numeric equivalent. This occurs even when formatSpec explicitly skips all numeric fields (for example, formatSpec is '%*d %s').

So this input file:

50    1.2   99    s   0
6.42  1.2   3.11  min 1
22    37    0.01  h   2

Has this (undesired) output:

>> fscanf(Decay, "%f %f %f %s %f", [5, inf])

ans =

   50.0000    6.4200  110.0000  104.0000
    1.2000    1.2000    1.0000    2.0000
   99.0000    3.1100   22.0000         0
  115.0000  109.0000   37.0000         0
         0  105.0000    0.0100         0

That happens because a matrix in MATLAB can't hold data of different types, so the characters are stored as their numeric character codes (115 is 's', and 109, 105, 110 are 'm', 'i', 'n'). Your best bet is scanning into a cell array, which can hold any type.

B = textscan(Decay, "%f %f %f %s %f")

Returns a cell array with the appropriate types. You can use this output to convert the time data into the same unit and build your vectors/matrix. Columns 1, 2, 3 and 5 are trivial to do, just by accessing the cell B{n} for each n.
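
As a rough sketch of what that cell array looks like, the snippet below reads the file and pulls out two of the columns. The variable names elapsed, units and firstUnit are only illustrative, and the frewind call is an assumption about your session: it matters only if the same file ID was already read to the end, for example by the fscanf call above.

```matlab
% Sketch: read Decay.txt with textscan and inspect the result.
Decay = fopen('Decay.txt', 'r');
frewind(Decay);   % only matters if Decay was already read from (e.g. by fscanf)
B = textscan(Decay, '%f %f %f %s %f');   % 1-by-5 cell array, one cell per column
fclose(Decay);

elapsed   = B{3};       % N-by-1 double: the "time elapsed" numbers
units     = B{4};       % N-by-1 cell array of character vectors
firstUnit = units{1};   % char vector holding the unit text of the first row
```

Note the curly braces: B{3} returns the contents of the cell (a numeric column), whereas B(3) would return a 1-by-1 cell.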

Column 4 (B{4}) is a cell array of character vectors, one per row, each holding the unit string from your file. You need to map each string to the number you need. For my example, such a function would look like:

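function scale = DecayScale(unit)
    % Return the number of seconds in one of the given time unit.
    switch unit
        case 's'
            scale = 1;
        case 'min'
            scale = 60;
        case 'h'
            scale = 3600;
        otherwise
            % error() raises the exception directly; throw() needs an MException object
            error('DecayScale:unknownUnit', 'Time unit "%s" not recognized.', unit);
    end
end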

Which you could then apply to the 4th column like:

timeScale = cellfun(@DecayScale, B{4})

And get the final time as:

timeColumn = B{3} .* timeScale
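
To get the 4-column numeric array asked for in the question, the pieces can be concatenated side by side. This is a minimal sketch that assumes DecayScale from this answer is saved as DecayScale.m on the MATLAB path, that Decay.txt is in the current folder, and that every row has all five fields; the name result is just illustrative.

```matlab
% Sketch: read the file, convert elapsed time to seconds, build an N-by-4 matrix.
fid = fopen('Decay.txt', 'r');
if fid == -1
    error('Could not open Decay.txt');
end
B = textscan(fid, '%f %f %f %s %f');   % one cell per column
fclose(fid);

timeScale  = cellfun(@DecayScale, B{4});   % seconds per unit, one value per row
timeColumn = B{3} .* timeScale;            % elapsed time in seconds

% Columns: original 1, original 2, elapsed seconds, original 5
result = [B{1}, B{2}, timeColumn, B{5}];
```

For the three-row sample file above, the third column of result would hold 99, about 186.6 (3.11 min) and 36 (0.01 h) seconds. The unit strings in your real file (milliseconds, Months, Days) would each need their own case in DecayScale.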

Upvotes: 2
