Abhishek Bhatia

Reputation: 9806

Extract values from filenames

I have file names stored as follows:

>> allFiles.name

ans =

k-120_knt-500_threshold-0.3_percent-34.57.csv


ans =

k-216_knt-22625_threshold-0.3_percent-33.33.csv

I want to extract the four values from each name and store them in a cell array.

data={};
for k =1:numel(allFiles)
    data{k,1}=csvread(allFiles(k).name,1,0);
    data{k,2}= %kvalue
    data{k,3}= %kntvalue
    data{k,4}=%threshold
    data{k,5}=%percent
    ...
 end

Upvotes: 0

Views: 66

Answers (3)

Otto Nahmee

Reputation: 133

You can tokenize the name by calling strtok several times (there is more than one way to solve this). There are also MATLAB scripts on the web that tokenize a string into a cell array.

(1) Starting with:

filename = 'k-216_knt-22625_threshold-0.3_percent-33.33.csv'

Use strfind to prune out the extension

r = strfind(filename, '.csv')
filenameWithoutExtension = filename(1:r-1)

This leaves us with:

'k-216_knt-22625_threshold-0.3_percent-33.33'

(2) Then tokenize this:

'k-216_knt-22625_threshold-0.3_percent-33.33'

using '_' as the delimiter. You get the tokens:

'k-216'
'knt-22625'
'threshold-0.3'
'percent-33.33'

(3) Lastly, tokenize each of these strings using '-'. The second token of each will be:

'216'
'22625'
'0.3'
'33.33'

Finally, use str2num (or str2double) to convert each string to a number.
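
The three steps above can be put together roughly as follows. This is only a sketch for a single hard-coded name; the variable names (remaining, token, numberPart, values) are mine, and I have used str2double for the conversion.

```matlab
filename = 'k-216_knt-22625_threshold-0.3_percent-33.33.csv';

% (1) Prune the extension.
r = strfind(filename, '.csv');
remaining = filename(1:r-1);

% (2) and (3) Peel off one '_'-delimited token at a time,
% then split that token on '-' and keep the part after the dash.
values = [];
while ~isempty(remaining)
    [token, remaining] = strtok(remaining, '_');    % e.g. 'k-216'
    [~, numberPart] = strtok(token, '-');           % e.g. '-216'
    values(end+1) = str2double(numberPart(2:end));  % drop the leading '-'
end
```

After the loop, values holds the four numbers in the order k, knt, threshold, percent. Note that strtok skips leading delimiters, which is why the leftover '_' at the start of remaining does no harm on the next pass.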

Upvotes: 1

TimeString

Reputation: 1798

Strategy: strsplit() + str2num().

data={};
for k =1:numel(allFiles)
    data{k,1}=csvread(allFiles(k).name,1,0);
    words = strsplit( allFiles(k).name(1:(end-4)), '_' );
    data{k,2} = str2num(words{1}(3:end));   % skip 'k-'
    data{k,3} = str2num(words{2}(5:end));   % skip 'knt-'
    data{k,4} = str2num(words{3}(11:end));  % skip 'threshold-'
    data{k,5} = str2num(words{4}(9:end));   % skip 'percent-'
end

Upvotes: -1

Phil Goddard

Reputation: 10762

A regular expression could probably do this, but a simple piece of code would be:

data = cell(numel(allFiles),5); % preallocate an N-by-5 cell array
for k =1:numel(allFiles)
    data{k,1}=csvread(allFiles(k).name,1,0);
    [~,name] = fileparts(allFiles(k).name);
    dashIdx = strfind(name,'-'); % find location of dashes
    usIdx = strfind(name,'_'); % find location of underscores
    data{k,2}= str2double(name(dashIdx(1)+1:usIdx(1)-1)); %kvalue
    data{k,3}= str2double(name(dashIdx(2)+1:usIdx(2)-1)); %kntvalue
    data{k,4}= str2double(name(dashIdx(3)+1:usIdx(3)-1)); %threshold
    data{k,5}= str2double(name(dashIdx(4)+1:end)); %percent
    ...
end
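
For comparison, here is a sketch of the regular-expression route mentioned above. The pattern is my own guess based on the two example names in the question, so it assumes every file follows the same k/knt/threshold/percent layout with non-negative numbers.

```matlab
name = 'k-216_knt-22625_threshold-0.3_percent-33.33.csv';

% Capture the number after each 'label-' prefix.
% [0-9.]+ matches a run of digits and dots.
pattern = 'k-([0-9.]+)_knt-([0-9.]+)_threshold-([0-9.]+)_percent-([0-9.]+)\.csv$';
tokens = regexp(name, pattern, 'tokens', 'once');  % 1x4 cell of char vectors

% str2double accepts a cell array and returns a numeric array of the same size.
values = str2double(tokens);  % order: k, knt, threshold, percent
```

If a name does not match the pattern, tokens comes back empty, so it is worth checking isempty(tokens) before using the result.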

For efficiency, you might consider using a single matrix to store all the numeric data, and/or a structure (so that you can access the data by name rather than index).
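
A minimal sketch of both storage ideas, using the two example names from the question in place of allFiles (the names variable, params and info are hypothetical; the CSV contents are left out because the files themselves are not available here):

```matlab
names = {'k-120_knt-500_threshold-0.3_percent-34.57.csv', ...
         'k-216_knt-22625_threshold-0.3_percent-33.33.csv'};

n = numel(names);
params = zeros(n, 4);  % one row per file; columns: k, knt, threshold, percent

for i = 1:n
    [~, base] = fileparts(names{i});         % drop the '.csv' extension
    parts = strsplit(base, {'_', '-'});      % {'k','120','knt','500',...}
    params(i, :) = str2double(parts(2:2:end));  % every second part is a number
end

% Same data as an n-by-1 struct array, so values can be accessed by name.
info = cell2struct(num2cell(params), {'k', 'knt', 'threshold', 'percent'}, 2);
```

With this, params(:,3) gives all thresholds at once, and info(2).knt reads the knt value of the second file without having to remember a column index.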

Upvotes: 1
