alzinos

Reputation: 71

Pull specific cells out of a cell array by comparing the last digit of their filename

I have a cell array of filenames, things like '20160303_144045_4.dat' and '20160303_144045_5.dat', which I need to split into separate cell arrays according to the last digit before the '.dat': one cell array of '...4.dat's, one of '...5.dat's, etc.

My code is below. It uses regexp to split each filename around the '.dat', reshapes the result, splits again on '_' to pull out the last number of the filename, and then builds a cell array to store the filenames in. That is where I get a tad stuck: for each channel I can produce a mask such as '1,0,1,0,1,0..' marking the cells I need, which I thought would be trivial to use for pulling them out, but I'm struggling to get it to do what I want.

numFiles = length(sampleFile); %sampleFile is the input cell array

splitFiles = regexp(sampleFile,'.dat','split');
column = vertcat(splitFiles{:});
column = column(:,1);

splitNums = regexp(column,'_','split');
splitNums = splitNums(:,1);
column = vertcat(splitNums{:});
column = column(:,3);

column = cellfun(@str2double,column); %produces column array of values - 3,4,3,4,3,4, etc

uniqueVals = unique(column);
numChannels = length(uniqueVals);


fileNameCell = cell(ceil(numFiles/numChannels),numChannels);

for i = 1:numChannels

   column(column ~= uniqueVals(i)) = 0;
   column = column / uniqueVals(i); %e.g. 1,0,1,0,1,0

   %fileNameCell(i) 
end

I feel there should be an easier way than my hodge-podge of code, and I don't want to throw together a ton of messy for-loops if I can avoid it; I definitely believe I've overcomplicated this problem massively.

Upvotes: 1

Views: 27

Answers (1)

Wolfie

Reputation: 30046

We can neaten your code quite a bit.

Take some example data:

files = {'abc4.dat';'abc5.dat';'def4.dat';'ghi4.dat';'abc6.dat';'def5.dat';'nonum.dat'};

You can get the final numbers using regexp and matching one or more digits followed by '.dat', then using strrep to remove the '.dat'.

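% '\.' matches a literal dot (a bare '.' matches any character);
% '$' anchors the match to the end of the filename
filenums = cellfun(@(r) strrep(regexp(r, '\d+\.dat$', 'match', 'once'), '.dat', ''), ...
                   files, 'uniformoutput', false);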
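
As a variation on the step above, here is a minimal sketch that skips both cellfun and strrep by using a lookahead, so the '.dat' is required but never becomes part of the match. The variable name lastnums is just for illustration, and the pattern assumes every name ends in '.dat'.

```matlab
% Same example data as above
files = {'abc4.dat';'abc5.dat';'def4.dat';'ghi4.dat';'abc6.dat';'def5.dat';'nonum.dat'};

% \d+ matches the trailing digits; (?=\.dat$) requires a literal '.dat'
% at the end of the name without including it in the match.
% regexp accepts the cell array directly and returns one result per cell.
lastnums = regexp(files, '\d+(?=\.dat$)', 'match', 'once');
```

Names with no digits before '.dat' come back as an empty char, the same as in the cellfun version.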

Now we can put these in a structure, using the unique numbers (prefixed by a letter because fields can't start with numbers) as field names.

% Get unique file numbers and set up the output struct
ufilenums = unique(filenums);
filestruct = struct;
% Loop over file numbers
for ii = 1:numel(ufilenums)
    % Get files which have this number
    idx = strcmp(filenums, ufilenums{ii});
    % Assign the identified files to their struct field
    filestruct.(['x' ufilenums{ii}]) = files(idx);
end
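
If you would rather end up with plain cell arrays than a struct (closer to what the question describes), one option is to use the third output of unique, which gives each file a group index. This is only a sketch: keys, ukeys, grp and groups are illustrative names, and the key extraction assumes names ending in '.dat'.

```matlab
% Example data and the trailing number of each name ('' if there is none)
files = {'abc4.dat';'abc5.dat';'def4.dat';'ghi4.dat';'abc6.dat';'def5.dat';'nonum.dat'};
keys  = regexp(files, '\d+(?=\.dat$)', 'match', 'once');

% grp(k) is the position in ukeys of the key belonging to files{k}
[ukeys, ~, grp] = unique(keys);

% One cell per unique key, each holding the matching filenames
groups = cell(numel(ukeys), 1);
for k = 1:numel(ukeys)
    groups{k} = files(grp == k);   % logical mask picks out this group's files
end
```

Here groups{k} holds the files whose trailing number is ukeys{k}; the comparison grp == k plays the role of the 1,0,1,0 mask from the question, applied directly as a logical index.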

Now you have a neat output

% Files with numbers before .dat given a field in the output struct
filestruct.x4 = {'abc4.dat'; 'def4.dat'; 'ghi4.dat'}
filestruct.x5 = {'abc5.dat'; 'def5.dat'}
filestruct.x6 = {'abc6.dat'}
% Files without numbers before .dat also captured
filestruct.x =  {'nonum.dat'}
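
To work through the groups afterwards without hard-coding the field names, something like the following should do. It is a small sketch in which the struct is rebuilt by hand with two of the fields so that the snippet runs on its own; names and counts are illustrative variables.

```matlab
% Stand-in for the struct built above (double braces store a cell array in a field)
filestruct = struct('x4', {{'abc4.dat';'def4.dat';'ghi4.dat'}}, ...
                    'x5', {{'abc5.dat';'def5.dat'}});

% fieldnames lists the fields; dynamic field access fetches each group
names  = fieldnames(filestruct);
counts = zeros(numel(names), 1);
for k = 1:numel(names)
    group     = filestruct.(names{k});
    counts(k) = numel(group);
    fprintf('%s: %d file(s)\n', names{k}, counts(k));
end
```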

Upvotes: 1
