Reputation: 347
I have a folder in which there are many files and I want to create a matrix that holds filenames with a specific pattern. For example: The folder contains files with names starting with a subject number (e.g. 03T1A.xxx.nii, 03T1A.yyy.nii) as well as filenames with specific patterns in the middle (e.g. 03T1A.c100.nii, 03T1A.c200.nii, 03T1A.c300.nii). In this specific case I am looking to extract all the filenames with the pattern c1 and c2 in the middle (e.g. 03T1A.c100.nii and 03T1A.c200.nii but not 03T1A.c300.nii).
To this point I have used the following code to create a pattern matching variable in 'pattern' which I would like to apply to the cell array of filenames I have extracted into the variable 'all_files' via the dir call.
func_path = char(strcat(input_dir, '/', subs(files), '/Func'));
pattern = 'c[12]*.nii'
all_files = dir(func_path);
all_files = {all_files.name};
I'd like to use (read: practice) regexp. Doing it with string input seems easy, but I am 100% stumped as to how to do it with cell input. I started trying to do something like this:
files = all_files(cellfun(@(x)regexp(x, pattern));
But it doesn't work, obviously. Could someone help me figure out what to do here if my ultimate goal is to get a matrix output with just the relevant filenames? I've been searching MATLAB answers and other Stack Overflow posts but part of my problem is I don't understand what's happening in their code snippets. I took the above line (or at the least the beginning of it) from another post but I don't know what, for example, 'x' is (an output variable?) or what's going on in the larger command such as
fin = cellfun(@(x)regexp(x, '\.', 'split'), res, 'UniformOutput', false)
Which I found in another thread. So basically, can someone help me figure out a command that will work while explaining it to me?
Upvotes: 1
Views: 985
Reputation: 65430
A couple of recommendations for doing this sort of thing:
Do not use strcat and hard-coded '/' characters to construct file paths. strcat strips trailing whitespace from character array inputs before concatenating (and folder or file names may legitimately end in whitespace). Rather than hard-coding a path separator such as '/', use filesep, or better yet fullfile, to construct the path so that it works on various platforms without problems.
func_path = fullfile(input_dir, subs(files), 'Func');
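As a rough illustration of the difference, here is a minimal sketch. The folder name 'my folder ' (with a trailing space) is made up for the example and does not come from the question.

```matlab
% Hypothetical folder name with a trailing space, for illustration only
folder = 'my folder ';

% strcat drops trailing whitespace from character array inputs,
% so the space at the end of the folder name is lost
joined = strcat(folder, '/Func');

% fullfile keeps each part as given and inserts the platform's separator
built = fullfile(folder, 'Func');

disp(joined)
disp(built)
```

If subs in the question is a cell array, indexing it with braces (subs{files}) would hand fullfile a plain character vector; that is an assumption about the asker's data, not something the question states.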
regexp works directly on cell arrays, so you can simply do:
all_files = dir(func_path);
% Search for the pattern in all filenames
matches = regexp({all_files.name}, pattern);
% Get the filenames of those that matched
all_files = {all_files(~cellfun('isempty', matches)).name};
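To make the intermediate values visible, here is a minimal sketch that replaces the dir call with a hard-coded list of names from the question and uses the pattern 'c[12].*\.nii' recommended in this answer. The variable names names, keep, selected and keep2 are illustrative. The last line also shows what the @(x) in the question's snippet means: it defines an anonymous function whose input x is one element of the cell array at a time.

```matlab
% Sample names from the question, standing in for {all_files.name}
names = {'03T1A.xxx.nii', '03T1A.c100.nii', '03T1A.c200.nii', '03T1A.c300.nii'};
pattern = 'c[12].*\.nii';

% regexp on a cell array returns a cell array of the same size:
% each cell holds the start index of the match, or [] if there is none
matches = regexp(names, pattern);

% Logical mask that is true wherever a match was found
keep = ~cellfun('isempty', matches);
selected = names(keep);

% Equivalent form using cellfun with an anonymous function;
% x takes the value of each filename in turn
keep2 = cellfun(@(x) ~isempty(regexp(x, pattern, 'once')), names);

disp(selected)
```

Here selected contains only the c100 and c200 names, and keep2 is the same mask as keep.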
Your pattern isn't matching any files because 'c[12]*.nii' only matches a "c" followed by zero or more 1's or 2's, then exactly one arbitrary character, then "nii". Instead, you'll want to use .* to match anything between the "c1" or "c2" and the file extension. You'll also want to drop the * after [12]: because [12]* accepts zero 1's or 2's, it does nothing to exclude a "c" followed by a 3. Finally, escape the . in .nii so that it's treated as a literal dot rather than a wildcard. For your pattern I would use something like
pattern = 'c[12].*\.nii';
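A quick way to check both patterns against the example names from the question is sketched below. The names old_pattern, new_pattern, old_hits and new_hits are illustrative.

```matlab
names = {'03T1A.c100.nii', '03T1A.c200.nii', '03T1A.c300.nii'};

old_pattern = 'c[12]*.nii';      % pattern from the question
new_pattern = 'c[12].*\.nii';    % pattern suggested above

% With 'once', each cell holds the start index of the first match, or []
old_hits = ~cellfun('isempty', regexp(names, old_pattern, 'once'));
new_hits = ~cellfun('isempty', regexp(names, new_pattern, 'once'));

disp(old_hits)   % all false: the old pattern matches none of the names
disp(new_hits)   % true for c100 and c200, false for c300
```

Note that the suggested pattern is not anchored, so it accepts "c1" or "c2" anywhere in the name as long as ".nii" follows somewhere later.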
If you really don't want to work with regular expressions, you could avoid all of this by simply using wildcards in your dir call:
c1_files = dir(fullfile(func_path, '*c1*.nii'));
c2_files = dir(fullfile(func_path, '*c2*.nii'));
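Since the goal was a single list of filenames, the two results can be merged into one cell array. The sketch below is self-contained, so it creates a temporary folder with empty files named after the examples in the question; in real use func_path would be the existing Func folder, and the variable files is an illustrative name.

```matlab
% Throwaway folder standing in for the real Func directory (assumption)
func_path = tempname;
mkdir(func_path);
names = {'03T1A.xxx.nii', '03T1A.c100.nii', '03T1A.c200.nii', '03T1A.c300.nii'};
for k = 1:numel(names)
    % Create an empty file with each example name
    fclose(fopen(fullfile(func_path, names{k}), 'w'));
end

c1_files = dir(fullfile(func_path, '*c1*.nii'));
c2_files = dir(fullfile(func_path, '*c2*.nii'));

% Concatenate both name lists; unique sorts them and drops any file
% that happened to match both wildcards
files = unique([{c1_files.name}, {c2_files.name}]);

disp(files)

% Remove the temporary folder and its contents
rmdir(func_path, 's');
```

The wildcards match "c1" or "c2" anywhere in the filename, so this is slightly looser than a pattern tied to a specific position.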
Upvotes: 2