Eric

Reputation: 429

Matlab extract from a cell of strings by regexp

I have a cell of a couple of thousand strings that contain one or more underscores as shown below:

cel={'ABC_D234_567','ABD_X157_224','PT_D204_157','PT_O268_578','DTA_P2345','CBDRT_X345_D325'};

I need to extract all the letters before the first underscore and one letter after; for example, 'ABC_D', 'PT_O', or 'CBDRT_X'.

I figured out a way to do this with strfind, but it takes several lines of code: find the indices of all underscores, keep only the index of the first underscore in each string, then extract the characters from 1 to (index+1).

I'm pretty sure this can be done in a single line, something like:

cel_new = regexp(cel,'something something','once','match');

What would this 'something something' be?

Upvotes: 1

Views: 123

Answers (1)

FangQ

Reputation: 1544

Use cellfun to apply the operation to each element of the cell, like this:

cel={'ABC_D234_567','ABD_X157_224','PT_D204_157','PT_O268_578','DTA_P2345','CBDRT_X345_D325'};

cel_new=cellfun(@(x) regexprep(x,'^([A-Z]+_[A-Z]).*','$1','once'), cel,'UniformOutput',false)

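regexprep matches the whole string and replaces it with the captured group $1 (the uppercase letters before the first underscore, the underscore itself, and one uppercase letter after it), and cellfun applies this to each string in the cell.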
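For the one-line form asked about in the question, here is a minimal sketch. It assumes the goal is "everything up to the first underscore, plus one letter after it", and it relies on regexp accepting a cell array of strings directly, so no cellfun is needed. The pattern is my own suggestion, not part of the answer above; widen or narrow the character classes if your strings differ.

```matlab
cel = {'ABC_D234_567','ABD_X157_224','PT_D204_157','PT_O268_578','DTA_P2345','CBDRT_X345_D325'};

% ^[^_]*    everything from the start of the string up to the first underscore
% _         the first underscore itself
% [A-Za-z]  exactly one letter after it (assumption: either case is acceptable)
cel_new = regexp(cel, '^[^_]*_[A-Za-z]', 'match', 'once');
```

For the sample data this gives {'ABC_D','ABD_X','PT_D','PT_O','DTA_P','CBDRT_X'}. A string with no underscore, or with a non-letter right after the first underscore, yields an empty result for that element, so check for empties if the input may not follow the pattern.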

Upvotes: 1
