user3064203

Reputation: 101

Using regexp in matlab

I am trying to use the following code to count the occurrences of the whole word "the" in a file, but it keeps returning zero. How can I make this work?

totalthe=length(regexp(strcat(lines{:}),'\bthe\b'))

Upvotes: 1

Views: 1055

Answers (3)

Dennis Jaheruddin

Reputation: 21561

Here we go, based on the other answers, comments and some trial and error:

Suppose these are your lines:

lines = {'In the cell on the island'; 'there is the man.';'The end'}

Then this will count the occurrences of 'the', ignoring case:

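x = regexpi(lines,'\<the\>');   % one cell per line, holding the start index of each match
numel([x{:}])                   % total number of matches across all lines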
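
To see where the total comes from, here is a small sketch using the same example lines. The names perLine and total are only for illustration. Because regexpi returns one vector of match positions per line, counting the elements of each cell gives a per-line tally:

```matlab
lines = {'In the cell on the island'; 'there is the man.'; 'The end'};

x = regexpi(lines, '\<the\>');   % cell array of match start indices, one cell per line
perLine = cellfun(@numel, x);    % matches found in each line
total = sum(perLine);            % same value as numel([x{:}])
```

With these lines the per-line counts should be 2, 1 and 1: "there" is skipped because \> requires the word to end right after "the".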

Upvotes: 0

user2987828

Reputation: 1137

Summarizing all comments:

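totalthe=length(regexpi(strjoin(lines(:)',' '),'\<the\>'))

Join the lines with a space between them instead of using strcat, so that a leading "The" does not get stuck to the word at the end of the previous line. strvcat also keeps the lines apart, but it stacks them into a padded character matrix rather than a single row of text, which regexpi cannot search as one string.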
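
A small sketch of why the lines need a separator when they are combined, assuming a made-up two-line input. Here the lines are joined with strjoin and a single space, and the variable names are only illustrative:

```matlab
% Hypothetical two-line input, for illustration only
lines = {'I saw the'; 'The end'};

glued  = strcat(lines{:});          % 'I saw theThe end'
joined = strjoin(lines(:)', ' ');   % 'I saw the The end'

nGlued  = numel(regexpi(glued,  '\<the\>'));   % 0: "theThe" is a single word
nJoined = numel(regexpi(joined, '\<the\>'));   % 2
```

Without the separator, the "the" that ends one line and the "The" that starts the next merge into one word, so neither is counted.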

Upvotes: 0

MrAzzaman

Reputation: 4768

Sorry, it turns out I may have led you astray in a previous answer. In MATLAB the word boundaries are \< and \> (for the start and end of a word, respectively) rather than \b. I learnt something new today too.

Note that this is preferable to using \s (whitespace), as otherwise you might miss matches at the start and end of the line.
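
A minimal sketch of the difference, assuming a made-up test string str (the variable names are illustrative too). In MATLAB regular expressions \b stands for the backspace character, which would explain why the pattern in the question finds nothing:

```matlab
% Hypothetical test string, for illustration only
str = 'The cat saw the other cat there.';

nB     = numel(regexp(str,  '\bthe\b'));   % 0: \b is a backspace character here
nWord  = numel(regexp(str,  '\<the\>'));   % 1: whole word, case sensitive
nWordI = numel(regexpi(str, '\<the\>'));   % 2: whole word, ignoring case
nSpace = numel(regexpi(str, '\sthe\s'));   % 1: misses "The" at the start of the string
```

The \s version only finds the "the" that has whitespace on both sides, whereas \< and \> also match at the start or end of the text and next to punctuation.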

Upvotes: 1
