Fructibus

Reputation: 145

Grep's word boundaries include spaces?

I tried to use grep with "\b" to find lines containing the word "bead", but it doesn't find lines where "bead" is separated from the neighbouring words by spaces. I ran this command:

cat in.txt | grep -i "\bbead\b" > out.txt

I get results like

But I don't get the results like

Instead of some 2,000 lines, I only get 92 lines.

My OS is Windows 10 (64-bit), and I'm using grep 2.5.4 from the GnuWin32 package.

I've also tried MSYS2, which includes grep 3.0, but it does the same thing.

So how can I search for words separated by spaces?

LATER EDIT: It looks like grep has problems with big files. My input file is 2.4 GB in size; with smaller files, it works. I reported the bug here: https://sourceforge.net/p/getgnuwin32/discussion/554300/thread/03a84e6b/

Upvotes: 0

Views: 461

Answers (2)

Ken Schumack

Reputation: 719

What you are doing should normally work, but there are ways of setting what is and is not considered a word boundary. Rather than worry about that, please try this instead:

cat in.txt | grep -iP "\bbead(\b|\s)" > out.txt

The -P option enables Perl-compatible regular expressions, in which \s matches any whitespace character. The bar | separates the alternatives inside the parentheses ( ).

While you are waiting for grep to be fixed, you could use another tool if one is available to you, e.g.:
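As a quick check that "\b" already treats a space as a word boundary, here is a minimal sketch that assumes GNU grep (where "\b" is supported as an extension). The four input lines are made up for illustration and are not the asker's data:

```shell
# Hypothetical sample input: only the first two lines contain "bead" as a whole word.
# -i ignores case, -c prints the number of matching lines instead of the lines.
matches=$(printf '%s\n' 'one bead two' 'bead' 'beads' 'Beaded' | grep -ic '\bbead\b')
echo "$matches"   # prints 2
```

If this prints 2 on a small input, the pattern itself is fine and the missing lines are more likely caused by something else, such as the file-size problem mentioned in the question.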

perl -lane 'print if (m/\bbead\b/i);' in.txt > out.txt

Upvotes: 1

Sandeep Sukhija

Reputation: 1176

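Try this:

cat in.txt | grep -wi "bead"

The -w option restricts the search to whole words: a match must be preceded and followed by a non-word character or by the start or end of the line.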
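A small sketch of what -w does and does not match, using made-up input lines (not the asker's file) piped straight into grep:

```shell
# Hypothetical sample input. Spaces, hyphens and commas are non-word characters,
# so "bead" counts as a whole word on lines 1, 2 and 4 but not on line 3.
count=$(printf '%s\n' 'a bead necklace' 'Bead' 'beads and beading' 'glass-bead, blue' | grep -wic 'bead')
echo "$count"   # prints 3
```

The -c option is used here only to make the result easy to check; drop it to see the matching lines themselves.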

Upvotes: 1
