readyblue

Reputation: 81

How to regexp match surrounding whitespace or beginning/end of line

I am trying to find lines in a file that contain a / (slash) character which is not part of a word, like this:

grep "\</\>" file

But no luck: even if the file contains a "/" on its own, grep does not find it.

I want to be able to match lines such as

some text / pictures
/ text
text /

but not e.g.

/home

Upvotes: 2

Views: 1472

Answers (3)

Sigi

Reputation: 1864

Why your approach does not work

\< and \> match only at the beginning and end of a word, respectively. They can therefore never match when placed directly next to /, which is not treated as a word character: \</, for example, says "match the beginning of a word whose first character is a slash", which is impossible.
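
A quick way to check this is to count the matching lines. This is only a sketch: it assumes GNU grep (which supports \< and \>), and the sample lines are invented for illustration.

```shell
# Hypothetical sample lines; every one of them contains a slash.
samples='some text / pictures
/ text
text /
/home'

# \</ requires a word to start exactly where "/" sits, but "/" is not a
# word character, so no line can match. grep -c prints the number of
# matching lines; "|| true" hides grep's non-zero exit status on no match.
count=$(printf '%s\n' "$samples" | grep -c '\</\>' || true)
echo "matching lines: $count"
```

With the sample input above, the count should come out as zero even though all four lines contain a slash.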

What will work

This will match / surrounded by whitespace (\s) or beginning/end of line:

egrep '(^|\s)/($|\s)' file

(egrep is equivalent to grep -E, which turns on processing of extended regular expressions; recent GNU grep releases flag egrep as obsolescent, so grep -E is the safer spelling.)
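
Note that \s is not part of POSIX extended regular expressions; GNU grep supports it as an extension. A more portable spelling, sketched here with the POSIX bracket class [[:space:]] and invented sample lines, might look like this:

```shell
# Hypothetical sample input: the first three lines should match,
# "/home" should not.
samples='some text / pictures
/ text
text /
/home'

# [[:space:]] is the POSIX character class for whitespace, so this
# pattern does not depend on the \s extension.
matches=$(printf '%s\n' "$samples" | grep -E '(^|[[:space:]])/($|[[:space:]])')
printf '%s\n' "$matches"
```

This should behave the same as the \s version on GNU grep while avoiding the dependency on the extension.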

What might also work

The following slightly simpler expression will work only if a / is never adjacent to non-word characters other than whitespace (such as *, #, -, and characters outside the ASCII range), because \B matches at any position that is not a word boundary; it might be of limited usefulness in OP's case:

grep '\B/\B' file
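
To illustrate the caveat, here is a small sketch, assuming GNU grep (which supports \B) and made-up sample lines, two of which contain a slash next to punctuation:

```shell
# Hypothetical sample lines; the last two are probably unwanted matches.
samples='some text / pictures
/home
a */ b
a // b'

# \B matches wherever there is NO word boundary, so "/" matches whenever
# both neighbours are non-word characters, whether whitespace or not.
matches=$(printf '%s\n' "$samples" | grep '\B/\B')
printf '%s\n' "$matches"
```

Here /home is correctly rejected, but the lines containing */ and // are reported too, which is why the whitespace-based pattern is the safer choice.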

Upvotes: 4

mklement0

Reputation: 440162

# Only ' /home ' should produce no output; the other three strings match.
for str in 'some text / pictures' ' /home ' '/ text' ' text /'; do
  echo "$str" | egrep '(^|\s)/($|\s)'
done

This will match /:

  • if the entire input string is /
  • if the input string starts with / and is followed by at least one whitespace character
  • if the input string ends with / and is preceded by at least one whitespace character
  • if / is inside the input string, with at least one whitespace character on each side.

As for why grep "\</\>" file did not work:

\< and \> match the left and right boundaries of a word, respectively, i.e. the transitions between non-word and word characters. However, / does not qualify as a word, because words are defined as a sequence of one or more characters from the set [[:alnum:]_], i.e. sequences of at least length 1 composed entirely of letters, digits, or _.
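
To see which parts of a line count as words under that definition, one can ask grep to print only the matches. This is a sketch that assumes GNU grep (for -o and for \< and \> in extended mode); the sample string is invented.

```shell
# Hypothetical sample string mixing word and non-word characters.
line='foo_1 / bar-baz'

# Print each maximal run of word characters ([[:alnum:]_]) on its own line.
words=$(printf '%s\n' "$line" | grep -oE '\<[[:alnum:]_]+\>')
printf '%s\n' "$words"
```

The slash and the hyphen never appear in the output: they are not word characters, so no word boundary can enclose them.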

Upvotes: 2

jordoncm

Reputation: 11

This seems to work for me.

grep -rni " / \| /\|/ " .

Upvotes: -3
