Elroy Jetson

Reputation: 968

Grep with regex

I'm trying to use a regex with grep on the command line to print lines that start with either a whitespace character or lowercase int followed by a space. Those lines must also end with either a semicolon or an o.

I tried

grep ^[\s\|int]\s+[\;\o]$ fileName

but I don't get what I'm looking for. I also tried

grep ^\s*int\s+([a-z][a-zA-Z]*,\s*)*[a-z]A-Z]*\s*;

but nothing.

Upvotes: 0

Views: 193

Answers (1)

John1024

Reputation: 113824

Let's consider this test file:

$ cat file
 keep marco
polo
int keep;
int x

If I understand your rules correctly, two of the lines in the above should be kept and the other two discarded.

Let's try grep:

$ grep -E '^(\s|int\s).*[;o]$' file
 keep marco
int keep;

The above uses \s to match a whitespace character. \s is a GNU grep extension. For other greps, we can use a POSIX character class such as [[:blank:]] (space or tab) instead. After reorganizing the code slightly to reduce typing:

grep -E '^(int)?[[:blank:]].*[;o]$' file

How it works

In a Unix shell, the single quotes in the command are critical: they stop the shell from interpreting or expanding any character inside the single quotes.
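To see what the quotes protect against, here is a small sketch that passes the same text to printf with and without single quotes. The variable names are illustrative and not from the question. Without quotes, the shell strips the backslashes before the command ever receives them:

```shell
# Unquoted: the shell removes each backslash, so \s arrives as a plain s.
unquoted=$(printf '%s' \sint\s)

# Single-quoted: every character reaches the command untouched.
quoted=$(printf '%s' '\sint\s')

printf '%s\n' "unquoted: $unquoted"
printf '%s\n' "quoted:   $quoted"
```

grep would receive the same mangled text if its pattern were left unquoted, which is one reason an unquoted regex can silently match nothing.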

-E tells grep to use extended regular expressions. This reduces the need for backslashes.
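As a rough illustration of the difference, the sketch below runs what should be an equivalent pattern in both syntaxes against the test lines from above. It spells the optional int as (int)? rather than as an alternation; that spelling and the variable names are assumptions made for this example.

```shell
# The four test lines from the answer, held in a variable instead of a file.
input=$(printf '%s\n' ' keep marco' 'polo' 'int keep;' 'int x')

# Extended syntax (-E): grouping and ? need no backslashes.
ere=$(printf '%s\n' "$input" | grep -E '^(int)?[[:blank:]].*[;o]$')

# Basic syntax (no -E): the group and the interval must be backslash-escaped.
bre=$(printf '%s\n' "$input" | grep '^\(int\)\{0,1\}[[:blank:]].*[;o]$')

printf '%s\n' "$ere"
printf '%s\n' "$bre"
```

Both commands should print the same two lines; the extended form is simply easier to read.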

Let's examine the regular expression, one piece at a time:

  1. ^ matches at the beginning of a line.

  2. (\s|int\s) matches either a whitespace character or int followed by a whitespace character.

  3. .* matches zero or more of any character.

  4. [;o] matches any one character listed in the square brackets, so it matches either ; or o.

  5. $ matches at the end of a line.
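The pieces above can be checked separately. This sketch counts how many of the four test lines pass the start-of-line condition alone, the end-of-line condition alone, and both together. It uses the POSIX class [[:blank:]] with an optional (int)? group in place of \s, and the variable names are hypothetical.

```shell
# The four test lines from the answer.
input=$(printf '%s\n' ' keep marco' 'polo' 'int keep;' 'int x')

# Start condition only: optional int, then a space or tab.
prefix_count=$(printf '%s\n' "$input" | grep -cE '^(int)?[[:blank:]]')

# End condition only: last character is ; or o.
suffix_count=$(printf '%s\n' "$input" | grep -cE '[;o]$')

# Both conditions, joined by .* for whatever lies in between.
both_count=$(printf '%s\n' "$input" | grep -cE '^(int)?[[:blank:]].*[;o]$')

printf 'prefix: %s  suffix: %s  both: %s\n' "$prefix_count" "$suffix_count" "$both_count"
```

Three lines satisfy each condition on its own, but only two satisfy both, which matches the two lines kept in the example output.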

Upvotes: 3
