Chuck

Reputation: 1293

Why doesn't grep work with a pattern containing a colon?

I know a colon (:) should be literal, so I'm not clear why my grep matches all lines. Here's a file called "test":

cat test
123|4444
4546|4444
666666|5678
7777777|7890675::1

I need to match the line with ::1. Of course, the real case is more complicated, so I can't simply search for "::1". I tried many variations, such as:

grep -E '^[0-9]|[0-9]:' test
grep -E '^[0-9]|[0-9]::1' test

But they return all lines:

123|4444
4546|4444
666666|5678
7777777|7890675::1

I am expecting to match just the last line. Any idea why that is?

This is bash on GNU/Linux.

Upvotes: 3

Views: 7338

Answers (4)

glenn jackman

Reputation: 247042

Another approach is to use a tool like awk, which can split each line into fields, and match the lines whose second field ends with "::1":

awk -F'|' '$2 ~ /::1$/' test

Upvotes: 0

dawg

Reputation: 104062

Given:

$ echo "$txt"
123|4444
4546|4444
666666|5678
7777777|7890675::1

Use repetition (+ means 'one or more') and character classes:

$ echo "$txt" | grep -E '^[[:digit:]]+[|][[:digit:]]+[:]+'
7777777|7890675::1

Since | is a metacharacter in extended regular expressions (it means alternation), it has to be either escaped (\|) or placed in a character class ([|]) to match literally.
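
To illustrate the point about escaping, here is a minimal sketch that runs both spellings of the literal pipe against the sample data. The variable names sample, bracket, and escaped are only for illustration, and the data is rebuilt with printf instead of being read from a file.

```shell
# Sample data from the question, held in a variable (name is illustrative).
sample=$(printf '%s\n' '123|4444' '4546|4444' '666666|5678' '7777777|7890675::1')

# Literal pipe written as a one-character bracket expression.
bracket=$(printf '%s\n' "$sample" | grep -E '^[0-9]+[|][0-9]+:')

# Literal pipe written with a backslash escape (literal under -E).
escaped=$(printf '%s\n' "$sample" | grep -E '^[0-9]+\|[0-9]+:')

echo "$bracket"
echo "$escaped"
```

With grep -E, both commands should print only the last line of the sample.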

Upvotes: 2

Daniel H

Reputation: 7453

There are two issues:

  1. The regex [0-9] matches any single digit. Since you have multiple digits, you need to replace those parts with [0-9]+, which matches one or more digits. If you want to allow an empty sequence with no digits, replace the + with a *, which means “zero or more”.
  2. The pipe character | means “alternation” in regex. What you provided will match either a digit at the start of the line, or a digit followed by a colon. Since every line satisfies at least one of those, you match every line. To get a literal | character, you can use either [|] or \|; the second form is the more common style.

Applying both of these, you get ^[0-9]+\|[0-9]+::1.
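
The difference between + and * from the first point can be seen with a small sketch. It assumes one extra, made-up line (|::1, with no digits on either side of the pipe) appended to the sample data; that line is not in the original file and is there only to show the contrast.

```shell
# Sample data plus one hypothetical line with empty digit runs: "|::1".
sample=$(printf '%s\n' '123|4444' '4546|4444' '666666|5678' '7777777|7890675::1' '|::1')

# + requires at least one digit on each side of the pipe.
plus=$(printf '%s\n' "$sample" | grep -cE '^[0-9]+\|[0-9]+::1')

# * also accepts empty digit runs, so the made-up line matches too.
star=$(printf '%s\n' "$sample" | grep -cE '^[0-9]*\|[0-9]*::1')

echo "plus=$plus star=$star"   # prints: plus=1 star=2
```

Here grep -c reports the number of matching lines, so the + version counts only the 7777777|7890675::1 line while the * version also counts the made-up one.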

Upvotes: 1

Piotr Findeisen

Reputation: 20770

The pipe needs to be escaped and you need to allow repeated digits:

grep -E '^[0-9]+\|[0-9]+:' test

Otherwise, the unescaped | splits the pattern into two alternatives, and ^[0-9] alone is all that needs to match for a line to be retained by grep.
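
One way to see this is to run each half of the original pattern on its own and count the matching lines. This is just a sketch using the sample data from the question; the variable names are illustrative.

```shell
# Sample data from the question, held in a variable (name is illustrative).
sample=$(printf '%s\n' '123|4444' '4546|4444' '666666|5678' '7777777|7890675::1')

# Left alternative of '^[0-9]|[0-9]:' on its own: a digit at line start.
left=$(printf '%s\n' "$sample" | grep -cE '^[0-9]')

# Right alternative on its own: a digit followed by a colon.
right=$(printf '%s\n' "$sample" | grep -cE '[0-9]:')

echo "left=$left right=$right"   # prints: left=4 right=1
```

Since the left alternative already matches all four lines, the combined pattern does too, no matter what follows the |.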

Upvotes: 4
