kenorb

Reputation: 166319

How to grep and match the first occurrence of a line?

Given the following content:

title="Bar=1; Fizz=2; Foo_Bar=3;"

I'd like to match the first occurrence of the Bar value, which is 1. Also, I don't want to rely on the surroundings of the word (like the double quote in front), because the pattern could be in the middle of the line.

Here is my attempt:

$ grep -o -m1 'Bar=[ ./0-9a-zA-Z_-]\+' input.txt
Bar=1
Bar=3

I've used -m/--max-count, which is supposed to stop reading the file after num matches, but it didn't work. Why doesn't this option work as expected?

I could combine it with head -n1, but I'm wondering whether it is possible to achieve this with grep alone.

Upvotes: 4

Views: 6076

Answers (4)

anubhava

Reputation: 784898

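Using the Perl-compatible regex flavor (-P) in GNU grep, you can use:

grep -oP '^(?:(?!Bar=\d+).)*\KBar=\d+' <<< "Bar=1; Fizz=2; Foo_Bar=3;"
Bar=1

(?:(?!Bar=\d+).)* matches zero or more characters, none of which starts a Bar=\d+ match, so the Bar=\d+ that follows is guaranteed to be the first one on the line. \K then drops that prefix from the reported match, so this also works when Bar= is not at the start of the line (as in title="Bar=1; ...").

If the intent is to print just the value after =, use:

grep -oP '^(?:(?!Bar=\d+).)*Bar=\K\d+' <<< "Bar=1; Fizz=2; Foo_Bar=3;"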
1

Upvotes: 3

Walter A

Reputation: 19982

First use one grep to strip everything before the first Bar=, so that its output starts with Bar, and then match only the Bar anchored at the start of the line:

grep -o "Bar=.*" input.txt | grep -o -m1 "^Bar=[ ./0-9a-zA-Z_-]\+"

For a large file, you can add -m1 to the first grep as well, so that it stops reading after the first matching line:

grep -o -m1 "Bar=.*" input.txt | grep -o -m1 "^Bar=[ ./0-9a-zA-Z_-]\+"
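To see what each stage contributes, here is a small sketch that feeds the sample line from the question through the same two greps via stdin instead of input.txt. The variable names (line, stage1, first) are only illustrative.

```shell
# Sample line from the question (illustrative variable name).
line='title="Bar=1; Fizz=2; Foo_Bar=3;"'

# Stage 1: keep everything from the first "Bar=" to the end of the line.
stage1=$(printf '%s\n' "$line" | grep -o 'Bar=.*')
printf '%s\n' "$stage1"

# Stage 2: the ^ anchor restricts the match to the Bar= that now starts the line.
first=$(printf '%s\n' "$stage1" | grep -o -m1 '^Bar=[ ./0-9a-zA-Z_-]\+')
printf '%s\n' "$first"
```

The first printf shows the line with the title=" prefix removed, and the second shows Bar=1 only, because Foo_Bar=3 is no longer at the start of the line.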

Upvotes: 2

mklement0

Reputation: 437080

grep is line-oriented, so it apparently counts matches in terms of lines when using -m[1] - even if multiple matches are found on the line (and are output individually with -o).

While I don't know how to solve the problem with grep alone (except with GNU grep's -P option - see anubhava's helpful answer), awk can do it (in a portable manner):

$ awk -F'Bar=|;' '{ print $2 }' <<<"Bar=1; Fizz=2; Foo_Bar=3;"
1

Use print "Bar=" $2, if the field name should be included.
Also note that the <<< method of providing input via stdin (a so-called here-string) is specific to Bash, Ksh, Zsh; if POSIX compliance is a must, use echo "..." | grep ... instead.
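If Bar= may be preceded by other fields on the line, a fixed field number no longer picks the right value. One possible alternative, offered as a sketch rather than as part of the command above, is awk's POSIX match() function, which reports the leftmost match through RSTART and RLENGTH. The sample line with a leading x=0 field and the digits-only pattern are assumptions made for illustration.

```shell
# Hypothetical sample: Bar= is neither the first field nor at the line start.
line='title="x=0; Bar=1; Fizz=2; Foo_Bar=3;"'

# match() finds the leftmost occurrence; substr() extracts exactly that text.
first=$(printf '%s\n' "$line" |
  awk 'match($0, /Bar=[0-9]+/) { print substr($0, RSTART, RLENGTH) }')
printf '%s\n' "$first"
```

This prints Bar=1. As with the other patterns in this thread, it would also pick up a Bar= that is part of a longer name such as Foo_Bar= if that happened to come first on the line.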


[1] Options -m and -o are not part of the grep POSIX spec., but both GNU and BSD/OSX grep support them and have chosen to implement the line-based logic.
This is consistent with the standard -c option, which counts "selected lines", i.e., the number of matching lines:
grep -o -c 'Bar=[ ./0-9a-zA-Z_-]\+' <<<"Bar=1; Fizz=2; Foo_Bar=3;" yields 1.
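The line-based behavior of -m can be observed with two input lines, as in the sketch below. The second input line and the simplified digits-only pattern are made up for the demonstration, and printf with a pipe is used instead of a here-string to stay shell-neutral.

```shell
# Two lines, each containing two "Bar=" matches (second line is hypothetical).
input='Bar=1; Fizz=2; Foo_Bar=3;
Bar=4; Fizz=5; Foo_Bar=6;'

# -m1 stops after the first matching *line*; -o still prints every match on it.
first_line_matches=$(printf '%s\n' "$input" | grep -o -m1 'Bar=[0-9][0-9]*')
printf '%s\n' "$first_line_matches"
```

Both Bar=1 and Bar=3 are printed, while the matches on the second line are never reached.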

Upvotes: 2

Krzysztof Krasoń

Reputation: 27466

You can use grep -P (assuming you are on GNU grep) with a positive lookahead ((?=.*Bar)), which only accepts a match that is followed by another Bar later on the line, to achieve that in grep:

echo "Bar=1; Fizz=2; Foo_Bar=3;" | grep -oP -m 1 'Bar=[ ./0-9a-zA-Z_-]+(?=.*Bar)'

Upvotes: 1
