
Bash grep stopping at first match

[email protected]>, size=35020827, class=-30, nrcpts=1, msgid=<2m96JLQblfm/fh.01u3YnFYK0bc3pmOExg2vA.totl.example.com>, proto=ESMTP, daemon=MTA-v6, relay=lemur.totl.example.com
[email protected]>, size=18071179, class=-30, nrcpts=1, msgid=<BhaYKoWuhDhrUQcT5.+tF6eKTCu0459KjSflNxLg.shoe-bags.example.com>, proto=ESMTP, daemon=MTA-v6, relay=dog.shoe-bags.example.com
[email protected]>, size=27057917, class=-30, nrcpts=1, msgid=<VaD1xW8SduAYImck.Mbx1MBcKTjBPlQpcaDhJRA.stellar-patrol.example.com>, proto=ESMTP, daemon=MTA-v6, relay=feinstein.stellar-patrol.example.com
[email protected]>, size=15212380, class=-30, nrcpts=1, msgid=<4wN8i90XT.BIdywWoKxNjeEM1q.planet-express.example.com>, proto=ESMTP, daemon=MTA-v6, relay=fry.planet-express.example.com
[email protected]>, size=44656174, class=-30, nrcpts=1, msgid=<1froj29vndf7h0.Qzoi+1hDEQOVp1frnQvWO.blackmesa.example.com>, proto=ESMTP, daemon=MTA-v6, relay=barney.blackmesa.example.com
[email protected]>, size=4556372, class=-30, nrcpts=1, msgid=<jnugzy+Z.L82rx1mhoSXi0RmK/yNP.stellar-patrol.example.com>, proto=ESMTP, daemon=MTA-v6, relay=feinstein.stellar-patrol.example.com
[email protected]>, size=35391498, class=-30, nrcpts=1, msgid=<fXr7+HM1U7ZpbJqxf.iJs6q9r.macrohard.example.com>, proto=ESMTP, daemon=MTA-v6, relay=corporate-mail-01.macrohard.example.com
[email protected]>, size=46296174, class=-30, nrcpts=1, msgid=<UJHE3Y4uEn.JBT3RESrNYL+fH5dFTGt5A.lawanda.example.com>, proto=ESMTP, daemon=MTA-v6, relay=achilles.lawanda.example.com
[email protected]>, size=12197030, class=-30, nrcpts=1, msgid=<gpq6lYSHHC67d.ZjyKUitfcPwOlA/OEc++.feddit.example.com>, proto=ESMTP, daemon=MTA-v6, relay=kittin.feddit.example.com

I wish to extract just the email address part of each line, for example [email protected]

I am currently using this technique:

cat file | grep -o 'user.*?com'

However, since '.com' is also at the end of the line, I occasionally still get the whole line returned.

My output should look something like:

[email protected]
[email protected]
[email protected]
... etc

How would this be possible? Many thanks for your help.

Upvotes: 2

Answers (3)

gniourf_gniourf

Reputation: 46813

This should do:

grep -o 'user[^[:space:]]\+\.com' file

Note that cat is not needed here: grep reads the file directly.

This uses the character class [:space:]. What I'm saying is that I want everything that starts with user, that ends with .com and that contains only non-space characters (and at least one) in between ([^[:space:]]\+).
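
As a minimal sketch of how that pattern behaves, the snippet below runs it against one made-up line shaped like the log entries in the question. The `from=<` prefix and the address itself are assumptions, since the question does not show them in full. It also assumes GNU grep, where `\+` is available in basic regular expressions.

```shell
# Hypothetical sample line; the address and the "from=<" prefix are assumptions.
line='from=<user17@totl.example.com>, size=35020827, proto=ESMTP, relay=lemur.totl.example.com'

# [^[:space:]]\+ cannot cross a space, so the match cannot run on to the
# relay host name at the end of the line.
result=$(printf '%s\n' "$line" | grep -o 'user[^[:space:]]\+\.com')
printf '%s\n' "$result"
```

The trailing `>,` is dropped because the match has to end in `.com`, so grep backtracks to the last `.com` inside the run of non-space characters.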

Regarding your solution: you need the -P switch for grep to use Perl-compatible regexps, so that .*? is interpreted as "match anything, non-greedily":

grep -Po 'user.*?com' file

would work.
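
To make the greedy versus non-greedy difference concrete, here is a small sketch on a made-up line (the address and the `from=<` prefix are assumptions). It assumes a GNU grep built with PCRE support, which is what `-P` requires.

```shell
# Hypothetical sample line; the address and the "from=<" prefix are assumptions.
line='from=<user17@totl.example.com>, size=35020827, proto=ESMTP, relay=lemur.totl.example.com'

# Greedy .* (plain grep): runs to the LAST "com" on the line.
greedy=$(printf '%s\n' "$line" | grep -o 'user.*com')

# Lazy .*? (needs -P): stops at the FIRST "com" after "user".
lazy=$(printf '%s\n' "$line" | grep -Po 'user.*?com')

printf 'greedy: %s\nlazy:   %s\n' "$greedy" "$lazy"
```

The greedy version returns everything from `user` to the end of the line, which is why a whole line can come back when the relay host also ends in `com`.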

Now I hope you don't have any guests with email [email protected] or similar, otherwise this one will fail here, as you'd obtain just user42@coolcom :(
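
The failure mode can be reproduced with a made-up address whose host name contains "com" before the final ".com". The address below is purely hypothetical and chosen for illustration; GNU grep with `-P` support is assumed.

```shell
# Hypothetical address: "coolcompany" contains "com" before the real ".com".
line='from=<user42@coolcompany.example.com>, size=1024, relay=mx.coolcompany.example.com'

# Lazy match stops at the first "com" it sees, cutting the address short.
lazy=$(printf '%s\n' "$line" | grep -Po 'user.*?com')

# The bracket-expression pattern requires a literal ".com" and stays
# within the run of non-space characters, so it keeps the full address.
strict=$(printf '%s\n' "$line" | grep -o 'user[^[:space:]]\+\.com')

printf 'lazy:   %s\nstrict: %s\n' "$lazy" "$strict"
```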

Parsing email addresses with a regex is not a simple task at all.

Upvotes: 2

Mark Plotnick

Reputation: 10251

The .*? pattern only works if you give grep the -P option, which enables Perl-style regexps. Add that and it should work.

Upvotes: 0

slider

Reputation: 12990

You could use awk to get parts of that line. In your case, it would be something like:

grep -o 'user.*com' file | awk -F'>' '{print $1}'

For more functionality, you should check out the GNU Awk User Guide http://www.gnu.org/software/gawk/manual/gawk.html
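
As a rough sketch of doing the extraction with awk alone, the snippet below assumes the address is the first thing enclosed in angle brackets on each line. That is an assumption: the sample lines show a `>` right after the address, but the start of each line (here guessed as `from=<`) is not visible in the question.

```shell
# Hypothetical sample lines; addresses and the "from=<" prefix are assumptions.
lines='from=<user17@totl.example.com>, size=35020827, msgid=<abc.totl.example.com>, relay=lemur.totl.example.com
from=<user8@blackmesa.example.com>, size=44656174, msgid=<def.blackmesa.example.com>, relay=barney.blackmesa.example.com'

# Split each line on "<" or ">"; field 2 is the text inside the first pair
# of angle brackets.
result=$(printf '%s\n' "$lines" | awk -F'[<>]' '{print $2}')
printf '%s\n' "$result"
```

This avoids the regex question entirely, at the cost of depending on the position of the address in the line.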

Upvotes: 0
