Reputation:
[email protected]>, size=35020827, class=-30, nrcpts=1, msgid=<2m96JLQblfm/fh.01u3YnFYK0bc3pmOExg2vA.totl.example.com>, proto=ESMTP, daemon=MTA-v6, relay=lemur.totl.example.com
[email protected]>, size=18071179, class=-30, nrcpts=1, msgid=<BhaYKoWuhDhrUQcT5.+tF6eKTCu0459KjSflNxLg.shoe-bags.example.com>, proto=ESMTP, daemon=MTA-v6, relay=dog.shoe-bags.example.com
[email protected]>, size=27057917, class=-30, nrcpts=1, msgid=<VaD1xW8SduAYImck.Mbx1MBcKTjBPlQpcaDhJRA.stellar-patrol.example.com>, proto=ESMTP, daemon=MTA-v6, relay=feinstein.stellar-patrol.example.com
[email protected]>, size=15212380, class=-30, nrcpts=1, msgid=<4wN8i90XT.BIdywWoKxNjeEM1q.planet-express.example.com>, proto=ESMTP, daemon=MTA-v6, relay=fry.planet-express.example.com
[email protected]>, size=44656174, class=-30, nrcpts=1, msgid=<1froj29vndf7h0.Qzoi+1hDEQOVp1frnQvWO.blackmesa.example.com>, proto=ESMTP, daemon=MTA-v6, relay=barney.blackmesa.example.com
[email protected]>, size=4556372, class=-30, nrcpts=1, msgid=<jnugzy+Z.L82rx1mhoSXi0RmK/yNP.stellar-patrol.example.com>, proto=ESMTP, daemon=MTA-v6, relay=feinstein.stellar-patrol.example.com
[email protected]>, size=35391498, class=-30, nrcpts=1, msgid=<fXr7+HM1U7ZpbJqxf.iJs6q9r.macrohard.example.com>, proto=ESMTP, daemon=MTA-v6, relay=corporate-mail-01.macrohard.example.com
[email protected]>, size=46296174, class=-30, nrcpts=1, msgid=<UJHE3Y4uEn.JBT3RESrNYL+fH5dFTGt5A.lawanda.example.com>, proto=ESMTP, daemon=MTA-v6, relay=achilles.lawanda.example.com
[email protected]>, size=12197030, class=-30, nrcpts=1, msgid=<gpq6lYSHHC67d.ZjyKUitfcPwOlA/OEc++.feddit.example.com>, proto=ESMTP, daemon=MTA-v6, relay=kittin.feddit.example.com
I wish to extract just the email address part of each line, for example [email protected]
I am currently using this technique:
cat file | grep -o 'user.*?com'
However, since 'com' also occurs at the end of each line, I sometimes still get the whole line returned.
My expected output should look something like:
[email protected]
[email protected]
[email protected]
... etc
How would this be possible? Many thanks for your help.
Upvotes: 2
Views: 345
Reputation: 46813
This should do:
grep -o 'user[^[:space:]]\+\.com' file
Note that I don't need a cat here.
This uses the character class [:space:]. What I'm saying is that I want everything that starts with user, ends with .com, and contains only non-space characters (at least one) in between ([^[:space:]]\+).
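To see what the bracket expression does, here is a small sketch run on two made-up log lines. The addresses and the from=< prefix are invented for illustration and are not taken from the question's data; note also that \+ in a basic regex is a GNU grep extension, so other grep implementations may need grep -oE with a plain + instead.

```shell
# Hypothetical sample lines; the addresses and the from=< prefix are invented.
sample='from=<user1234@totl.example.com>, size=35020827, relay=lemur.totl.example.com
from=<user5678@blackmesa.example.com>, size=44656174, relay=barney.blackmesa.example.com'

# "user", then one or more non-space characters, then a literal ".com".
# The match cannot cross the space after the comma, so it stops at the
# last ".com" inside the address field.
result=$(printf '%s\n' "$sample" | grep -o 'user[^[:space:]]\+\.com')
printf '%s\n' "$result"
```

On these two lines the command prints only the two addresses, one per line, without the trailing > or the rest of the record.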
Regarding your solution: you need the -P switch for grep to use Perl's regexp, so that .*? is interpreted as match anything, non-greedily:
grep -Po 'user.*?com' file
would work.
Now I hope you don't have any guests with email [email protected] or similar, otherwise this one will fail here, as you'd obtain just user42@coolcom :(
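A quick way to see this failure is to run both patterns on one line whose domain contains "com" early on. The address below is hypothetical, chosen only to trigger the problem, and grep -P assumes a GNU grep built with PCRE support.

```shell
# Hypothetical line: the domain contains "com" before the real ".com" suffix.
line='from=<user42@coolcompany.example.com>, size=1, relay=mx.example.com'

# Non-greedy Perl pattern: stops at the first "com" it meets.
lazy=$(printf '%s\n' "$line" | grep -Po 'user.*?com')

# Bracket-expression pattern: runs to the last ".com" before the next space.
strict=$(printf '%s\n' "$line" | grep -o 'user[^[:space:]]\+\.com')

printf 'lazy:   %s\n' "$lazy"
printf 'strict: %s\n' "$strict"
```

Here the non-greedy pattern yields user42@coolcom, while the bracket-expression pattern yields the full user42@coolcompany.example.com.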
Parsing email addresses with a regex is not a simple task at all.
Upvotes: 2
Reputation: 10251
The .*? pattern only works if you give grep the -P option, which enables Perl-style regexps. Add that and it should work.
Upvotes: 0
Reputation: 12990
You could use awk to get parts of that line. In your case, it would be something like:
cat file | grep -o 'user.*?com' | awk -F',' '{print $1}'
For more functionality, you should check out the GNU Awk User Guide http://www.gnu.org/software/gawk/manual/gawk.html
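As an alternative sketch, awk can do the extraction on its own with its built-in match() function, which sets RSTART and RLENGTH for the matched text. This is one possible variant rather than the command given above, and the sample lines are invented for illustration.

```shell
# Hypothetical sample lines; the addresses and the from=< prefix are invented.
sample='from=<user1234@totl.example.com>, size=35020827, relay=lemur.totl.example.com
from=<user5678@blackmesa.example.com>, size=44656174, relay=barney.blackmesa.example.com'

# match() finds the leftmost-longest run of "user" + non-space characters + ".com";
# substr() then prints exactly that part of the line.
result=$(printf '%s\n' "$sample" |
  awk 'match($0, /user[^ ]+\.com/) { print substr($0, RSTART, RLENGTH) }')
printf '%s\n' "$result"
```

This avoids the extra cat and grep processes, and lines without a match are simply skipped.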
Upvotes: 0