gkmohit

Reputation: 710

grep for a specific pattern in a file?

I have a file, textFile.txt:

[email protected]
abc_aer@
@avret
[email protected]
[email protected]
qwe.caer

I want to grep for specific lines:

[email protected]
[email protected]
[email protected]

That is, the ones that match

[a-z]_[a-z]@[a-z].[a-z]

but the part before the @ can have any number of "_" characters.

So far, this is what I have:

grep "[a-z]_[a-z]@[a-z].[a-z]" textFile.txt

But I got only one line as the output.

[email protected]

Is there a better way to do this? :)

Upvotes: 1

Views: 148

Answers (5)

Jotne

Reputation: 41446

This regex should extract most valid email addresses from a text file:

grep -E -o "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b" file
[email protected]
[email protected]
[email protected]

This greps for a pattern like this: [email protected]_more_text
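
To see what the -o flag changes, here is a minimal sketch. The line and the address in it are made up for illustration, and the \b word-boundary markers from the command above are left out to keep the pattern within plain extended-regex syntax.

```shell
# Hypothetical input: a made-up address surrounded by other text.
line='contact: abc_def@ghi.com, thanks'

# Same character classes as the command above, without the \b markers.
pattern='[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}'

# Without -o, grep prints the whole line that contains a match.
whole=$(printf '%s\n' "$line" | grep -E "$pattern")

# With -o, grep prints only the matched part of the line.
only=$(printf '%s\n' "$line" | grep -E -o "$pattern")

printf 'whole line: %s\n' "$whole"
printf 'match only: %s\n' "$only"
```

The first result is the full input line; the second is just abc_def@ghi.com.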

Upvotes: 0

Tsa579

Reputation: 11

When you write [a-z], it matches exactly one character from that set. That's why your grep call returns only [email protected]: that line has just one character between the _ and the @.

To match more than one character, you can use + or *. + means one or more characters from the set, and * means zero or more. Also, an unescaped . matches any character.

So something like:

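grep "[a-z]\+_[a-z]\+@[a-z]\+\.[a-z]\+" textFile.txt

would work for this. There are shorter, less specific ways of doing this as well (as other answers have shown).

Note the backslashes before the + signs and the period. In grep's default (basic) syntax, a bare + is a literal plus sign and GNU grep reads \+ as "one or more"; \. matches a literal period.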
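
As a rough illustration of the difference, here is a small sketch. The sample lines are made up (they are not the contents of the asker's file), and it uses grep -E, where + is written without a backslash.

```shell
# Hypothetical sample data, made up for illustration.
sample=$(printf '%s\n' 'a_b@c.d' 'abc_def@ghi.com' 'abc_aer@' '@avret' 'qwe.caer')

# Each [a-z] matches exactly one letter, so only the short address is selected.
single=$(printf '%s\n' "$sample" | grep -E '[a-z]_[a-z]@[a-z]\.[a-z]')

# [a-z]+ matches one or more letters, so both addresses are selected.
multi=$(printf '%s\n' "$sample" | grep -E '[a-z]+_[a-z]+@[a-z]+\.[a-z]+')

printf 'single letters:\n%s\n\n' "$single"
printf 'one or more letters:\n%s\n' "$multi"
```

The first pattern selects only a_b@c.d; the second selects a_b@c.d and abc_def@ghi.com.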

Upvotes: 0

John1024

Reputation: 113814

The following selects lines that have an underline character followed by one or more letters immediately before the at-sign, and one or more letters followed by a literal period immediately after the at-sign:

$ grep '_[a-z]\+@[a-z]\+\.' textFile.txt
[email protected]
[email protected]
[email protected]

Notes

  • An unescaped period matches any character. To match a literal period, escape it with a backslash: \.

    Thus, @[a-z].[a-z] matches an at-sign, followed by a letter, followed by anything at all, followed by a letter.

  • [a-z] matches a single letter. Thus _[a-z]@ would match only if there was exactly one letter between the underline and the at-sign. To match one or more letters, use [a-z]\+.

    @[a-z]\+\. will match an at-sign, followed by one or more letters, followed by a literal period character.
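
The effect of escaping the period can be checked with a small sketch like the one below. Both sample lines are made up for illustration, and grep -E is used so that + needs no backslash.

```shell
# Hypothetical sample lines: the second has an "x" where the period should be.
sample=$(printf '%s\n' 'abc_def@ghi.com' 'abc_def@ghixcom')

# Unescaped period: it matches any character, so both lines are selected.
loose=$(printf '%s\n' "$sample" | grep -E '@[a-z]+.[a-z]')

# Escaped period: only the line with a literal "." after the letters is selected.
strict=$(printf '%s\n' "$sample" | grep -E '@[a-z]+\.[a-z]')

printf 'unescaped:\n%s\n\n' "$loose"
printf 'escaped:\n%s\n' "$strict"
```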

Upvotes: 0

anubhava

Reputation: 784898

I would suggest keeping it simple by checking that each line contains exactly one @, with at least one character on either side of it:

grep -E '^[^@]+@[^@]+$' file
[email protected]
[email protected]
[email protected]
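
For a feel of what this pattern accepts and rejects, here is a minimal sketch on made-up lines (not the asker's actual file). Note that it does not require an underscore or a period, so a line such as plain@host is also selected.

```shell
# Hypothetical sample data, made up for illustration.
sample=$(printf '%s\n' 'abc_def@ghi.com' 'abc_aer@' '@avret' 'qwe.caer' 'a@b@c' 'plain@host')

# Keep lines with exactly one @ and at least one character on each side of it.
kept=$(printf '%s\n' "$sample" | grep -E '^[^@]+@[^@]+$')

printf '%s\n' "$kept"
```

This prints abc_def@ghi.com and plain@host; the lines with nothing on one side of the @, no @ at all, or two @ signs are dropped.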

Upvotes: 0

peter

Reputation: 15079

You can simply add the _ inside the bracket expression, giving [a-z_], so the new command is:

grep "[a-z_]@[a-z].[a-z]" textFile.txt

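or, if you want the part before the @ to start with a letter rather than an _, anchor the pattern and allow any number of letters or underscores after the first letter:

grep "^[a-z][a-z_]*@[a-z].[a-z]" textFile.txt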
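
Because grep matches anywhere in the line, [a-z_]@ only constrains the single character right before the @. The sketch below shows this on made-up lines and compares it with one possible stricter pattern, which is an assumption about what the asker wants (a leading letter and at least one underscore before the @), not something taken from the question.

```shell
# Hypothetical sample data, made up for illustration.
sample=$(printf '%s\n' 'abc_def@ghi.com' 'abcdef@ghi.com' '_abc@ghi.com')

# Only the one character before the @ is checked, so all three lines are selected.
loose=$(printf '%s\n' "$sample" | grep '[a-z_]@[a-z].[a-z]')

# Anchored: a leading letter, at least one underscore before the @,
# then letters, a literal period, and letters up to the end of the line.
strict=$(printf '%s\n' "$sample" | grep -E '^[a-z][a-z_]*_[a-z_]*@[a-z]+\.[a-z]+$')

printf 'loose:\n%s\n\n' "$loose"
printf 'strict:\n%s\n' "$strict"
```

The loose pattern selects every sample line; the stricter one selects only abc_def@ghi.com.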

Upvotes: 1
