Reputation: 1314
I was wondering if, with egrep ((GNU grep) 2.5.1), I can select a part of the matched text, something like:
egrep '^([a-zA-Z.-]+)[0-9]+' ./file.txt
So I get only the part which matched, between the brackets, something like
house.com
Instead of the whole line like I usually get:
house.com112
Assuming I have a line with house.com112 in my file.txt.
(Actually, this regular expression is just an example; I just want to know if I can print only a part of the whole line.)
I do know in some languages, such as PHP, Perl or even AWK I can, but I do not know if I can with egrep.
Thank you in advance!
Upvotes: 6
Views: 11177
Reputation: 28549
Use a lookahead in the regular expression (this needs grep's -P option for Perl-compatible regexes):
$ echo 'house.com112' | grep -Po '([a-zA-Z.]+)(?=\d+)'
house.com
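To see that the lookahead really restricts the match to names followed by a digit, here is a small sketch. The sample lines are made up for illustration, and it assumes a grep built with -P (PCRE) support, which not every build has. I also anchored the pattern with ^ and included the hyphen from the question's character class.

```shell
# Hypothetical sample input: only the first and third lines end in digits.
# -P enables Perl-compatible regexes, -o prints only the matched text.
# (?=[0-9]) requires a digit right after the name without consuming it.
result=$(printf '%s\n' 'house.com112' 'nodigits.org' 'example.net7' |
  grep -Po '^[a-zA-Z.-]+(?=[0-9])')
printf '%s\n' "$result"
```

The line without trailing digits produces no output, because the lookahead fails there.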
Upvotes: 2
Reputation: 12613
The first part of your regex is more general than the second half, and since + is greedy, the second [0-9]+ will only match the last digit (thanks Paul). If you can make your first half more specific (e.g. if you know it will end in a TLD) you could do it.
There's an amazingly cool tool called ack which is basically grep with Perl regexes. I'm not sure if it's possible to use in your case, but if you can do what you want in Perl, you can do it with ack.
Edit:
Why not just drop the end of the regex? Are there false positives if you do that? If so, you could pipe the results to egrep again with the first half of the regex only.
Also, on the off chance that you don't know about it, the -o flag will output only the matched portion of a given line. This seems to be what you are asking about.
Upvotes: 3
Reputation: 342609
You might want to try the -o and -w flags in grep. egrep is "deprecated", so use grep -E.
$ echo "test house.com house.com112"| grep -Eow "house.com"
house.com
The basic idea is to go through each word and test for equality.
$ echo "test house.com house.com112"| awk '{for(i=1;i<=NF;i++){ if($i=="house.com") print $i}}'
house.com
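If you don't know the exact word in advance, awk can also cut the part out of a regex match. This is only a sketch using the pattern from the question and made-up sample lines: match() sets RSTART and RLENGTH, and sub() then drops the trailing digits.

```shell
# Hypothetical sample input: one line ends in digits, the other does not.
result=$(printf '%s\n' 'house.com112' 'nodigits.org' |
  awk 'match($0, /^[a-zA-Z.-]+[0-9]+/) {
         s = substr($0, RSTART, RLENGTH)   # the whole match, e.g. house.com112
         sub(/[0-9]+$/, "", s)             # strip the trailing digits
         print s
       }')
printf '%s\n' "$result"
```

Lines that do not match the full pattern are skipped, so only house.com is printed.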
Upvotes: 3
Reputation: 838666
Use sed to modify the result after grep has found the lines that match:
grep -E '^[a-zA-Z.-]+[0-9]+' ./file.txt | sed 's/[0-9]\+$//'
Or if you want to stick with only grep, you can use grep with the -o switch instead of sed:
grep -E '^[a-zA-Z.-]+[0-9]+' ./file.txt | grep -Eo '^[a-zA-Z.-]+'
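Since the question is really about printing a bracketed group, a single sed call with a capture group may be the closest fit. This is a sketch with made-up sample lines; it assumes a sed that accepts -E for extended regexes (recent GNU and BSD sed do, older GNU sed uses -r instead).

```shell
# Hypothetical sample input: one line ends in digits, the other does not.
# -n suppresses default output; the p flag prints only lines where the
# substitution succeeded. \1 is the text captured by the first (...) group.
result=$(printf '%s\n' 'house.com112' 'nodigits.org' |
  sed -nE 's/^([a-zA-Z.-]+)[0-9]+.*$/\1/p')
printf '%s\n' "$result"
```

This does the filtering and the extraction in one step, so no second grep is needed.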
Upvotes: 11