usual me

Reputation: 8778

How to use grep to extract multiple groups

Say I have this file data.txt:

a=0,b=3,c=5
a=2,b=0,c=4
a=3,b=6,c=7

I want to use grep to extract 2 columns corresponding to the values of a and c:

0 5
2 4
3 7

I know how to extract each column separately:

grep -oP 'a=\K([0-9]+)' data.txt
0
2
3

And:

grep -oP 'c=\K([0-9]+)' data.txt
5
4
7

But I can't figure out how to extract both groups at once. I tried the following, which didn't work:

grep -oP 'a=\K([0-9]+),.+c=\K([0-9]+)' data.txt
5
4
7

Upvotes: 8

Views: 11386

Answers (3)

Avinash Raj

Reputation: 174696

You could try the grep command below. Note, however, that grep prints each match on its own line, so you won't get the format shown in the question.

$ grep -oP 'a=\K([0-9]+)|c=\K([0-9]+)' file
0
5
2
4
3
7

To get that format, pipe the output of grep to paste or a similar command.

$ grep -oP 'a=\K([0-9]+)|c=\K([0-9]+)' file | paste -d' ' - -
0 5
2 4
3 7
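
One caveat worth illustrating: paste -d' ' - - simply joins every two lines it reads, so the pairing relies on grep emitting exactly two matches per input line. The sketch below does not run grep; it simulates grep's output with printf, assuming a hypothetical input whose second line had no c= field (the out variable is only for this illustration).

```shell
# Simulated `grep -o` output: the second input line is assumed to have had
# no c= field, so five values arrive instead of six.
out=$(printf '%s\n' 0 5 2 3 7 | paste -d' ' - -)

# From the second row on, values from different input lines get paired.
printf '%s\n' "$out"
```

If some lines may lack one of the fields, a per-line tool such as sed or awk is probably the safer choice.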

Upvotes: 9

fedorqui

Reputation: 289505

I am also curious whether grep can do this. \K discards everything matched so far from the reported match, so if you use it twice in the same expression, only the text after the last \K is shown. Hence, it has to be done differently.

In the meantime, I would use sed:

sed -r 's/^a=([0-9]+).*c=([0-9]+)$/\1 \2/' file

It captures the digits after a= and c=, on lines that start with a= and have nothing after the digits following c=.

For your input, it returns:

0 5
2 4
3 7
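
Note that with a plain s command, lines that do not match the pattern are printed unchanged. If that is not wanted, one option is to combine -n with the p flag so that only substituted lines are printed. This is a minimal sketch, not part of the command above: the temporary file, the extra x=9,y=1 line, and the tmp and out names are assumptions made for the illustration, and -E is used as the more portable spelling of -r.

```shell
# Sample data plus one line that does not fit the a=...c=... shape.
tmp=$(mktemp)
printf '%s\n' 'a=0,b=3,c=5' 'x=9,y=1' 'a=3,b=6,c=7' > "$tmp"

# -n turns off automatic printing; the trailing p prints a line only
# when the substitution succeeded on it.
out=$(sed -nE 's/^a=([0-9]+).*c=([0-9]+)$/\1 \2/p' "$tmp")
printf '%s\n' "$out"

rm -f "$tmp"
```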

Upvotes: 8

aelor

Reputation: 11116

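Use this:

awk -F'[=,]' '{print $2" "$6}' data.txt

I set the field separator to either = or , so each line is split on both, and the values of a and c end up in fields 2 and 6. The separator is quoted so the shell does not treat [=,] as a glob pattern.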
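
The positional approach depends on a always being the first key and c the third. If the order of keys can vary, a possible variant is to look the values up by name. This is only a sketch under that assumption: the reordered second line, the temporary file, and the val, kv, tmp and out names are made up for the illustration.

```shell
# Sample data where the second line lists the keys in a different order.
tmp=$(mktemp)
printf '%s\n' 'a=0,b=3,c=5' 'c=4,a=2,b=0' > "$tmp"

out=$(awk -F, '{
    split("", val)                 # portable way to empty the array for each line
    for (i = 1; i <= NF; i++) {
        split($i, kv, "=")         # kv[1] is the key, kv[2] is the value
        val[kv[1]] = kv[2]
    }
    print val["a"], val["c"]       # a comma in print inserts OFS (a space by default)
}' "$tmp")
printf '%s\n' "$out"

rm -f "$tmp"
```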

Upvotes: 1
