Reputation: 355
I want to extract the numbers following client_id and id and pair up client_id and id in each line.
For example, for the following lines of log,
User(client_id:03)) results:[RelatedUser(id:204, weight:10),_RelatedUser(id:491,_weight:10),_RelatedUser(id:29, weight: 20)
User(client_id:04)) results:[RelatedUser(id:209, weight:10),_RelatedUser(id:301,_weight:10)
User(client_id:05)) results:[RelatedUser(id:20, weight: 10)
I want to output
03 204
03 491
03 29
04 209
04 301
05 20
I know I need to use sed or awk. But I do not know exactly how.
Thanks
Upvotes: 7
Views: 3169
Reputation: 58558
This might work for you (GNU sed):
sed -r '/.*(\(client_id:([0-9]+))[^(]*\(id:([0-9]+)/!d;s//\2 \3\n\1/;P;D' file
/.*(\(client_id:([0-9]+))[^(]*\(id:([0-9]+)/!d
if the line doesn't contain the intended strings, delete it.
s//\2 \3\n\1/
rearrange the line by copying the client_id and moving the first id ahead, thus reducing the line for successive iterations.
P
print up to the introduced newline.
D
delete up to the introduced newline, then restart the cycle on what remains.
Upvotes: 3
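To try the P/D approach from the GNU sed answer above, here is a minimal sketch that feeds it the three sample lines from the question. The function name pair_ids_sed and the variable sample are made up for illustration, and the script assumes GNU sed (for -r and for \n in the replacement).

```shell
#!/bin/sh
# Sample log lines copied from the question.
sample='User(client_id:03)) results:[RelatedUser(id:204, weight:10),_RelatedUser(id:491,_weight:10),_RelatedUser(id:29, weight: 20)
User(client_id:04)) results:[RelatedUser(id:209, weight:10),_RelatedUser(id:301,_weight:10)
User(client_id:05)) results:[RelatedUser(id:20, weight: 10)'

# Hypothetical helper: same commands as the one-liner, one per line.
pair_ids_sed() {
    sed -r '
        # Delete the line unless a client_id is followed later by an (id:N entry.
        /.*(\(client_id:([0-9]+))[^(]*\(id:([0-9]+)/!d
        # Empty regex reuses the one above: write "client_id id", a newline,
        # then keep "(client_id:N" in front of the rest of the line.
        s//\2 \3\n\1/
        # Print the part before the newline.
        P
        # Delete that part and rerun the script on the remainder.
        D
    ' "$@"
}

printf '%s\n' "$sample" | pair_ids_sed
```

Each pass consumes one id, so a line with three ids goes through the script three times before the leftover text is deleted.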
Reputation: 47219
I would prefer awk for this, but if you were wondering how to do this with sed, here's one way that works with GNU sed.
parse.sed
/client_id/ {
:a
s/(client_id:([0-9]+))[^(]+\(id:([0-9]+)([^\n]+)(.*)/\1 \4\5\n\2 \3/
ta
s/^[^\n]+\n//
}
Run it like this:
sed -rf parse.sed infile
Or as a one-liner:
<infile sed -r '/client_id/ { :a; s/(client_id:([0-9]+))[^(]+\(id:([0-9]+)([^\n]+)(.*)/\1 \4\5\n\2 \3/; ta; s/^[^\n]+\n//; }'
Output:
03 204
03 491
03 29
04 209
04 301
05 20
The idea is to repeatedly match client_id:([0-9]+) and id:([0-9]+) pairs and put them at the end of pattern space. On each pass the id:([0-9]+) is removed.
The final replace removes left-overs from the loop.
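As a rough illustration of that loop, the sketch below runs the same script on the second sample line. The function name pair_ids_loop and the variable line are hypothetical, and GNU sed is assumed.

```shell
#!/bin/sh
# Second sample line from the question.
line='User(client_id:04)) results:[RelatedUser(id:209, weight:10),_RelatedUser(id:301,_weight:10)'

# Hypothetical helper wrapping the parse.sed commands.
pair_ids_loop() {
    sed -r '
        /client_id/ {
            :a
            # Cut one "(id:N" out of the first line and append "client_id N"
            # as a new last line of pattern space.
            s/(client_id:([0-9]+))[^(]+\(id:([0-9]+)([^\n]+)(.*)/\1 \4\5\n\2 \3/
            # Jump back to :a while the substitution keeps succeeding.
            ta
            # Drop the first line, which now holds only left-over text.
            s/^[^\n]+\n//
        }
    ' "$@"
}

printf '%s\n' "$line" | pair_ids_loop
```

After the first pass pattern space ends in a new line "04 209", after the second pass in "04 301", and the third attempt fails because no "(id:" is left, so the loop exits.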
Upvotes: 2
Reputation: 47367
Here's an awk script that works (I put it on multiple lines and made it a bit more verbose so you can see what's going on):
#!/bin/bash
awk 'BEGIN{FS="[\(\):,]"}
/client_id/ {
    cid="no_client_id"
    for (i=1; i<NF; i++) {
        if ($i == "client_id") {
            cid = $(i+1)
        } else if ($i == "id") {
            id = $(i+1);
            print cid OFS id;
        }
    }
}' input_file_name
Output:
03 204
03 491
03 29
04 209
04 301
05 20
Explanation:
awk 'BEGIN{FS="[\(\):,]"}
invoke awk, using ( ) : and , as delimiters to separate your fields.
/client_id/ {
only do the following for the lines that contain client_id.
for (i=1; i<NF; i++) {
iterate through the fields on each line, one field at a time.
if ($i == "client_id") { cid = $(i+1) }
if the field we are currently on is client_id, then its value is the next field in order.
else if ($i == "id") { id = $(i+1); print cid OFS id; }
otherwise, if the field we are currently on is id, then print the client_id and id pair to stdout.
input_file_name
supply the name of your input file as the first argument to the awk script.
Upvotes: 4
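To see why the awk answer above reads the value from $(i+1), it can help to print every field with its index. This is only a sketch: show_fields and line are made-up names, and the separator is written as [():,] without backslashes, on the assumption that parentheses are literal inside a bracket expression.

```shell
#!/bin/sh
# Third sample line from the question.
line='User(client_id:05)) results:[RelatedUser(id:20, weight: 10)'

# Hypothetical helper: print each field as index=[value].
show_fields() {
    awk 'BEGIN { FS = "[():,]" }
         {
             # Walk every field, including empty ones between adjacent separators.
             for (i = 1; i <= NF; i++)
                 printf "%d=[%s]\n", i, $i
         }' "$@"
}

printf '%s\n' "$line" | show_fields
```

On this line field 2 is client_id and field 3 is 05, while field 7 is id and field 8 is 20, so the value always sits one field after its name.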
Reputation: 54572
This may work for you:
awk -F "[):,]" '{ for (i=2; i<=NF; i++) if ($i ~ /id/) print $2, $(i+1) }' file
Results:
03 204
03 491
03 29
04 209
04 301
05 20
Upvotes: 5