Reputation: 2807
I have a file of about 150 lines, where each line is part of a URL. I wanted to extract 4 different parameters from each of the lines and put them into a file. Something like:
/secure/domain/new.aspx?id=620&utm_source=1034&utm_medium=cpc&utm_term=term1&try=1&v=3&utm_account=account_name&utm_campaign=campaign_name&utm_adgroup=adgroup&keyword=keyword1&pkw=pkw1&idimp=id&premt=premt1&gclid=id
As a trial, I did
awk '/pkw/,/&idimp/' file > output.txt
thinking that this would at least get me the pkw value, but it just returned the input file as is. What am I doing wrong? Also, how can I make it return all four values? I'm looking to get keyword, pkw, idimp and premt.
Edit: The expected output is a file containing the 4 values for each of the 150 lines in the input file. So for the line above:
keyword1 pkw1 id premt1
Even if I just get the 4 values in 4 different files, it would suffice.
Upvotes: 0
Views: 6509
Reputation: 247042
s='/helloworld/some/other/standard/URL/mumbo/jumbo/page.aspx?strings&that&I&am&not&interested&in&param1=value1&param2=value2&param3=value3&param4=value4&some&more&uninteresting&strings'
echo "$s" | grep -o 'param[1234]=[^&]*' | cut -d= -f2- | paste -d " " - - - -
value1 value2 value3 value4
Keeping up with the clarifications to the question:
s='/secure/domain/new.aspx?id=620&utm_source=1034&utm_medium=cpc&utm_term=term1&try=1&v=3&utm_account=account_name&utm_campaign=campaign_name&utm_adgroup=adgroup&keyword=keyword&pkw=pkw1&idimp=id&premt=premt1&gclid=id'
echo "$s" | grep -o '\<\(keyword\|pkw\|idimp\|premt\)=[^&]*' | cut -d= -f2- | paste -d " " - - - -
keyword pkw1 id premt1
The \< is a "start of word" anchor to avoid matching parameters like "fookeyword".
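To see what the anchor buys you, here is a small sketch that runs the same pattern with and without \<. The query string and its "fookeyword" parameter are made up for illustration, and \< is a GNU extension, so this assumes GNU grep.

```shell
# Hypothetical query string: "fookeyword" is a decoy parameter.
q='page.aspx?fookeyword=bad&keyword=good&pkw=pkw1'

# Without the anchor, "keyword=" also matches inside "fookeyword=".
unanchored=$(printf '%s\n' "$q" | grep -o 'keyword=[^&]*' | cut -d= -f2- | paste -sd ' ' -)

# With \<, the match has to begin at the start of a word, so the decoy is skipped.
anchored=$(printf '%s\n' "$q" | grep -o '\<keyword=[^&]*' | cut -d= -f2- | paste -sd ' ' -)

printf 'unanchored: %s\n' "$unanchored"
printf 'anchored:   %s\n' "$anchored"
```

Keep in mind that letters, digits and underscores count as word characters, so a name such as "foo-keyword" would still match after the hyphen.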
With awk, I'd write:
awk -F '[?=&]' '
    BEGIN {
        # initialize the parameters you want
        p["keyword"] = p["pkw"] = p["idimp"] = p["premt"] = 1
    }
    {
        # $1 is the path; after it, names sit in even fields and values in odd ones
        for (i=2; i<NF; i+=2)
            if ($i in p)
                printf "%s ", $(i+1)
        print ""
    }
' file
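On the "what am I doing wrong" part: awk '/pkw/,/&idimp/' is a range pattern. It selects whole lines, starting at a line that matches /pkw/ and ending at a line that matches /&idimp/. Every line in the file matches both, so every line is printed unchanged.

Also note that the loop above prints the values in the order they occur in the URL. If the column order has to be fixed, or a parameter may be missing on some lines, a variant along these lines might help. It is only a sketch: the two shortened URLs and the "-" placeholder for a missing value are assumptions, not something from the question.

```shell
# Hypothetical sample lines; the second one lacks "idimp" and has a different order.
url1='/secure/domain/new.aspx?id=620&keyword=keyword1&pkw=pkw1&idimp=id1&premt=premt1&gclid=x'
url2='/secure/domain/new.aspx?id=621&premt=premt2&pkw=pkw2&keyword=keyword2'

result=$(printf '%s\n' "$url1" "$url2" | awk -F'[?]' '
    BEGIN { n = split("keyword pkw idimp premt", want, " ") }   # output column order
    {
        split("", val)                  # forget the values of the previous line
        m = split($2, pair, "&")        # $2 is everything after the "?"
        for (i = 1; i <= m; i++) {
            eq = index(pair[i], "=")
            if (eq)
                val[substr(pair[i], 1, eq - 1)] = substr(pair[i], eq + 1)
        }
        line = ""
        for (j = 1; j <= n; j++) {
            k   = want[j]
            sep = (j > 1) ? " " : ""
            v   = (k in val) ? val[k] : "-"   # "-" marks a missing parameter
            line = line sep v
        }
        print line
    }')

printf '%s\n' "$result"
```

For the real data, drop the printf pipeline, pass the file name to awk instead, and redirect with > output.txt.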
Upvotes: 1
Reputation: 785651
You can use this awk:
awk -F'[=&]' '{print $2, $4, $6, $8}' file
value1 value2 value3 value4
To redirect the output to a file:
awk -F'[=&]' '{print $2, $4, $6, $8}' file > output.txt
EDIT: Based on your edited question you can use:
awk -F'[=&]' '{n=1; for (i=1; i<=NF; i++) {if ($i=="interested") {n=i+3; break}}
for (i=0; i<8; i+=2) printf "%s ", $(n+i); print ""}' file
value1 value2 value3 value4
Upvotes: 1
Reputation: 189739
Or just use grep -P, but that probably requires installing GNU grep.
grep -oP '[?&][^&?=]+=\K[^&?]+'
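As written, that pattern prints the value of every parameter, one per line. To keep only the four that were asked for, one option is to spell out the names in the pattern and glue each group of four back onto one line with paste. This is a sketch that assumes GNU grep with PCRE support and that all four parameters are present on every line; the variable s is the sample URL from the question.

```shell
s='/secure/domain/new.aspx?id=620&utm_source=1034&utm_medium=cpc&utm_term=term1&try=1&v=3&utm_account=account_name&utm_campaign=campaign_name&utm_adgroup=adgroup&keyword=keyword&pkw=pkw1&idimp=id&premt=premt1&gclid=id'

# [?&] in front of the name makes sure the whole parameter name matches;
# \K drops everything matched so far, so -o prints only the value.
result=$(printf '%s\n' "$s" |
    grep -oP '[?&](keyword|pkw|idimp|premt)=\K[^&]*' |
    paste -d ' ' - - - -)

printf '%s\n' "$result"
```

If a line is missing one of the parameters, paste will pull a value from the next line into the gap, so check the input first or use the awk approach instead.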
Upvotes: 0