innervoice

Reputation: 118

How to grep particular lines

I am trying to fetch some IDs from a URL.

In my script I request the URL with wget inside a while loop and save the output to a file.

Then, in the same loop, I grep for the string XYZ User ID: plus the 3 lines after it and save the result to another file.

When I open this output file I find the following lines:

< p >XYZ User ID:< /p>

< /td >

< td>

< p>2989288174< /p>

So, using grep or anything else, how can I print the following output?

XYZ User ID:2989288174

Upvotes: 2

Views: 202

Answers (2)

Jahid

Reputation: 22428

This should work (sed with extended regex):

sed -nr 's#<\s*p\s*>([^>]*)<\s*/\s*p\s*>#\1#p' file | tr -d '\n'

Output:

XYZ User ID:2989288174
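One caveat with tr -d '\n' is that it joins every extracted line into a single line, so several records would run together. The sketch below assumes GNU sed and a file holding more than one record; the second ID is a made-up value for illustration. It joins the extracted lines in pairs instead:

```shell
# Two records; the second ID (1234567890) is a hypothetical value for illustration.
sample='< p >XYZ User ID:< /p>
< /td >
< td>
< p>2989288174< /p>
< p >XYZ User ID:< /p>
< /td >
< td>
< p>1234567890< /p>'

# First sed: same extraction as above (GNU sed, extended regex).
# Second sed: N appends the next line, s/\n// glues the pair together.
result=$(printf '%s\n' "$sample" |
  sed -nr 's#<\s*p\s*>([^>]*)<\s*/\s*p\s*>#\1#p' |
  sed 'N;s/\n//')

printf '%s\n' "$result"
```

This prints one label:ID pair per line, assuming every label line is followed by exactly one value line.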

Upvotes: 1

Juan Diego Godoy Robles

Reputation: 14955

Supposing a constant tag pattern:

<p>XYZ User ID:</p>
</td>
<td>
<p>2989288174</p>

grep should be the best way:

grep -oP '(?<=p>)([^>]+?)(?=<\/p)' outputfile | while read -r user; do
  read -r id
  echo "$user $id"
done

Note that look-behind expressions cannot be of variable length, which means you cannot use quantifiers such as ?, *, or + inside them. In PCRE, the engine behind grep -P, top-level alternatives may differ in length as long as each one is fixed-length.
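If the opening tag may contain optional spaces, one possible workaround for the look-behind restriction is \K, which discards everything matched so far. This is a sketch assuming GNU grep built with PCRE support, using the spaced sample from the question:

```shell
# Sample input with inconsistent spacing inside the tags (from the question).
sample='< p >XYZ User ID:< /p>
< /td >
< td>
< p>2989288174< /p>'

# \K drops the already-matched opening tag from the reported match,
# so the prefix may use \s*. Look-ahead has no fixed-length restriction.
result=$(printf '%s\n' "$sample" |
  grep -oP '<\s*p\s*>\K[^<]+(?=<\s*/\s*p\s*>)' |
  tr -d '\n')

printf '%s\n' "$result"   # prints: XYZ User ID:2989288174
```

The trade-off is that \K is a PCRE feature, so it is only available with grep -P.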

For variable-length tags, awk could be well suited, as long as each <p>...</p> pair sits on a single line:

awk '/User ID/{print ""}/p *>/{printf "%s", $3}' FS='(p *>|<)' outputfile
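A variant of the same field-splitting idea, as a rough sketch: it remembers the label line and prints label and value together, so each record ends with a newline. The variable name label and the spacing inside the sample tags are assumptions for illustration.

```shell
# Sample with varying spaces inside the p tags (spacing is illustrative).
sample='<p>XYZ User ID:</p>
</td>
<td>
<p >2989288174</ p>'

# Same separator as above: fields are split on "p>" (with optional spaces) or "<",
# which leaves the tag content in $3.
result=$(printf '%s\n' "$sample" | awk -F'(p *>|<)' '
  /p *>/ {
    if ($3 ~ /User ID/) {
      label = $3               # remember the label until its value arrives
    } else if (label != "") {
      print label $3           # emit "label" immediately followed by the value
      label = ""
    }
  }')

printf '%s\n' "$result"   # prints: XYZ User ID:2989288174
```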

Upvotes: 3
