Reputation: 457
I have the following line. I can grep the first part, but I'm struggling to also grep the second part.
Line:
html:<TR><TD>PICK_1</TD><TD>36.0000</TD><TD>1000000</TD><TD>26965</TD><TD>100000000</TD><TD>97074000</TD><TD>2926000</TD><TD>2.926%</TD><TD>97.074%</TD></TR>
I want to have the following results after grepping this line.
PICK_1 97.074%
Currently I am only grepping the first portion, with the following command:
grep -Po "<TR><TD>[A-Z0-9_]+" test.txt
I'd appreciate any help on how to go about this. Thanks.
Upvotes: 1
Views: 127
Reputation: 2091
If you always have the same number of fields delimited by "TD" tags, you can try this (dirty) awk command:
awk -F'[<TD>|</TD>]' '{print $8 " " $80}'
Or this combination of column and awk:
column -t -s "</TD>" | awk -F' ' '{print $3 " " $11}'
Or with sed instead of column:
sed -e 's/[</TD>]/ /g' | awk -F' ' '{print $3 " " $11}'
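To see where the indices 8 and 80 in the first awk command come from, it can help to print every non-empty field next to its position. This is only a sketch: it assumes the sample line from the question is stored in a shell variable called `line` (a name chosen here for illustration).

```shell
# Sample line from the question; the variable name `line` is illustrative.
line='html:<TR><TD>PICK_1</TD><TD>36.0000</TD><TD>1000000</TD><TD>26965</TD><TD>100000000</TD><TD>97074000</TD><TD>2926000</TD><TD>2.926%</TD><TD>97.074%</TD></TR>'

# The bracket expression makes every single <, T, D, >, | or / a separator,
# so most fields are empty. Print only the non-empty ones with their index.
fields=$(printf '%s\n' "$line" |
  awk -F'[<TD>|</TD>]' '{ for (i = 1; i <= NF; i++) if ($i != "") print i, $i }')

printf '%s\n' "$fields"
```

In the listing, PICK_1 should appear at index 8 and 97.074% at index 80, which is why the indices only hold while the number of cells stays the same.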
Upvotes: 1
Reputation: 74596
Use awk with a custom field separator:
awk -F'[<>TDR/]+' '{ print $2, $(NF-1) }' file
This splits the line on things that look like one or more opening or closing <TD> or <TR> tags, and prints the second and second-last fields.
Warning: this will break on almost every input except the one that you've shown, since awk, grep and friends are designed for processing text, not HTML.
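To make the warning concrete, here is a small sketch that runs the command on the line from the question and on a made-up row. The label TRADE_1 is hypothetical and not from the question; it was picked only because it contains T, R and D, which the separator class also matches.

```shell
# Sample line from the question.
good='html:<TR><TD>PICK_1</TD><TD>36.0000</TD><TD>1000000</TD><TD>26965</TD><TD>100000000</TD><TD>97074000</TD><TD>2926000</TD><TD>2.926%</TD><TD>97.074%</TD></TR>'
# Hypothetical row whose label contains separator characters.
bad='<TR><TD>TRADE_1</TD><TD>97.074%</TD></TR>'

# Same awk program for both inputs.
good_out=$(printf '%s\n' "$good" | awk -F'[<>TDR/]+' '{ print $2, $(NF-1) }')
bad_out=$(printf '%s\n' "$bad" | awk -F'[<>TDR/]+' '{ print $2, $(NF-1) }')

printf '%s\n' "$good_out"   # label and percentage, as wanted
printf '%s\n' "$bad_out"    # the label is cut apart at its T, R and D
```

The second result keeps only a fragment of the label, because the letters inside TRADE_1 are treated as part of the tags.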
Upvotes: 2
Reputation: 334
Try providing each pattern after its own "-e" option:
grep -e PICK_1 -e "<TR><TD>[A-Z0-9_]+" test.txt
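Since grep prints whole lines (or, with -o, whole matches), pulling two separate pieces out of one line tends to be easier with capture groups. The sketch below swaps in sed for grep, so it is not the command proposed above. It assumes the label is the first cell of the row and the percentage is the last one; the variable name `line` is illustrative.

```shell
# Sample line from the question; the variable name `line` is illustrative.
line='html:<TR><TD>PICK_1</TD><TD>36.0000</TD><TD>1000000</TD><TD>26965</TD><TD>100000000</TD><TD>97074000</TD><TD>2926000</TD><TD>2.926%</TD><TD>97.074%</TD></TR>'

# \1 captures the first cell, \2 captures the last cell before </TR>.
# "|" is used as the s command delimiter so the slashes in the tags
# do not need escaping; -n plus the p flag prints only matching lines.
result=$(printf '%s\n' "$line" |
  sed -n 's|.*<TR><TD>\([A-Z0-9_]*\)</TD>.*<TD>\([^<]*\)</TD></TR>.*|\1 \2|p')

printf '%s\n' "$result"
```

Like the other text-tool approaches here, this depends on the exact layout of the row and is not a general HTML parser.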
Upvotes: 0