Greg Alexander

Reputation: 1222

Using grep with a regex to pull text out of a multi-line block in a file

I have a chunk of text in a file:

<tr bgcolor="#F9F9F9">
     <td align="left">8/7/2012 11:23:42 AM</td>
     <td align="left"><em>Here is the text I want to parse out</em></td>
     <td class="ra">9.00</td>
     <td class="ra">297.00</td>
     <td class="ra">0.00</td>
     <td class="ra">0.00</td>
     <td class="ra">$0.00</td>
     <td class="ra">$0.50</td>
     <td class="ra"></td>
 </tr>

Using grep, I would like to end up with this result:

Here is the text I want to parse out

This is what I have so far:

cat file.txt | grep -m 1 -oP '<em>[^</em>]*'

but that does not work. Thanks for your help!

Upvotes: 1

Views: 2072

Answers (1)

Lev Levitsky

Reputation: 65791

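The pattern in the question fails because [^</em>] is a character class, not a negated string: it matches any single character other than <, /, e, m, or >, so the match stops at the first "e" in "Here". With -o, the <em> tag itself would also end up in the output.

A correct regex would be (?<=<em>).*?(?=</em>). The lookbehind and lookahead require the tags to be present without including them in the match, and .*? matches as little as possible between them.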

So, try:

grep -m 1 -oP '(?<=<em>).*?(?=</em>)' file.txt
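
The -P flag only works when grep was built with PCRE support, which is not the case everywhere. If it is missing, a sed substitution is one possible fallback. The following is only a sketch under stated assumptions: the temporary file and the extract_em helper are hypothetical names made up for illustration, and the sample data is a shortened copy of the row from the question.

```shell
# Hypothetical sample file reproducing part of the row from the question.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
<tr bgcolor="#F9F9F9">
     <td align="left">8/7/2012 11:23:42 AM</td>
     <td align="left"><em>Here is the text I want to parse out</em></td>
     <td class="ra">9.00</td>
 </tr>
EOF

# extract_em is a hypothetical helper, not a standard command.
# sed -n suppresses default output; the trailing p prints only lines where
# the substitution matched, and \1 keeps just the text between the tags.
# head -n 1 stops after the first hit, like grep -m 1.
extract_em() {
    sed -n 's:.*<em>\(.*\)</em>.*:\1:p' "$1" | head -n 1
}

result=$(extract_em "$tmpfile")
echo "$result"
rm -f "$tmpfile"
```

Note that sed has no non-greedy quantifier, so this assumes at most one <em>...</em> pair per line; with several pairs on the same line it would not return the first one.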

Upvotes: 4
