grep text before string - regex

Question

I have to extract a few fields from the HTML input below, using bash only.

HTML input

SOMETEXT

I have to extract the id value and SOMETEXT from the above input.

I am hoping that grep with a suitable regex will work. For the id value I am using the following regex:

"id=[0-9]*"

which is giving me correct results.

grep -o 'id=[0-9]*' index.html | head -n 5

But I am not sure what sort of regex I should use to grab the text up to the next closing tag.

Thanks in advance.

Tim Biegeleisen · Accepted Answer

The regex in your OP ("id=[0-9]*") looks like it worked in your case, but a better approach is to home in on the anchor tags themselves.

Here is a regex to extract the id value:

And here is a regex to extract the contents inside the tag:

(.*?)<\/a>
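
As a rough sketch of how patterns like these could be run from the shell: the lazy quantifier `.*?` is Perl syntax, so this assumes GNU grep built with PCRE support (the `-P` option). The sample line, the `view.php` URL, and the variable names are hypothetical, since the real index.html is not shown, and the patterns below are one possible way to do it rather than the exact regexes from this answer.

```shell
#!/usr/bin/env bash
# Hypothetical sample line; the real index.html is not shown in the question.
html='<li><a href="view.php?id=123">SOMETEXT</a></li>'

# id value: \K discards the already-matched "id=" prefix,
# so -o prints only the digits that follow it.
id_value=$(printf '%s\n' "$html" | grep -oP 'id=\K[0-9]+')

# Anchor text: match the opening <a ...> tag, drop it with \K,
# then take characters lazily up to (but not including) the closing </a>.
link_text=$(printf '%s\n' "$html" | grep -oP '<a[^>]*>\K.*?(?=</a>)')

printf '%s\n' "$id_value" "$link_text"
```

To run it against the real file, replace the `printf ... |` part with the file name, for example `grep -oP '<a[^>]*>\K.*?(?=</a>)' index.html`. Note that grep works line by line, so an anchor whose text spans several lines would not be matched by this approach.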
