bash curl | grep a specific website content

Question

I'm trying to extract a specific piece of information from a website, but the content seems to be included in the class definition:

I'm targeting the "999", which I can get with:

curl -s url |grep -zPo '\s*\K.*?(?=\s*)'

If the "999" is in the content attribute, though, and the value changes, the grep pattern would no longer match. Wildcards wouldn't return anything.

Reino · Accepted Answer

Please(!) have a look at the following URLs before you attempt to parse a website with regex:

With an HTML/XML parser like xidel it's as simple as:

xidel -s "" -e '//div[@class="some_div_class"]/strong/@content'

or

xidel -s "" -e '//div[@class="some_div_class"]/normalize-space(strong)'
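To try the first XPath expression above without hitting the site, you can pipe markup into xidel on stdin. This is only a minimal sketch: the HTML from the question isn't shown here, so the sample string below is a hypothetical stand-in whose structure is guessed from the XPath (a div with class some_div_class containing a strong element with a content attribute).

```shell
#!/usr/bin/env bash
# Hypothetical sample markup (an assumption, not the real page).
html='<div class="some_div_class"><strong content="999">999</strong></div>'

# "-" makes xidel read the document from stdin instead of fetching a URL.
# The value is selected by its position in the document, so the query
# keeps working when the number changes.
value=$(printf '%s\n' "$html" |
  xidel -s - -e '//div[@class="some_div_class"]/strong/@content')

echo "$value"
```

The same pattern should work with curl in front, e.g. curl -s "$url" | xidel -s - -e '...', where $url is a placeholder for the actual page.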
