Extract/Scrape values from HTML

Question

I'd like to create a script that grabs two values from this awful HTML published on a city website:

558.35

and

66.0

These are water reservoir details and change weekly.

I'm unsure what the best tool for this is. grep?

Thanks for your suggestions, ideas!


    
        
             Currently:
                                    558.35
        
        
             Percent of capacity:
                                     66.0%

repzero · Accepted Answer

If you want a regex approach, you can use sed:

sed -nr 's#^[ ]*.*;[ ]?([0-9]+[.][0-9]+)[%]?[ ]*$#\1#p' my_html_file
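To see what that expression keeps, here is a rough Python equivalent of the same pattern. It is only a sketch: the page's real markup isn't shown, so the `SAMPLE` lines below (values preceded by `&nbsp;` entities and sitting at the end of their own line) are an assumption inferred from the `;` in the pattern, and `extract_values` is a made-up helper name.

```python
import re

# Same pattern as the sed expression: anything up to a ';', an optional space,
# a decimal number (captured), an optional '%', then optional trailing spaces.
VALUE_LINE = re.compile(r'^[ ]*.*;[ ]?([0-9]+[.][0-9]+)[%]?[ ]*$')


def extract_values(text):
    """Return the captured number from each matching line (like sed -n ... p)."""
    values = []
    for line in text.splitlines():
        match = VALUE_LINE.match(line)
        if match:
            values.append(match.group(1))
    return values


# Hypothetical input: assumed markup, not the city's actual page.
SAMPLE = """\
<td class="label">Currently:
&nbsp;&nbsp;&nbsp;558.35
</td>
<td class="label">Percent of capacity:
&nbsp;&nbsp;&nbsp; 66.0%
</td>
"""

if __name__ == "__main__":
    print(extract_values(SAMPLE))
```

Note that the pattern only matches when the number ends the line, so it will silently stop working if the site moves a closing tag onto the same line as the value.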

An HTML parser, such as Python's BeautifulSoup module, or a JavaScript approach is a safer choice.
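As a minimal sketch of the parser route, the example below uses Python's standard-library `html.parser` rather than BeautifulSoup, so it runs without installing anything. The table markup in `SAMPLE_HTML` is assumed, and `TextCollector` and `reservoir_values` are illustrative names. It collects the page's text and then looks for the number following each label, so it does not depend on how the lines are broken.

```python
import re
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the text nodes of a document, ignoring all tags."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &nbsp; in the text.
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def reservoir_values(html):
    """Return (level, percent) as floats, or None for a value not found."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    # \s also matches the non-breaking space that &nbsp; decodes to.
    level = re.search(r"Currently:\s*([0-9]+(?:\.[0-9]+)?)", text)
    percent = re.search(r"Percent of capacity:\s*([0-9]+(?:\.[0-9]+)?)\s*%", text)
    return (
        float(level.group(1)) if level else None,
        float(percent.group(1)) if percent else None,
    )


# Hypothetical markup: the real page's tags are not shown in the question.
SAMPLE_HTML = """
<table>
  <tr><td>Currently:</td><td>&nbsp;&nbsp;558.35</td></tr>
  <tr><td>Percent of capacity:</td><td>&nbsp;&nbsp;66.0%</td></tr>
</table>
"""

if __name__ == "__main__":
    print(reservoir_values(SAMPLE_HTML))
```

Keying on the labels ("Currently:", "Percent of capacity:") rather than on line layout should make this more tolerant of whitespace changes, though it would still break if the city renames the labels.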

EDIT:

Here is a snippet using JavaScript. The results are logged to the console, and an alert box pops up to show them.

var values="";
for(i=1;i



    
        
