Reputation: 157
Hey, I'm using a combination of sed and curl to extract some text from the web page example.com.
Here is my code:
curl -s http://example.com | sed -n -e 's/.*<h1>\(.*\)<\/h1>.*<p>\(This.*\)<\/p>/\1 \n \2/p'
However, I don't get any output. What could I be doing wrong?
Upvotes: 0
Views: 423
Reputation: 4813
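Although sed is generally not the right tool for extracting text from web pages, it may be sufficient for simple tasks. sed is a line-oriented tool, so each line is handled separately. If the <h1> and <p> elements sit on different lines of the page, a single pattern that spans both can never match, which is why you get no output.
If you really want to do it with sed, this will give you some output: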
curl -s http://example.com | sed -n -e 's/.*<h1>\(.*\)<\/h1>/\1 \n/p' -e 's/<p>\(This.*\)/\1 \n/p'
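To see the line-by-line behaviour without hitting the network, here is a minimal sketch that runs both approaches on a small local sample. The sample markup and the variable names (sample, one_pattern, per_line) are made up for illustration and are not the real content of example.com; the only assumption that matters is that the <h1> and <p> tags are on separate lines. The \n is left out of the replacements because its handling there differs between sed implementations.

```shell
# Hypothetical stand-in for the page: <h1> and <p> are on separate lines.
sample='<div>
<h1>Example Domain</h1>
<p>This domain is for use in examples.</p>
</div>'

# Single pattern spanning both tags: no single line contains both
# <h1>...</h1> and <p>...</p>, so nothing matches and nothing is printed.
one_pattern=$(printf '%s\n' "$sample" |
  sed -n -e 's/.*<h1>\(.*\)<\/h1>.*<p>\(This.*\)<\/p>/\1 \2/p')

# One expression per tag: every input line is tested against both.
per_line=$(printf '%s\n' "$sample" |
  sed -n -e 's/.*<h1>\(.*\)<\/h1>/\1/p' -e 's/.*<p>\(This.*\)<\/p>/\1/p')

printf 'single pattern: [%s]\n' "$one_pattern"
printf 'per line:\n%s\n' "$per_line"
```

The first command leaves one_pattern empty, while the second collects the heading and the paragraph text on two separate lines. If a <p> element itself wraps across several lines in the real page, the per-line approach will only catch the part on the first line, which is one more reason to prefer a proper HTML parser for anything beyond quick checks.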
Upvotes: 1