user170579

Reputation: 8530

Extract a pattern from the output of curl

I would like to use curl, on the command line, to grab a url, pipe it to a pattern, and return a list of urls that match that pattern.

I am running into problems with the greedy aspects of the pattern and cannot seem to get past them. Any help on this would be appreciated.

curl http://www.reddit.com/r/pics/ | grep -ioE "http://imgur\.com/.+(jpg|jpeg|gif|png)"

So, grab the data from the URL, which returns a mess of HTML. That HTML may need some line breaks inserted, unless the regex can return more than one match from a single line. The pattern is pretty simple, any string that matches...
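On the line-break point: `grep -o` prints each non-overlapping match on its own output line, even when several matches sit on one input line, so the HTML should not need extra line breaks. A minimal sketch of that behavior, using a made-up one-line sample (the `line` variable and both image IDs are hypothetical, not real page output):

```shell
# Hypothetical single line of HTML containing two links (sample data only).
line='<img src="http://imgur.com/aaa11.gif"><img src="http://imgur.com/bbb22.jpeg">'

# -o prints only the matched text, one match per output line.
matches=$(printf '%s\n' "$line" | grep -oE 'http://imgur\.com/[a-z0-9]+\.(jpg|jpeg|gif|png)')
printf '%s\n' "$matches"
```

This should print the two URLs on separate lines, which is the list format asked for.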

That's about it. At that URL, with default settings, I should generally get back a good set of images. I would not object to using the RSS feed URL for the same page; it may actually be easier to parse.

Thanks everyone!

Edit: Thanks for the quick answer. My final command is now:

curl -s http://www.reddit.com/r/pics/ | grep -ioE "http:\/\/imgur\.com\/.{1,10}\.(jpg|jpeg|gif|png)"

Upvotes: 5

Views: 15059

Answers (2)

MB9000

Reputation: 1

Cool. The same approach can grep your WAN IP from a URL:

curl -s https://hostpapastatus.com/ip/ | grep -ioE "([0-9]{1,3}[\.]){3}[0-9]{1,3}"
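To try the pattern without a network call, here is a small sketch that runs it against a made-up response body. The `body` text is an assumption about what such a page might return, and 203.0.113.7 is a documentation-only address. Note that the pattern only checks the dotted-number shape, so it would also accept something like 999.999.999.999.

```shell
# Hypothetical response body; 203.0.113.7 is reserved for documentation (RFC 5737).
body='<p>Your IP address is 203.0.113.7</p>'

# Three groups of "1-3 digits plus a dot", then a final group of 1-3 digits.
ip=$(printf '%s\n' "$body" | grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}')
printf '%s\n' "$ip"
```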

Upvotes: 0

Ben

Reputation: 16553

Try:

http:\/\/imgur\.com\/.{5,8}\.(jpg|jpeg|gif|png)
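A quick sketch of why the bounded quantifier helps, assuming the image IDs are 5 to 8 characters long. The `sample` line below is made up for illustration and is not real page output. The slashes are left unescaped here, since `/` is not special in grep's extended syntax.

```shell
# Hypothetical line of HTML with two imgur links (sample data only).
sample='<a href="http://imgur.com/abc12.jpg">one</a> <a href="http://imgur.com/xyz89.png">two</a>'

# Unbounded .+ takes the longest match, so it runs through to the last extension on the line.
greedy=$(printf '%s\n' "$sample" | grep -ioE 'http://imgur\.com/.+(jpg|jpeg|gif|png)')

# .{5,8} allows at most 8 characters before the dot, so each link matches on its own.
bounded=$(printf '%s\n' "$sample" | grep -ioE 'http://imgur\.com/.{5,8}\.(jpg|jpeg|gif|png)')

printf 'greedy:\n%s\n\nbounded:\n%s\n' "$greedy" "$bounded"
```

The greedy version should return one long string spanning both links, while the bounded version returns the two URLs separately.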

Upvotes: 3
