Clayton Dukes

Reputation: 1317

Python one-liner for extracting all URLs

I need a Python one-liner that returns all URLs found in a string and puts them into a bash array. Something like:

URLs=($(echo 'foo bar baz http://blackfridaygift.info/BUu4nmkRR baz foo bar http://inhelation.com/fil/iowa/lvmk65irqibpmi972hz6xx2k.php%3FLA4i9C1606274697520fdd2ad4cf649e25ae72996c901bf1520fdd2ad4cf649e25ae72996c901bf1520fdd2ad4cf649e25ae72996c901bf1520fdd2ad4cf649e25ae72996c901bf1520fdd2ad4cf649e25ae72996c901bf1' | python -c 'something here'))

I've spent the last hour googling but can't seem to find the right answer.

Upvotes: 2

Views: 106

Answers (1)

falsetru

Reputation: 369074

Use a regular expression to match anything that starts with http: or https: and runs up to the next whitespace:

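import re
import sys

# Match "http:" or "https:" followed by any run of non-whitespace characters.
matched = re.findall(r"https?:\S+", sys.stdin.read())
# Print one URL per line so the shell can split the output into array elements.
print("\n".join(matched))

Collapsed into a single line, with the outer parentheses turning the output into a bash array:

URLS=($(echo '....' | python -c 'import re, sys; print("\n".join(re.findall(r"https?:\S+", sys.stdin.read())))'))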
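To check the pattern before wiring it into the shell, the same regex can be wrapped in a small function and run on a sample string. This is only a sketch: the name `extract_urls` and the second URL in `sample` are made up for illustration, and the first URL is taken from the question.

```python
import re

# Same pattern as above: "http:" or "https:" followed by non-whitespace.
URL_PATTERN = re.compile(r"https?:\S+")


def extract_urls(text):
    """Return every URL-like token in text, in order of appearance."""
    return URL_PATTERN.findall(text)


# Hypothetical sample input; only the first URL comes from the question.
sample = "foo bar baz http://blackfridaygift.info/BUu4nmkRR baz https://example.com/path foo"
urls = extract_urls(sample)

# One URL per line, which is the shape bash word splitting expects.
print("\n".join(urls))
```

Because `\S+` consumes everything up to the next whitespace, trailing punctuation such as a comma or a closing parenthesis directly after a URL would end up in the match. On the bash side, unquoted `$(...)` also applies filename globbing, so URLs containing `?` or `*` may be safer to read with `mapfile -t URLS < <(...)` on bash 4 or later.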

Upvotes: 2
