Reputation: 17881
How can I parse URLs from any given plain text (not limited to href attributes in tags)?
Any code examples in Python will be appreciated.
Upvotes: 1
Views: 2928
Reputation: 336108
See Jan Goyvaerts' blog.
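So a Python code example could look like this (the character classes list only A-Z, so the match has to be case-insensitive):
import re
result = re.findall(r"\b(?:(?:https?|ftp|file)://|www\.|ftp\.)[-A-Z0-9+&@#/%=~_|$?!:,.]*[A-Z0-9+&@#/%=~_|$]", subject, re.IGNORECASE)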
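For a slightly fuller picture, here is a minimal sketch that wraps the same pattern in a reusable function. The names `URL_PATTERN`, `extract_urls`, and `sample` are only illustrative, and the `re.IGNORECASE` flag is an assumption on my part: the character classes only list `A-Z`, so without it typical lowercase URLs would not match.

```python
import re

# Same pattern as in the answer, split across two string literals for readability.
# re.IGNORECASE (an assumption here) lets the A-Z classes match lowercase too.
URL_PATTERN = re.compile(
    r"\b(?:(?:https?|ftp|file)://|www\.|ftp\.)"
    r"[-A-Z0-9+&@#/%=~_|$?!:,.]*[A-Z0-9+&@#/%=~_|$]",
    re.IGNORECASE,
)


def extract_urls(text):
    """Return every URL-like substring found in text, in order of appearance."""
    return URL_PATTERN.findall(text)


sample = "Docs are at https://example.com/docs?page=2, mirror at www.example.org."
print(extract_urls(sample))
# → ['https://example.com/docs?page=2', 'www.example.org']
```

The final character class leaves out punctuation such as `.`, `,`, `?` and `!`, which is why the trailing comma and full stop in the sample sentence are not swallowed into the matches. This is still a heuristic: it finds things that look like URLs and does not validate them.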
Upvotes: 1
Reputation: 47292
You could use a regular expression to parse the string.
See this previously asked question: What’s the cleanest way to extract URLs from a string using Python?
Upvotes: 2