Sreeraj

Reputation: 2720

Regular expression to explode the URLs

When I try to extract the URLs from a string, it does not return the actual URL. Here is the method I used:

def self.getUrlsFromString(str="")
    url_regexp = /(?:http|https):\/\/[a-z0-9]+(?:[\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(?:(?::[0-9]{1,5})?\/[^\s]*)?/ix
    url        = str.split.grep(url_regexp)
    return url
rescue Exception =>  e
    DooDooLogger.log(e.message,e)
    return ""
end

When I call self.getUrlsFromString(" check this site...http://lnkd.in/HjUVii"), it returns

site...http://lnkd.in/HjUVii

Instead of

http://lnkd.in/HjUVii

Upvotes: 1

Views: 364

Answers (3)

Casper

Reputation: 34308

If you want to find all occurrences in a string, you could use String#scan:

str = "check these...http://lnkd.in/HjUVii http://www.google.com/"

str.scan(url_regexp)
=> ["http://lnkd.in/HjUVii", "http://www.google.com/"]
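Applied to the method from the question, this could look roughly like the sketch below. The constant name URL_REGEXP and the method name urls_from_string are made up for illustration; the pattern itself is copied from the question. Because that pattern uses only non-capturing groups, scan returns the whole matches.

```ruby
# Pattern copied from the question. It contains only non-capturing groups,
# so String#scan yields the full matched URLs.
URL_REGEXP = /(?:http|https):\/\/[a-z0-9]+(?:[\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(?:(?::[0-9]{1,5})?\/[^\s]*)?/ix

# Hypothetical replacement for getUrlsFromString: always returns an Array.
def urls_from_string(str = "")
  # to_s guards against nil input; scan collects every match in the string.
  str.to_s.scan(URL_REGEXP)
end

urls_from_string(" check this site...http://lnkd.in/HjUVii")
# => ["http://lnkd.in/HjUVii"]
```

Returning an empty array when nothing matches (instead of "") keeps the return type consistent for callers.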

Upvotes: 0

Emiliano Poggi

Reputation: 24816

Shouldn't you use a much simpler regex, like:

/((http|https):[^\s]+)/
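One thing to watch if this simpler pattern is combined with String#scan: when the pattern contains capturing groups, scan returns the captured groups for each match rather than the whole match. A small sketch with a made-up sample string, showing the pattern as written and a variant without capturing groups:

```ruby
# Sample input invented for this illustration.
str = "check these...http://lnkd.in/HjUVii https://www.google.com/"

# With capturing groups, each result is an array of the captured groups.
with_groups = str.scan(/((http|https):[^\s]+)/)
# => [["http://lnkd.in/HjUVii", "http"], ["https://www.google.com/", "https"]]

# Without capturing groups, each result is the matched URL itself.
without_groups = str.scan(/https?:[^\s]+/)
# => ["http://lnkd.in/HjUVii", "https://www.google.com/"]
```

Note that [^\s]+ takes everything up to the next whitespace, so trailing punctuation would end up in the match as well.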

Upvotes: 0

bender

Reputation: 1428

That's because grep on an Array returns every element for which pattern === element, that is, the whole element whenever the regex matches anywhere inside it, so

str.split.grep(/http/ix)

will return ["site...http://lnkd.in/HjUVii"] too.

You can try instead of

str.split.grep(url_regexp)

something like this:

url_regexp.match(str).to_s
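To make the difference concrete, here is a small comparison. The pattern is a simplified stand-in (an assumption, not the regex from the question) and the sample string is invented. Keep in mind that match only gives you the first URL, so it fits when you expect a single URL per string.

```ruby
# Simplified stand-in pattern, used only for this comparison.
url_regexp = /https?:\/\/[^\s]+/i
str = " check this site...http://lnkd.in/HjUVii and http://www.google.com/"

# grep keeps each whitespace-separated token that contains a match, unchanged.
tokens = str.split.grep(url_regexp)
# => ["site...http://lnkd.in/HjUVii", "http://www.google.com/"]

# match returns only the first match in the whole string.
first_url = url_regexp.match(str).to_s
# => "http://lnkd.in/HjUVii"

# scan returns every match.
all_urls = str.scan(url_regexp)
# => ["http://lnkd.in/HjUVii", "http://www.google.com/"]
```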

Upvotes: 1
