Fabiano Soriani

Reputation: 8572

RegExp for stripping URLs from a string

I want to take a Twitter text like this:

s = "Today 09/07 sunday http://t.co/123 - AC/DC COVER Opening and DVD - woman R$10 / man R$15. - http://migre.me/59qwc"

and turn it into this:

s = "Today 09/07 sunday LINK - AC/DC COVER Opening and DVD - woman R$10 / man R$15. - LINK"

This snippet fails for some reason. Can someone help?

s.replace(/(http\:.*)\s/g , 'LINK')

Upvotes: 1

Views: 138

Answers (5)

The Mask

Reputation: 17467

Try:

input.replace(/http:\/{2}[^\s]+/g, "LINK")

Upvotes: 1

Mike Samuel

Reputation: 120586

Try using

/\bhttps?\:\S*/ig

which uses \S* to match a run of non-whitespace characters, so it still matches at the end of the input, where no space follows the URL.
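As a minimal sketch, this is how that pattern behaves on the sample string from the question. The variable names `tweet` and `cleaned` are only illustrative.

```javascript
// Sample tweet text taken from the question.
const tweet = "Today 09/07 sunday http://t.co/123 - AC/DC COVER Opening and DVD - woman R$10 / man R$15. - http://migre.me/59qwc";

// \b anchors at a word boundary, https? accepts both schemes, and
// \S* consumes characters up to the next whitespace or the end of input.
const cleaned = tweet.replace(/\bhttps?:\S*/ig, "LINK");

console.log(cleaned);
```

Both URLs are replaced, including the one at the very end of the string.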

Upvotes: 3

akshayp

Reputation: 1

This should strip HTML tags from your text:

s.replace(/<.*?>/g, '');

Upvotes: 0

Whoopska

Reputation: 149

As stated, .* also matches whitespace, so it swallows far more than the URL. Depending on the regex engine you are using, you may be able to use \S*, which matches only non-whitespace characters, or else the more explicit [^ ]*, which stops only at a literal space.
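The two alternatives differ when something other than a plain space follows the URL. A small sketch, using a made-up input (`tabbed`) in which a tab follows the link:

```javascript
// Hypothetical input: a tab, not a space, follows the URL.
const tabbed = "see http://t.co/123\tnext word";

// \S* stops at any whitespace character, including the tab.
const withNonSpace = tabbed.replace(/https?:\S*/g, "LINK");

// [^ ]* stops only at a literal space, so it runs through the tab
// and also consumes "next".
const withNegatedSpace = tabbed.replace(/https?:[^ ]*/g, "LINK");

console.log(JSON.stringify(withNonSpace));
console.log(JSON.stringify(withNegatedSpace));
```

For tweet text that may contain tabs or newlines, \S* is probably the safer of the two.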

Upvotes: 0

Maarten Bodewes

Reputation: 94088

.* is greedy and matches everything, including whitespace: it runs to the end of the string, then backtracks until it finds a single whitespace character for the trailing \s. Match only non-whitespace characters for the URL and you are done.
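To make the backtracking visible, here is a short sketch that runs the pattern from the question against the sample string. The names `original`, `matched` and `greedy` are illustrative only.

```javascript
// Sample tweet text taken from the question.
const original = "Today 09/07 sunday http://t.co/123 - AC/DC COVER Opening and DVD - woman R$10 / man R$15. - http://migre.me/59qwc";

// The text consumed by the first (and only) match: it starts at the first
// URL and ends at the last whitespace character in the string.
const matched = original.match(/(http:.*)\s/)[0];

// The whole span is replaced at once. The final URL has no whitespace
// after it, so the pattern never matches it and it is left untouched.
const greedy = original.replace(/(http:.*)\s/g, "LINK");

console.log(JSON.stringify(matched));
console.log(greedy);
```

The output shows a single LINK glued directly onto the untouched last URL, which is the failure described in the question.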

Upvotes: 0
