Reputation: 534
Being new to regex, I need some help from the community.
Let's say I have this string str:
1. www.anysite.com hello demo try this link
2. anysite.com indeed demo link
3. http://www.anysite.com another one
4. www.anysite.com
Consider lines 1-5 together as the whole string str here.
I want to convert every 'anysite.com' into a clickable HTML link, for which I am using:
str = str.replace(/((http|https|ftp):\/\/[\w?=&.\/-;#~%-]+(?![\w\s?&.\/;#~%"=-]*>))/g, '<a href="$1" target="_blank">$1</a>');
This converts all space-separated words starting with http/https/ftp into links of the form
<a href="url" target="_blank">url</a>
So lines 3 and 5 have been converted correctly. Now, to convert all www.anysite.com occurrences into links, I again used:
str = str.replace(/(\b^(http|https|ftp)?(www\.)[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig, '<a href="https://$1" target="_blank">$1</a>');
However, it only converts www.anysite.com into a link if it is found at the very beginning of str. So it converts line 1 but not line 4.
Note that I have used ^(http|https|ftp)?(www.) to find all www occurrences not starting with http/https/ftp, since the http ones have already been converted.
Also, what would the regex be for the link on line 2, which starts with neither http nor www but simply ends with .com?
For reference, you can try posting this whole string to your Facebook timeline; it converts all five lines into links. See the snapshot.
Upvotes: 0
Views: 1657
Reputation: 24812
^(http|https|ftp)?(www\.)
does not mean "all www not starting with http/https/ftp" but rather "a string that starts with an optional http/https/ftp followed by www.".
Indeed, ^ in this context isn't a negation but rather an anchor representing the start of the string. I suppose you used it this way because of its meaning when used in a character class ([^...]); it is rather tricky, since its meaning changes depending on the context in which it is found.
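You could just remove it, as I see no point in making sure the string does not start with http/https/ftp: you transformed those occurrences just before. Be aware, though, that the URLs converted in the first step still contain www. inside the generated <a> tags, so a pattern without ^ can match them a second time.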
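To make the anchor behaviour concrete, here is a minimal sketch. The sample string and the two simplified patterns are assumptions chosen for illustration, not the question's full regex. Without the m flag, ^ matches only at the very start of the whole string, whereas \b matches at any word boundary.

```javascript
// Sample input modelled on the question: one candidate URL per line.
const sample = [
  'www.anysite.com hello demo try this link',
  'anysite.com indeed demo link',
  'www.anysite.com',
].join('\n');

// Simplified "www" matchers (illustrative only, not the question's full pattern).
const anchored = /^www\.[-\w.]+/gi;    // ^ = start of the whole string (no m flag)
const unanchored = /\bwww\.[-\w.]+/gi; // \b = word boundary, can match anywhere

// String.prototype.match returns null when nothing matches, hence the fallback.
const anchoredMatches = sample.match(anchored) || [];
const unanchoredMatches = sample.match(unanchored) || [];

console.log(anchoredMatches.length);   // 1: only the occurrence at index 0
console.log(unanchoredMatches.length); // 2: first and last lines
```

Adding the m flag would make ^ match at the start of every line as well, but it would still miss a www. that appears in the middle of a line.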
Edit: I mentioned lookbehind but forgot that it was not available in JS at the time of writing (it was added to the language in ES2018).
If you wanted to make some kind of negation, the easiest way would be to use a negative lookbehind:
(?<!(http|https|ftp):\/\/)www\.
This matches "www." only when it is not preceded by http://, https:// or ftp://.
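As a rough sketch of the lookbehind idea, assuming a JavaScript engine with ES2018 lookbehind support: the pattern below includes the :// separator in the lookbehind, because in a real URL that separator sits between the scheme and www. The variable names and the sample text are made up for the example.

```javascript
// Requires an engine that supports lookbehind assertions (ES2018 or later).
// Match "www.<host>" only when it is NOT directly preceded by a scheme and "://".
const wwwWithoutScheme = /(?<!(?:https?|ftp):\/\/)\bwww\.[-\w.]+/gi;

// Sample text (an assumption for this sketch): one URL with a scheme, one without.
const text = 'http://www.anysite.com another one\nwww.anysite.com';

// $& in the replacement string stands for the whole matched substring.
const linked = text.replace(
  wwwWithoutScheme,
  '<a href="https://$&" target="_blank">$&</a>'
);

console.log(linked);
```

Only the bare www.anysite.com on the second line is wrapped; the occurrence after http:// is left for whichever step handles URLs that carry a scheme.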
Upvotes: 0
Reputation: 534
Thanks for the help. The final regex that worked for me is:
// remove all http:// and https:// prefixes
str = str.replace(/(http|https):\/\//ig, "");
// convert every string ending in .com or .in into a link
str = str.replace(/((www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.(com|in))/ig, '<a href="//$1" target="_blank">$1</a>');
I used .com and .in for my specific requirement; otherwise the solution at http://regexr.com/39i0i will work.
There is still an issue, though: it doesn't convert shortened URLs into links properly. E.g. http://s.ly/qhdfTyuiOP is linked only up to s.ly.
Any suggestions?
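One possible direction, offered as a sketch rather than a definitive answer: match the optional scheme, the host and an optional path in a single pass, instead of stripping the scheme first. That keeps the path of a shortened URL and avoids running a second regex over links that were already generated. The helper name linkify and the pattern urlPattern are hypothetical, and the pattern is deliberately loose: it accepts any dotted name with an alphabetic ending of two or more letters, so it would also pick up things like file names or the domain part of an email address.

```javascript
// Optional scheme, then a dotted host name, then an optional path.
// Illustrative pattern only; it is not a full URL grammar.
const urlPattern =
  /\b((?:https?|ftp):\/\/)?((?:[a-z0-9-]+\.)+[a-z]{2,})(\/[^\s<>"']*)?/gi;

// linkify is a hypothetical helper name used only for this sketch.
function linkify(text) {
  return text.replace(urlPattern, (match, scheme, host, path) => {
    // Groups that did not participate in the match are undefined.
    // Keep the scheme as written; assume https:// when it is missing.
    const href = (scheme || 'https://') + host + (path || '');
    return `<a href="${href}" target="_blank">${match}</a>`;
  });
}

console.log(linkify('http://s.ly/qhdfTyuiOP'));
console.log(linkify('anysite.com indeed demo link'));
```

Restricting the last part of the host to a fixed list such as (com|in), as in the regex above, would reduce false positives at the cost of missing other domains.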
Upvotes: 0