Reputation: 1402
I have a string containing many URL tokens like the following:
[http://www.someurl.com/path/to/resource/?some=params&crazy_chars=true_0_1_0_1]
I want to capture each of them and convert it to:
<a href="http://www.someurl.com/path/to/resource/?some=params&crazy_chars=true_0_1_0_1" target="_blank" class="exturl">http://www.someurl.com/path/to/resource/?some=params&crazy_chars=true_0_1_0_1</a>
So every URL inside square brackets should be found and replaced by an inline anchor element. Currently I have found a regex for the URL pattern:
RegExp("\[(http|ftp|https)://[\w-]+(\.[\w-]+)+([\w.,@?^=%&:/~+#-]*[\w@?^=%&/~+#-])?\]", "gi");
But I am still not clear on how I can do it in a single pass. Do I have to loop until no match is found?
Upvotes: 2
Views: 3238
Reputation: 20838
I would write a helper function that takes a single URL token as input and returns the anchor tag for that URL if it matches. Split the big string into an array in which each element is one [] pair. Then it's just a matter of iterating over this array and passing each element to the helper function:
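function urlify(s)
{
    // Capture the URL between the brackets; the token must start with a scheme.
    var urlpat = /\[((https?|ftp):\/\/\w+[^\]]*)\]/i;
    var matches = urlpat.exec(s);
    var anchor_url = '<a href="%1">%1</a>';
    // A function replacement keeps any "$" in the URL from being read as a special pattern.
    return matches ? anchor_url.replace(/%1/g, function () { return matches[1]; }) : '';
}
var instring = '[http://www.someurl.com/path/to/resource/?some=params&crazy_chars=true_0_1_0_1]' +
    '[@ID 65421]' +
    '[http://google.com]';
// match() returns null when nothing matches, so fall back to an empty array.
var arr = instring.match( /(\[[^\]]+\])/g ) || [];
for (var i = 0; i < arr.length; i++)
{
    arr[i] = urlify(arr[i]);
}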
arr will contain:
[ '<a href="http://www.someurl.com/path/to/resource/?some=params&crazy_chars=true_0_1_0_1">http://www.someurl.com/path/to/resource/?some=params&crazy_chars=true_0_1_0_1</a>',
'',
'<a href="http://google.com">http://google.com</a>' ]
Upvotes: 0
Reputation: 664599
Currently I have found a regex for the URL pattern
But it was intended to be a regex literal, not a string argument to the RegExp constructor. Inside a string literal, your backslashes escape the following characters at the string level and never reach the regex. Instead, use
/\[(http|ftp|https):\/\/[\w-]+(\.[\w-]+)+([\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-])?\]/gi
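To make the difference concrete, here is a minimal sketch comparing the two forms. The pattern is deliberately shortened and the variable names are illustrative, not taken from the question.

```javascript
// In a string literal, "\[" is just "[" and "\w" is just "w":
// the string parser consumes the backslash before RegExp ever sees it.
var lostEscapes = "\[\w\]";
console.log(lostEscapes);

// To pass a pattern as a string, every backslash has to be doubled.
var fromString = new RegExp("\\[(https?|ftp)://[\\w-]+(\\.[\\w-]+)+\\]", "i");

// The regex literal needs no doubling, but each "/" has to be escaped.
var fromLiteral = /\[(https?|ftp):\/\/[\w-]+(\.[\w-]+)+\]/i;

console.log(fromString.test("[http://google.com]"));
console.log(fromLiteral.test("[http://google.com]"));
```

Both forms should behave the same once the backslashes are doubled; the literal is usually easier to read when the pattern is fixed at authoring time.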
But I am still not clear on how I can do it in a single pass. Do I have to loop until no match is found?
No, a simple replace call will suffice. You can put a capturing group around the URL (between the square brackets) and then use the captures in the replacement string:
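var regex = /\[((?:ftp|http)s?:\/\/[\w-]+(?:\.[\w-]+)+(?:[\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-])?)\]/gi;
// The outermost ( ... ) directly inside \[ and \] is the capturing group for the URL.
// (the non-capturing groups are optional)
var urlified = text.replace(regex, '<a href="$1" class="exturl">$1</a>');
// Both occurrences of $1 are replaced with the captured URL.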
For more advanced replacement rules you might use the callback function parameter of replace.
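As a rough sketch of the callback form, the following builds the anchor in a function so the URL can be HTML-escaped and the target attribute from the question added. The names escapeHtml, bracketedUrl and linkify are hypothetical, and the pattern is a simplified assumption rather than the full URL regex above.

```javascript
// Hypothetical helper: escape characters that are special inside HTML.
function escapeHtml(s) {
    return s.replace(/&/g, '&amp;')
            .replace(/</g, '&lt;')
            .replace(/>/g, '&gt;')
            .replace(/"/g, '&quot;');
}

// Simplified pattern (an assumption): a bracketed token that starts with a scheme.
var bracketedUrl = /\[((?:https?|ftp):\/\/[^\]\s]+)\]/gi;

function linkify(text) {
    // The callback receives the whole match first, then each captured group.
    return text.replace(bracketedUrl, function (match, url) {
        var safe = escapeHtml(url);
        return '<a href="' + safe + '" target="_blank" class="exturl">' + safe + '</a>';
    });
}

console.log(linkify('See [http://google.com] and [@ID 65421].'));
```

Tokens that do not start with a scheme, such as [@ID 65421], are left untouched because replace only rewrites the parts that match.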
And of course you might (should) employ the regex improvements/simplifications the other answers suggested.
Upvotes: 2
Reputation: 10867
Let's suppose that:
Then this simple regex will do the trick:
\[[^@#]+\]
\[ matches an opening bracket (symbol needs to be escaped)
[^@#]+ matches any character except @ and #, repeated 1 or more times
\] matches a closing bracket (symbol needs to be escaped)
Upvotes: 0
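A quick way to check how the \[[^@#]+\] pattern from the answer above behaves is to run it against sample input. The sample strings and the stricter variant below are assumptions added for illustration only.

```javascript
var simple = /\[[^@#]+\]/g;

// Adjacent tokens: the [@ID ...] token is skipped because it contains "@".
var adjacent = '[http://google.com][@ID 65421][http://example.com/a]';
console.log(adjacent.match(simple));

// Caveat: [^@#] also matches "]" and "[", so two tokens separated by
// plain text are swallowed as a single match.
var separated = '[http://google.com] and [http://example.com/a]';
console.log(separated.match(simple));

// Excluding "]" as well keeps each token separate.
var stricter = /\[[^@#\]]+\]/g;
console.log(separated.match(stricter));
```

Note that either pattern would also reject a URL containing a # fragment, so this approach only fits input where URLs never contain @ or #.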
Reputation: 46806
JavaScript's regex syntax is more or less the same as Java's.
The JTexy project (something like Markdown, but better) has a lot of regexes for various tasks, including URL matching.
#(?<=^|[\\s(\\[<:\\x17])(?:https?://|www\\.|ftp://)[0-9.$TEXY_CHAR-][/\\d$TEXY_CHAR+\\.~%&?@=_:;\\#,\\xAD-]+[/\\d$TEXY_CHAR+~%?@=_\\#]#u
$TEXY_CHAR is defined somewhere in the project.
By the way, using brackets to enclose URLs isn't really a good idea. For example, PHP uses [...] for initializing hashes, which is often done for checkboxes.
Upvotes: 0