Reputation: 11
I am trying to write a regexp that removes file paths from links and images.
href="path/path/file" to href="file"
href="/file" to href="file"
src="/path/file" to src="file"
and so on...
I thought that I had it working, but it messes up if there are two paths in the string it is working on. I think my expression is too greedy. It finds the very last file in the entire string.
This is my code that shows the expression messing up on the test input:
<script type="text/javascript" src="/javascripts/jquery.js"></script>
<script type="text/javascript">
$(document).ready(function(){
var s = '<a href="one/keepthis"><img src="/one/two/keep.this"></a>';
var t = s.replace(/(src|href)=("|').*\/(.*)\2/gi,"$1=$2$3$2");
alert(t);
});
</script>
It gives the output:
<a href="keep.this"></a>
The correct output should be:
<a href="keepthis"><img src="keep.this"></a>
Thanks for any tips!
Upvotes: 1
Views: 1501
Reputation: 39808
It doesn't have to be a regular expression (assuming / delimiters):
var fileName = url.split('/').pop(); //pop takes the last element
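To apply this to every src and href in a block of HTML, one option is to pass a callback to replace and do the split inside it. This is only a sketch: stripPaths is a hypothetical helper name, and it assumes attribute values are quoted and do not contain their own quote character.

```javascript
// Hypothetical helper (not from the original posts).
function stripPaths(html) {
  // Capture the attribute name, the opening quote, and everything up to
  // the next occurrence of that same quote.
  return html.replace(/(src|href)=(["'])(.*?)\2/gi, function (match, attr, quote, url) {
    // split('/').pop() keeps only the text after the last slash.
    return attr + '=' + quote + url.split('/').pop() + quote;
  });
}

var s = '<a href="one/keepthis"><img src="/one/two/keep.this"></a>';
var t = stripPaths(s);
// t is '<a href="keepthis"><img src="keep.this"></a>'
```

Note that a value ending in a slash, such as href="dir/", comes out empty with this approach.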
Upvotes: 1
Reputation: 361585
Try adding ? to make the * quantifiers non-greedy. You want them to stop matching when they encounter the ending quote character. The greedy versions will barrel right on past the ending quote if there's another quote later in the string, finding the longest possible match; the non-greedy ones will find the shortest possible match.
/(src|href)=("|').*?\/([^/]*?)\2/gi
Also, I changed the second .* to [^/]* so that the first .* can still match the full path now that it's non-greedy.
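One caveat: a lazy .*? can still run past the closing quote when the first attribute value contains no slash at all, because it keeps expanding until it finds one. A possible variant, assuming attribute values never contain quote characters, uses negated character classes so the match cannot leave the quoted value. The names lazy and bounded are just for illustration.

```javascript
// The lazy pattern from above, and a variant that excludes quotes and
// slashes from each path segment so the match stays inside the value.
var lazy = /(src|href)=("|').*?\/([^/]*?)\2/gi;
var bounded = /(src|href)=(["'])(?:[^"'\/]*\/)*([^"'\/]*)\2/gi;

var s = '<a href="one/keepthis"><img src="/one/two/keep.this"></a>';
var viaLazy = s.replace(lazy, '$1=$2$3$2');
var viaBounded = s.replace(bounded, '$1=$2$3$2');
// Both give '<a href="keepthis"><img src="keep.this"></a>'

// An href with no slash, followed by a src that has one:
var tricky = '<a href="file"><img src="/a/b">';
var trickyLazy = tricky.replace(lazy, '$1=$2$3$2');
// trickyLazy is '<a href="b">' (the img tag was swallowed)
var trickyBounded = tricky.replace(bounded, '$1=$2$3$2');
// trickyBounded is '<a href="file"><img src="b">'
```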
Upvotes: 0
Reputation: 11
This seems to work in case anyone else has the problem:
var t = s.replace(/(src|href)=('|")([^ \2]*\/)*\/?([^ \2]*)\2/gi,"$1=$2$4$2");
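A note for anyone copying this: inside a character class, \2 is not a backreference. In JavaScript without the u flag it is read as the character with code 2, so [^ \2] effectively means "anything but a space". The expression still produces the right result on the test input because the final \2 outside the class forces backtracking to a quote. A sketch that states the intent more directly, assuming values contain neither kind of quote, lists both quote characters in the class:

```javascript
// [\2] matches U+0002, not whatever group 2 captured.
var classMatchesQuote = /[\2]/.test('"');        // false
var classMatchesCode2 = /[\2]/.test('\u0002');   // true

// Same structure as the expression above, with explicit quotes in the class.
var s = '<a href="one/keepthis"><img src="/one/two/keep.this"></a>';
var t = s.replace(/(src|href)=('|")([^ '"]*\/)*\/?([^ '"]*)\2/gi, '$1=$2$4$2');
// t is '<a href="keepthis"><img src="keep.this"></a>'
```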
Upvotes: 0
Reputation: 186
I would suggest running separate regex replacements, one for a links and another for img. That is easier and clearer, and thus more maintainable.
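A rough sketch of what that could look like, with one function per tag. The function names are made up for illustration, and the patterns assume quoted attribute values with no quote characters inside them.

```javascript
// Hypothetical helpers, one per tag type.
function stripLinkPaths(html) {
  // Group 1: '<a ... href=', group 2: the quote, group 3: the file name.
  return html.replace(/(<a\b[^>]*?\bhref=)(["'])(?:[^"'\/]*\/)*([^"'\/]*)\2/gi, '$1$2$3$2');
}

function stripImagePaths(html) {
  // Group 1: '<img ... src=', group 2: the quote, group 3: the file name.
  return html.replace(/(<img\b[^>]*?\bsrc=)(["'])(?:[^"'\/]*\/)*([^"'\/]*)\2/gi, '$1$2$3$2');
}

var s = '<a href="one/keepthis"><img src="/one/two/keep.this"></a>';
var linksOnly = stripLinkPaths(s);
// linksOnly is '<a href="keepthis"><img src="/one/two/keep.this"></a>'
var both = stripImagePaths(linksOnly);
// both is '<a href="keepthis"><img src="keep.this"></a>'
```

Keeping the two apart also means a script tag's src, like the jquery.js include in the question, is left alone.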
Upvotes: 0