Reputation: 425
How can I remove all URLs within a string, regardless of where they appear, using JavaScript?
For example, for the following tweet:
"...Ready For It?" (@BloodPop ® Remix) out now - https://example.com/rsKdAQzd2q
I would like to get back
"...Ready For It?" (@BloodPop ® Remix) out now -
Upvotes: 2
Views: 2562
Reputation: 4333
To remove all URLs from the string, you can use a regex to match every URL in it and then use String.prototype.replace
to replace each match with an empty string.
This is John Gruber's regex, which can be used to match URLs.
/\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/g
So to remove all the URLs, just run a replace with the above regex and an empty replacement string:
let originalString = '"...Ready For It?" (@BloodPop ® Remix) out now - https://example.com/rsKdAQzd2q'
let newString = originalString.replace(/\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/g,'')
console.log(newString)
Upvotes: 4
Reputation: 3012
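function removeUrl(input) {
  // Match "http" followed by any run of characters that may appear in a URL.
  // The g flag removes every match; without it only the first URL is replaced.
  let regex = /http[%\?#\[\]@!\$&'\(\)\*\+,;=:~_\.-:\/ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789]*/g;
  let result = input.replace(regex, '');
  return result;
}
let result = removeUrl('abc http://helloWorld" sdfsewr');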
Upvotes: 0
Reputation: 30715
You can use a regular expression replace on the string to do this; however, finding a good expression to match all URLs is awkward. The call itself would look something like:
str = str.replace(regex, '');
The correct regex to use has been the subject of many StackOverflow questions. It depends on whether you need to match only http(s)://xxx.yyy.zzz or a more general pattern such as www.xxx.yyy.
See this question for regex patterns to use: What is the best regular expression to check if a string is a valid URL?
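As a rough illustration of the simpler end of that trade-off, here is a minimal sketch that only handles http(s):// and www. links. The helper name stripUrls and the pattern are assumptions made for this example, not taken from the linked question, and the pattern assumes URLs never contain whitespace.

```javascript
// Hypothetical helper: removes http(s):// and www.-prefixed links from a string.
// Assumption: a URL runs until the next whitespace character.
function stripUrls(str) {
  const urlPattern = /\b(?:https?:\/\/|www\.)\S+/g;
  return str
    .replace(urlPattern, '')   // drop every matched link
    .replace(/\s{2,}/g, ' ')   // collapse the gaps left behind
    .trim();
}

console.log(stripUrls('out now - https://example.com/rsKdAQzd2q'));
```

Because \S+ is greedy, punctuation that directly follows a link (for example a trailing comma) is removed along with it; a stricter pattern would be needed if that matters.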
Upvotes: 0
Reputation: 564
First, you can split the string on spaces:
var givenText = '...Ready For It?" https://example2.com/rsKdAQzd2q (@BloodPop ® Remix) out now - https://example.com/rsKdAQzd2q'
var allWords = givenText.split(' ');
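Then you can keep only the non-URL words, using your own implementation for checking URLs. Here we check for :// for simplicity. Note that the expression must start on the same line as return; otherwise automatic semicolon insertion makes the callback return undefined and every word is dropped.
var allNonUrls = allWords.filter(function(s){
  return s.indexOf('://') === -1; // you can call a custom predicate here
});
So your non-URL string will be: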
var outputText = allNonUrls.join(' ');
// '...Ready For It?" (@BloodPop ® Remix) out now -'
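One possible custom predicate, sketched here under the assumption that only absolute http(s) links count as URLs, relies on the built-in URL constructor, which throws when a string cannot be parsed as an absolute URL. The names isHttpUrl and removeUrlWords are hypothetical and introduced only for this example.

```javascript
// Hypothetical predicate: true when the word parses as an absolute http(s) URL.
function isHttpUrl(word) {
  try {
    var parsed = new URL(word);
    return parsed.protocol === 'http:' || parsed.protocol === 'https:';
  } catch (err) {
    // The URL constructor throws a TypeError for input it cannot parse.
    return false;
  }
}

// Hypothetical wrapper: split on spaces, drop URL words, join again.
function removeUrlWords(text) {
  return text
    .split(' ')
    .filter(function (word) { return !isHttpUrl(word); })
    .join(' ');
}

console.log(removeUrlWords('out now - https://example.com/rsKdAQzd2q'));
```

Unlike the :// check, this will not drop arbitrary words that merely contain ://, but it also will not catch bare www. links, since those are not absolute URLs.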
Upvotes: 0
Reputation: 163577
If your URLs do not contain literal whitespace, you could use the regex https?.*?(?= |$)
to match from http with an optional s up to the next space or the end of the string:
var str = '...Ready For It?" (@BloodPop ® Remix) out now - https://example.com/rsKdAQzd2q';
str = str.replace(/https?.*?(?= |$)/g, "");
console.log(str);
Or split on spaces, check whether each part starts with "http", and remove the parts that do.
var string = "...Ready For It?\" (@BloodPop ® Remix) out now - https://example.com/rsKdAQzd2q";
string = string.split(" ");
// Iterate backwards so that splice() does not skip the element after a removed one.
for (var i = string.length - 1; i >= 0; i--) {
  if (string[i].substring(0, 4) === "http") {
    string.splice(i, 1);
  }
}
console.log(string.join(" "));
Upvotes: 0