Jasjeev

Reputation: 425

Remove all URLs in a string using JavaScript

How can I remove all URLs within a string, regardless of where they appear, using JavaScript?

For example, for the following tweet:

"...Ready For It?" (@BloodPop ® Remix) out now -  https://example.com/rsKdAQzd2q

I would like to get back

"...Ready For It?" (@BloodPop ® Remix) out now - 

Upvotes: 2

Views: 2562

Answers (5)

illiteratewriter

Reputation: 4333

To remove all URLs from a string, use a regex to match them and String.prototype.replace to replace each match with an empty string.

This is John Gruber's regex, which can be used to match URLs.

/\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/g

So to remove all the URLs, just run a replace with the above regex:

let originalString = '"...Ready For It?" (@BloodPop ® Remix) out now -  https://example.com/rsKdAQzd2q'
let newString = originalString.replace(/\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/g,'')
console.log(newString)

Upvotes: 4

yonexbat

Reputation: 3012

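function removeUrl(input) {
  // Matches "http" followed by any run of characters that may appear in a URL.
  // The g flag removes every match; without it only the first URL is removed.
  // The hyphen sits last in the class so it is a literal, not a range.
  const regex = /http[%?#\[\]@!$&'()*+,;=:~_.\/A-Za-z0-9-]*/g;
  return input.replace(regex, '');
}

let result = removeUrl('abc http://helloWorld" sdfsewr');
console.log(result); // abc " sdfsewr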

Upvotes: 0

Terry Lennox

Reputation: 30715

You can use a regular-expression replace on the string to do this; however, finding a good expression to match all URLs is awkward. The call itself looks something like:

str = str.replace(regex, '');

The correct regex to use has been the subject of many Stack Overflow questions. It depends on whether you need to match only http(s)://xxx.yyy.zzz or a more general pattern such as www.xxx.yyy.

See this question for regex patterns to use: What is the best regular expression to check if a string is a valid URL?
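
As a rough sketch of that trade-off, assuming URLs never contain whitespace: the two patterns below are illustrative only and do not validate URLs. The names httpOnly, httpOrWww, and stripUrls are hypothetical and do not come from the linked question.

```javascript
// Hypothetical patterns for illustration; neither one validates URLs.
// Matches only http:// or https:// followed by non-whitespace characters.
const httpOnly = /https?:\/\/\S+/g;
// Also matches bare "www." hosts that have no scheme.
const httpOrWww = /(?:https?:\/\/|www\.)\S+/g;

// Removes every match of the given global regex from str.
function stripUrls(str, regex) {
  return str.replace(regex, '');
}

const sample = 'see www.example.com and https://example.com/a?b=1 now';
console.log(stripUrls(sample, httpOnly));  // see www.example.com and  now
console.log(stripUrls(sample, httpOrWww)); // see  and  now
```

The broader pattern also removes more false positives, for example any word that merely contains "www.", so which one fits depends on the input you expect.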

Upvotes: 0

vijay22uk

Reputation: 564

First, you can split it on spaces:

var givenText = '...Ready For It?" https://example2.com/rsKdAQzd2q (@BloodPop ® Remix) out now -  https://example.com/rsKdAQzd2q'
var allWords = givenText.split(' ');

Then you can filter out the URL words using your own implementation for checking URLs; here we check for the index of :// for simplicity:

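var allNonUrls = allWords.filter(function (s) {
  // The expression must start on the same line as return; otherwise automatic
  // semicolon insertion makes the callback return undefined for every word.
  return s.indexOf('://') === -1; // you can call a custom predicate here
});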

So your non-URL string will be:

var outputText = allNonUrls.join(' ');
// "...Ready For It?" (@BloodPop ® Remix) out now - "
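
One possible custom predicate, sketched here under the assumption that only absolute http(s) URLs should be dropped, relies on the standard URL constructor, which throws when a string cannot be parsed as an absolute URL. The names isHttpUrl and removeUrlWords are hypothetical.

```javascript
// Hypothetical predicate: true when the word parses as an absolute http(s) URL.
function isHttpUrl(word) {
  try {
    const url = new URL(word);
    return url.protocol === 'http:' || url.protocol === 'https:';
  } catch (e) {
    // new URL throws a TypeError for strings it cannot parse as absolute URLs.
    return false;
  }
}

// Splits on single spaces, drops the words the predicate accepts, and rejoins.
function removeUrlWords(text) {
  return text
    .split(' ')
    .filter(function (word) { return !isHttpUrl(word); })
    .join(' ');
}

console.log(removeUrlWords('out now - https://example.com/rsKdAQzd2q')); // out now -
```

Unlike the :// check, this does not treat a word such as a://b as a URL to remove, but it also misses scheme-less forms like www.example.com.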

Upvotes: 0

The fourth bird

Reputation: 163577

If your URLs do not contain literal whitespace, you could use the regex https?.*?(?= |$) to match from http, with an optional s, up to the next space or the end of the string:

var str = '...Ready For It?" (@BloodPop ® Remix) out now -  https://example.com/rsKdAQzd2q';
str = str.replace(/https?.*?(?= |$)/g, "");
console.log(str);
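
Note that this pattern stops only at a space character or at the end of the whole string, so a URL followed directly by a newline is left in place. A sketch of a variant that treats any whitespace as the end of a URL is shown below; the names spaceOnly, anyWhitespace, and twoLines are made up for the comparison.

```javascript
// Pattern from the answer: stops only at a space or at the end of the input.
const spaceOnly = /https?.*?(?= |$)/g;
// Assumed variant: any whitespace (space, tab, newline) ends the URL.
const anyWhitespace = /https?:\/\/\S+/g;

const twoLines = 'a https://example.com/1\nb https://example.org/2';

// The first URL is followed by a newline, so spaceOnly cannot match it.
console.log(JSON.stringify(twoLines.replace(spaceOnly, '')));     // "a https://example.com/1\nb "
console.log(JSON.stringify(twoLines.replace(anyWhitespace, ''))); // "a \nb "
```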

Or split on spaces, check whether each part starts with "http", and remove it if so.

var string = "...Ready For It?\" (@BloodPop ® Remix) out now -  https://example.com/rsKdAQzd2q";
string = string.split(" ");

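// Iterate backwards: splice shifts later elements down, so a forward loop
// would skip the part that follows each removed URL.
for (var i = string.length - 1; i >= 0; i--) {
  if (string[i].substring(0, 4) === "http") {
    string.splice(i, 1);
  }
}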
console.log(string.join(" "));

Upvotes: 0
