Reputation: 83
I'm trying to extract the domain name from each string in tweets. How can I avoid also extracting the double forward slashes that precede it? The regular expression I'm using is defined in let url:
let tweets = [
"Thank you to the Academy and the incredible cast & crew of #TheRevenant. #Oscars",
"@HardingCompSci department needs student volunteers for #HourOfCode https://hourofcode.com/us",
"Checkout the most comfortable earbud on #Kickstarter and boost your #productivity https://www.kickstarter.com/",
"Curious to see how #StephenCurry handles injury. http://mashable.com/2016/04/25/steph-curry-knee-injury-cries-cried/"
];
let url = /\/\/.+?\.com?/;
tweets.forEach(function(tweet) {
  console.log(url.exec(tweet));
});
Upvotes: 3
Views: 568
Reputation: 42055
A part of a pattern can be enclosed in parentheses (...). This is called a “capturing group”.
That has two effects:
1. It lets you get a part of the match as a separate item in the result array.
2. If we put a quantifier after the parentheses, it applies to the parentheses as a whole.
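As a minimal sketch of those two effects (the sample strings and variable names here are made up for illustration, not taken from the question):

```javascript
// Effect 1: each parenthesised part is returned as its own item.
const dateMatch = /(\d{4})-(\d{2})/.exec("released 2016-04");
console.log(dateMatch[0]); // whole match: "2016-04"
console.log(dateMatch[1]); // first group: "2016"
console.log(dateMatch[2]); // second group: "04"

// Effect 2: a quantifier after the parentheses repeats the whole group.
const repeated = /(go)+/.exec("gogogo now");
console.log(repeated[0]); // whole match: "gogogo"
console.log(repeated[1]); // group content from the last repetition: "go"
```

Index 0 of the result array is always the full match; the groups follow in the order of their opening parentheses.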
In your code you have let url = /\/\/.+?\.com?/;
You are only interested in the part following the two slashes, so make a capturing group for that part by enclosing it in parentheses: let url = /\/\/(.+?\.com?)/;
Then change the code in the loop slightly to read the result from the first capturing group, and you end up with:
let tweets = [
"Thank you to the Academy and the incredible cast & crew of #TheRevenant. #Oscars",
"@HardingCompSci department needs student volunteers for #HourOfCode https://hourofcode.com/us",
"Checkout the most comfortable earbud on #Kickstarter and boost your #productivity https://www.kickstarter.com/",
"Curious to see how #StephenCurry handles injury. http://mashable.com/2016/04/25/steph-curry-knee-injury-cries-cried/"
];
let url = /\/\/(.+?\.com?)/;
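tweets.forEach(function(tweet) {
  // exec() returns null when the tweet contains no URL
  const match = url.exec(tweet);
  // match[1] is the first capturing group: the domain without the leading "//"
  console.log(match && match[1] || match);
});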
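To make the difference between the whole match and the group easier to see, here is a small sketch that wraps the same pattern in a helper. The name extractDomain is hypothetical and only used for illustration:

```javascript
// Same pattern as above: "//" followed by a captured, lazily matched domain.
const urlPattern = /\/\/(.+?\.com?)/;

// Hypothetical helper: returns the captured domain, or null if nothing matches.
function extractDomain(tweet) {
  const match = urlPattern.exec(tweet);
  return match ? match[1] : null;
}

const sample = "needs student volunteers for #HourOfCode https://hourofcode.com/us";
console.log(urlPattern.exec(sample)[0]); // whole match, slashes included: "//hourofcode.com"
console.log(extractDomain(sample));      // group only: "hourofcode.com"
console.log(extractDomain("no link in this tweet")); // null
```

Note that the trailing m? in the pattern makes the final "m" optional, so a host such as example.co.uk would be cut off at "example.co". That behaviour comes from the original pattern and is left unchanged here.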
Upvotes: 1
Reputation: 13047
I made a quick script for your question using the new URL() constructor.
It splits each tweet into words and tests each word. When a URL is found, it is added to the urls array.
let tweets = [
"Thank you to the Academy and the incredible cast & crew of #TheRevenant. #Oscars",
"@HardingCompSci department needs student volunteers for #HourOfCode https://hourofcode.com/us",
"Checkout the most comfortable earbud on #Kickstarter and boost your #productivity https://www.kickstarter.com/",
"Curious to see how #StephenCurry handles injury. http://mashable.com/2016/04/25/steph-curry-knee-injury-cries-cried/"
];
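let urls = [];
// Collect every word of the string that parses as a URL
function getURL(me) {
  me.split(" ").forEach(function(e) {
    try {
      new URL(e); // throws a TypeError if e is not a valid URL
      console.log(e + " is a valid URL!");
      urls.push(e);
    } catch (error) {
      console.log(error.message);
    }
  });
}
tweets.forEach(function(tweet) {
  getURL(tweet);
});
// Look the element up explicitly instead of relying on the implicit global for id="url"
document.getElementById("url").innerHTML = urls.join("<br>");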
<div id="url"></div>
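Since the question asks for the domain rather than the full link, the same idea can be taken one step further by reading the hostname property of the parsed URL. This is only a sketch: the helper name getDomains is hypothetical, and restricting the result to http and https links is an assumption about what counts as a link in a tweet.

```javascript
// Hypothetical helper: returns the host names of all http(s) URLs in a tweet.
function getDomains(tweet) {
  const domains = [];
  for (const word of tweet.split(" ")) {
    try {
      const parsed = new URL(word); // throws a TypeError for non-URL words
      // Other words containing a colon may also parse, so keep only web links.
      if (parsed.protocol === "http:" || parsed.protocol === "https:") {
        domains.push(parsed.hostname);
      }
    } catch (error) {
      // Not a URL; skip this word.
    }
  }
  return domains;
}

console.log(getDomains("volunteers for #HourOfCode https://hourofcode.com/us"));
console.log(getDomains("Thank you to the Academy #Oscars"));
```

The first call should log an array containing "hourofcode.com" and the second an empty array. Unlike the regular expression, this approach is not tied to .com addresses.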
Upvotes: 0