xxddd_69

Reputation: 83

extract domain name from url javascript

I'm trying to extract the domain name from each string in tweets. How can I avoid capturing the two forward slashes that precede it? The regular expression I'm using is defined in let url:

let tweets = [
  "Thank you to the Academy and the incredible cast & crew of #TheRevenant. #Oscars",
  "@HardingCompSci department needs student volunteers for #HourOfCode https://hourofcode.com/us",
  "Checkout the most comfortable earbud on #Kickstarter and boost your #productivity https://www.kickstarter.com/",
  "Curious to see how #StephenCurry handles injury. http://mashable.com/2016/04/25/steph-curry-knee-injury-cries-cried/"
];


let url = /\/\/.+?\.com?/;

tweets.forEach(function(tweet) {
  console.log(url.exec(tweet));
});

Upvotes: 3

Views: 568

Answers (2)

Stijn de Witt

Reputation: 42055

Use a Capturing Group

A part of a pattern can be enclosed in parentheses (...). This is called a “capturing group”.

That has two effects:

It lets you get a part of the match as a separate item in the result array.

If we put a quantifier after the parentheses, it applies to the parentheses as a whole.
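Both effects can be seen in a short sketch. The sample strings and variable names here are made up for illustration only:

```javascript
// Effect 1: the text matched by the group is a separate item in the result array.
const match = /\/\/(.+?\.com?)/.exec("see https://hourofcode.com/us");
const wholeMatch = match[0]; // the full match, including the two slashes
const groupOne = match[1];   // only the part inside the parentheses

// Effect 2: a quantifier after the parentheses repeats the whole group.
const repeated = /(go)+/.exec("gogogo!"); // (go)+ repeats "go" as a unit
const single = /go+/.exec("gooo gogo");   // go+ repeats only the "o"

console.log(wholeMatch, groupOne, repeated[0], single[0]);
```

Index 0 of the result array always holds the whole match; groups are numbered from 1 in the order their opening parentheses appear.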

In your code you have let url = /\/\/.+?\.com?/;

You are only interested in the part following the two slashes, so make a capturing group for that by enclosing it in parentheses: let url = /\/\/(.+?\.com?)/;

Then change the code in the loop a bit to get the result from the first capturing group and you end up with:

let tweets = [
  "Thank you to the Academy and the incredible cast & crew of #TheRevenant. #Oscars",
  "@HardingCompSci department needs student volunteers for #HourOfCode https://hourofcode.com/us",
  "Checkout the most comfortable earbud on #Kickstarter and boost your #productivity https://www.kickstarter.com/",
  "Curious to see how #StephenCurry handles injury. http://mashable.com/2016/04/25/steph-curry-knee-injury-cries-cried/"
];


let url = /\/\/(.+?\.com?)/;

tweets.forEach(function(tweet) {
  const match = url.exec(tweet);
  // Print the first capturing group, or null when the tweet contains no URL
  console.log(match ? match[1] : null);
});
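Note that the pattern above only recognises hosts ending in .co or .com. If other top-level domains matter, one possible variation (an assumption on my part, not something the question asked for) is to capture everything between the scheme and the next slash or whitespace. The names hostPattern and extractHost are hypothetical:

```javascript
// Capture the host: everything after "http://" or "https://"
// up to the next "/" or whitespace character.
const hostPattern = /https?:\/\/([^\/\s]+)/;

function extractHost(text) {
  const match = hostPattern.exec(text);
  return match ? match[1] : null; // null when the text contains no URL
}

console.log(extractHost("volunteers for #HourOfCode https://hourofcode.com/us"));
console.log(extractHost("no link in this tweet"));
```

This keeps any "www." prefix and would also include a port number if one were present, so treat it as a starting point rather than a complete URL parser.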

Upvotes: 1

NVRM

Reputation: 13047

Made a quick script for your query, using the new URL() constructor.

It splits each tweet into words and tests each one. When a URL is found, it is pushed onto the urls array.

let tweets = [
       "Thank you to the Academy and the incredible cast & crew of #TheRevenant. #Oscars",
       "@HardingCompSci department needs student volunteers for #HourOfCode https://hourofcode.com/us",
       "Checkout the most comfortable earbud on #Kickstarter and boost your #productivity https://www.kickstarter.com/",
       "Curious to see how #StephenCurry handles injury. http://mashable.com/2016/04/25/steph-curry-knee-injury-cries-cried/"
    ];
 
let urls = []
 
function getURL(me){
  me.split(" ").forEach(function(e){
    try { 
      new URL(e);
      console.log(e + " is a valid URL!")
      urls.push(e)
    } 
    catch (error){
      console.log(error.message);
    }
  })

}

tweets.forEach(function(tweet){
  getURL(tweet)
})

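// Look the element up explicitly instead of relying on the implicit global created by its id
document.getElementById("url").innerHTML = urls.join("<br>")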
<div id="url"></div>
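The script above collects whole URLs. Since the question asks for the domain name, a small variation can read the hostname property of the parsed URL instead. This is a sketch only: getHostnames is a hypothetical helper, and the check on protocol is my own addition, there to skip words such as "a:b" that the URL constructor would otherwise accept.

```javascript
// Return the hostnames of all http(s) URLs found in a piece of text.
function getHostnames(text) {
  const hostnames = [];
  for (const word of text.split(/\s+/)) {
    try {
      const parsed = new URL(word); // throws a TypeError if word is not a URL
      if (parsed.protocol === "http:" || parsed.protocol === "https:") {
        hostnames.push(parsed.hostname);
      }
    } catch (error) {
      // Not a URL: ignore this word and move on.
    }
  }
  return hostnames;
}

console.log(getHostnames("student volunteers for #HourOfCode https://hourofcode.com/us"));
```

Unlike the regular expression approach, this does not depend on the top-level domain, and the two slashes never appear in the result because hostname excludes the scheme.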

Upvotes: 0
