user8987378

Reputation: 332

Find links in an HTML string with regex

I need to find all links in an HTML string, looking only at href attributes, and handle both double quotes ("") and single quotes ('').

example:

<a href='text'>

or

<div href="text">;

I came up with:

function findHrefValues(str) {
  let hrefs = [];
  let pattern = /href='([^']+)'/g;
  let match = pattern.exec(str);
  if(match && Array.isArray(match)) {
    match.forEach((href)=> {
      if(href) hrefs.push(href);
    });
  }
  return hrefs;
}

But it's not working well: it doesn't recognize double quotes.

Upvotes: 2

Views: 578

Answers (2)

CertainPerformance

Reputation: 370759

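Capture the opening ' or " right after href, lazily match the characters that follow, then use a backreference to the captured quote to match the closing quote of the href. Note that a backreference cannot be used inside a character set: [^\1] does not mean "anything but the captured quote" (in a JavaScript regex without the u flag, \1 inside a set is an octal escape for the character U+0001), so it is the lazy quantifier that stops the match at the closing quote:

const str = `<a href='tex""t1'>
<div href="tex''t2">`;

function findHrefValues(str) {
  // Group 1 is the opening quote, group 2 the value; \1 matches that same quote again
  const re = /href=(['"])(.+?)\1/g;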
  const matches = [];
  let match;
  while ((match = re.exec(str)) !== null) {
    matches.push(match[2]);
  }
  return matches;
}

console.log(findHrefValues(str));
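
For comparison, here is a minimal sketch of the same backreference idea written with String.prototype.matchAll instead of a manual exec loop. The helper name findHrefValuesAll and the extra tolerance for whitespace around =, for uppercase HREF, and for empty values are assumptions added for illustration, not part of the answer above:

```javascript
// Hypothetical helper (not from the answer above): collects href values
// with String.prototype.matchAll instead of a manual exec loop.
function findHrefValuesAll(html) {
  // \s* tolerates spaces around "=", the i flag tolerates HREF=,
  // group 1 is the opening quote and group 2 the (possibly empty) value.
  const re = /href\s*=\s*(['"])(.*?)\1/gi;
  return Array.from(html.matchAll(re), (m) => m[2]);
}

const sample = `<a href='tex""t1'>
<div HREF = "tex''t2">
<a href="">`;

// Logs an array with three entries: tex""t1, tex''t2 and an empty string.
console.log(findHrefValuesAll(sample));
```

matchAll requires the g flag and returns an iterator of match arrays, so Array.from with a mapping function is enough to pull out group 2.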

But, if at all possible, don't use a regular expression for this - parse the HTML string instead, possibly with DOMParser:

const str = `<a href='text1'>
<div href="text2">`;
const doc = new DOMParser().parseFromString(str, 'text/html');
const hrefs = Array.from(
  doc.querySelectorAll('[href]'),
  element => element.getAttribute('href')
);
console.log(hrefs);

Upvotes: 3

D Satish Kumar Achary

Reputation: 66

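You can try this pattern, which excludes both quote characters from the value so that a match cannot run past the closing quote (the href value is in the second capture group):

let pattern=/href=('|")([^'"]+)('|")/g;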
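
A pattern of this shape does not tie the closing quote to the opening one. One way to do that without a backreference is to give each quote style its own alternative. The following is only a sketch under that assumption, and findHrefValuesAlt is a hypothetical name:

```javascript
// Hypothetical helper (not from the answer above): one alternative per
// quote style, so the closing quote always matches the opening one.
function findHrefValuesAlt(html) {
  // Group 1 is filled for single-quoted values, group 2 for double-quoted ones.
  const re = /href=(?:'([^']*)'|"([^"]*)")/g;
  const values = [];
  let match;
  while ((match = re.exec(html)) !== null) {
    // Only one of the two groups participates in any given match.
    values.push(match[1] !== undefined ? match[1] : match[2]);
  }
  return values;
}

// Logs an array with three entries: a, b"c and d.
console.log(findHrefValuesAlt(`<a href="a"> <a href='b"c'> <a href="d">`));
```

Because each alternative excludes only its own quote character, a single-quoted value may contain double quotes and vice versa.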

Upvotes: 1
