Reputation: 332
I need to find all links in an HTML string, but only the values of href attributes, whether they are wrapped in double quotes ("") or single quotes ('').
Example:
<a href='text'>
or
<div href="text">;
I came up with:
function findHrefValues(str) {
  let hrefs = [];
  let pattern = /href='([^']+)'/g;
  let match = pattern.exec(str);
  if(match && Array.isArray(match)) {
    match.forEach((href)=> {
      if(href) hrefs.push(href);
    });
  }
  return hrefs;
}
But it's not working well: it doesn't recognize double quotes.
Upvotes: 2
Views: 578
Reputation: 370759
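Capture the opening quote (' or ") right after href= in a group, lazily match the characters that follow, then use a backreference (\1) to require that same quote character at the end of the href value. A backreference has no special meaning inside a character set (in [^\1] the \1 is read as a character escape, not as the captured quote), so it is the lazy quantifier that stops the match at the closing quote:
const str = `<a href='tex""t1'>
<div href="tex''t2">`;
function findHrefValues(str) {
  // (['"]) captures the opening quote, [\s\S]+? lazily matches the value,
  // and \1 requires the same quote character to close it.
  const re = /href=(['"])([\s\S]+?)\1/g;
  const matches = [];
  let match;
  // With the g flag, each exec() call resumes after the previous match.
  while ((match = re.exec(str)) !== null) {
    matches.push(match[2]);
  }
  return matches;
}
console.log(findHrefValues(str));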
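As a variation on the loop above, here is a minimal sketch that assumes an ES2020 environment where String.prototype.matchAll is available. The helper name findHrefValuesWithMatchAll and the sample string are made up for illustration. The pattern follows the same idea: capture the quote, match lazily, close with a backreference.

```javascript
// Hypothetical helper: same idea as the exec() loop, written with matchAll (ES2020).
function findHrefValuesWithMatchAll(html) {
  // Group 1 captures the opening quote; \1 requires the same quote to close the value.
  const re = /href=(['"])([\s\S]+?)\1/g;
  // matchAll needs the g flag and yields one match array per occurrence;
  // index 2 of each match array is the attribute value.
  return Array.from(html.matchAll(re), (m) => m[2]);
}

const sample = `<a href='tex""t1'>
<div href="tex''t2">`;
console.log(findHrefValuesWithMatchAll(sample));
```

This avoids the mutable lastIndex state that exec() relies on, at the cost of requiring a newer runtime.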
But, if at all possible, don't use a regular expression for this. Parse the HTML string instead, possibly with DOMParser:
const str = `<a href='text1'>
<div href="text2">`;
const doc = new DOMParser().parseFromString(str, 'text/html');
const hrefs = Array.from(
  doc.querySelectorAll('[href]'),
  element => element.getAttribute('href')
);
console.log(hrefs);
Upvotes: 3
Reputation: 66
You can try this pattern, which accepts either quote character (the value is in capture group 2):
let pattern = /href=('|")([^'"]+)\1/g;
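A minimal sketch of how a pattern of this shape might be used. The helper name extractHrefs and the sample input are hypothetical. The character class here excludes both quote characters, which is an assumption on my part: it means values that themselves contain a quote are not matched.

```javascript
// Hypothetical helper: collects capture group 2 (the href value) from every match.
function extractHrefs(html) {
  // ('|") captures the opening quote, [^'"]+ matches a value that contains
  // no quote characters, and \1 requires the same quote to close the value.
  const pattern = /href=('|")([^'"]+)\1/g;
  const hrefs = [];
  let match;
  while ((match = pattern.exec(html)) !== null) {
    hrefs.push(match[2]);
  }
  return hrefs;
}

const html = `<a href='text1'><div href="text2">`;
console.log(extractHrefs(html));
```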
Upvotes: 1