Reputation: 903
I have a string on a server (so no DOM parsing is available) in which I need to replace all matches and return the updated string. The challenging part is that each match contains a part that I need to preserve.
const data =
'<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod <a data-type=\"embed\" href="http://google.com">Link</a> tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud <a data-type=\"embed\" href="http://wired.com">Link</a> exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur <a data-type=\"embed\" href="http://tested.com">Link</a> sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.';
const regex = /<a data-type=\"embed\" href=\"http:\/\/.*?\">Link<\/a>/g;
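// Collect every embed link and build the replacement markup for each one.
const newData = [...data.matchAll(regex)].map((match) => {
  const [string] = match; // the full matched <a> tag
  // Drop everything before the URL; filter removes the empty leading piece.
  const [end] = string
    .split(/<a data-type=\"embed\" href=\"/g)
    .filter((val) => val);
  // Cut off the trailing `">Link</a>` so only the URL remains.
  const [href] = end.split(/\">/g);
  const newString = `<div data-href="${href}"></div>`;
  return newString;
});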
console.log(newData);
So in data there are three matches, e.g. <a href="http://google.com">Link</a>, from which I need to extract http://google.com etc. and "inject" it into the updated string. I managed to do this, but the question is how to replace the matches in the original string.
In fact, I have a working solution with split/join:
const data =
'<p>Lorem ipsum dolor sit amet, <a data-type=\"embed\" href="http://google.com">Another link</a> consectetur adipiscing elit, sed do eiusmod <a data-type=\"embed\" href="http://google.com">Link</a> tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud <a data-type=\"embed\" href="http://wired.com">Link</a> exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur <a data-type=\"embed\" href="http://tested.com">Link</a> sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.';
const newData = data
  .split('<a data-type=\"embed\" href="')
  .join('<div data-href="')
  .split('">Link</a>')
  .join('"></div>');
console.log(newData);
This solution, however, produces undesired results if there is no match in the second split/join, and it can also modify links that are not supposed to be modified.
So my goal is to get a single updated string, as in the second case, but with the correct replacements, as in the first one.
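To make the second failure mode concrete, here is a minimal sketch. The `sample` string is a hypothetical input (not from the data above) that mixes one embed link with one ordinary link; the two split/join passes run independently, so the second pass also rewrites the closing part of the ordinary link.

```javascript
// Hypothetical input: one embed link plus one ordinary link that should stay untouched.
const sample =
  '<a data-type="embed" href="http://google.com">Link</a> and <a href="http://example.com">Link</a>';

const viaSplitJoin = sample
  .split('<a data-type="embed" href="') // pass 1 only touches the embed link
  .join('<div data-href="')
  .split('">Link</a>') // pass 2 matches the closing part of BOTH links
  .join('"></div>');

console.log(viaSplitJoin);
// <div data-href="http://google.com"></div> and <a href="http://example.com"></div>
```

The ordinary link now opens with `<a` and closes with `</div>`, which is the kind of broken markup described above.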
Upvotes: 0
Views: 82
Reputation: 7616
Here is a simple solution to get a single string back with links modified as desired:
const data =
'<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod <a data-type=\"embed\" href="http://google.com">Link</a> tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud <a data-type=\"embed\" href="http://wired.com">Link</a> exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur <a data-type=\"embed\" href="http://tested.com">Link</a> sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.';
const regex = /<a data-type="embed" href="(http:\/\/[^"]+)">Link<\/a>/g;
// $1 inserts the URL captured by the parenthesized group in the regex.
const result = data.replace(regex, '<div data-href="$1"></div>');
console.log(result);
Resulting output:
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod <div data-href="http://google.com"></div> tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud <div data-href="http://wired.com"></div> exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur <div data-href="http://tested.com"></div> sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
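If the link text is not always `Link` (the question's second sample contains `Another link`), the same idea may be written with a replacer function. This is only a sketch under assumptions: the name `replaceEmbeds`, the pattern `embedLink`, and the `input` string are hypothetical, and the pattern assumes the attributes appear in exactly this order and the link text contains no nested tags.

```javascript
// Assumption: attributes appear in this exact order and the link text has no nested tags.
const embedLink = /<a data-type="embed" href="(https?:\/\/[^"]+)">[^<]*<\/a>/g;

// Replace every embed link with a <div> that carries the captured URL.
function replaceEmbeds(html) {
  return html.replace(embedLink, (match, href) => `<div data-href="${href}"></div>`);
}

// Hypothetical input: one embed link and one ordinary link.
const input =
  '<p>A <a data-type="embed" href="http://google.com">Another link</a> B <a href="http://wired.com">Link</a></p>';
const output = replaceEmbeds(input);

console.log(output);
// <p>A <div data-href="http://google.com"></div> B <a href="http://wired.com">Link</a></p>
```

Because the whole tag is matched in one pattern, the link without `data-type="embed"` is left as it was.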
Old answer before clarification of desired output:
Here is a working solution that returns an array of the replacement strings:
const data = '<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod <a data-type=\"embed\" href="http://google.com">Link</a> tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud <a data-type=\"embed\" href="http://wired.com">Link</a> exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur <a data-type=\"embed\" href="http://tested.com">Link</a> sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.';
const regex = /<a data-type="embed" href="(http:\/\/[^"]+)">Link<\/a>/g;
function extractLinks(data) {
  // Capture group 1 of each match holds the URL of an embed link.
  return [...data.matchAll(regex)].map(
    ([, link]) => '<div data-href="' + link + '"></div>'
  );
}
console.log(extractLinks(data));
Upvotes: 1
Reputation: 178061
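This answers the question as it stood before I was told the string lives on a server; the code relies on `document`, so it needs a browser or another DOM implementation.
Don't use regex to parse HTML.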
const data = '<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod <a data-type=\"embed\" href="http://google.com">Link</a> tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud <a data-type=\"embed\" href="http://wired.com">Link</a> exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur <a data-type=\"embed\" href="http://tested.com">Link</a> sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>';
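const container = document.createElement("div");
container.innerHTML = data;
// Swap each embed link for a <div> that carries the link's URL.
[...container.querySelectorAll("[data-type=embed]")].forEach((link) => {
  console.log(link);
  const div = document.createElement("div");
  // getAttribute returns the href as written; link.href would return the normalized URL.
  div.dataset.href = link.getAttribute("href");
  link.parentNode.replaceChild(div, link);
});
console.log(container.innerHTML);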
Upvotes: 2