Mitya Ustinov

Reputation: 903

Replace all matches in string and return updated string

I have a string on a server (so no DOM parsing is available) in which I need to replace all matches and return the updated string. The challenging part is that each match contains a part that must be preserved.

const data =
  '<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod <a data-type=\"embed\" href="http://google.com">Link</a> tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud <a data-type=\"embed\" href="http://wired.com">Link</a> exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur <a data-type=\"embed\" href="http://tested.com">Link</a> sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.';
const regex = /<a data-type=\"embed\" href=\"http:\/\/.*?\">Link<\/a>/g;
const newData = [...data.matchAll(regex)].map((match) => {
  const [string] = match;
  const [end] = string
    .split(/<a data-type=\"embed\" href=\"/g)
    .filter((val) => val);
  const [href] = end.split(/\">/g);
  const newString = `<div data-href="${href}"></div>`;
  return newString;
});
console.log(newData);

So data contains three matches, e.g. <a data-type="embed" href="http://google.com">Link</a>, from which I need to extract http://google.com etc. and "inject" it into the updated string. I managed to do this, but the question is how to replace the matches in the original string.

In fact I have a working solution with split/join:

const data =
  '<p>Lorem ipsum dolor sit amet, <a data-type=\"embed\" href="http://google.com">Another link</a> consectetur adipiscing elit, sed do eiusmod <a data-type=\"embed\" href="http://google.com">Link</a> tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud <a data-type=\"embed\" href="http://wired.com">Link</a> exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur <a data-type=\"embed\" href="http://tested.com">Link</a> sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.';
const newData = data
  .split('<a data-type=\"embed\" href="')
  .join('<div data-href="')
  .split('">Link</a>')
  .join('"></div>');
console.log(newData);

However, this solution produces undesired results if the second split/join finds no match, and it can also modify links that are not supposed to be modified.
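To make the problem concrete, here is a minimal sketch of both failure modes. The input string is made up for illustration and is not my real data: it has one plain link that should stay untouched and one embed link whose text is not "Link".

```javascript
// Made-up input: a plain link ending in ">Link</a> and an embed link with other text.
const mixed =
  '<a href="http://example.com">Link</a> and ' +
  '<a data-type="embed" href="http://google.com">Other text</a>';

const broken = mixed
  .split('<a data-type="embed" href="')
  .join('<div data-href="')
  .split('">Link</a>')
  .join('"></div>');

console.log(broken);
// <a href="http://example.com"></div> and <div data-href="http://google.com">Other text</a>
```

The plain link loses its closing tag, and the embed link gets an opening <div> without a matching close, because the two split/join passes know nothing about each other.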

So my goal is to get a result like in the second case (a single updated string), but with the correct replacements from the first one.

Upvotes: 0

Views: 82

Answers (2)

Peter Thoeny

Reputation: 7616

Here is a simple solution to get a single string back with links modified as desired:

const data =
  '<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod <a data-type=\"embed\" href="http://google.com">Link</a> tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud <a data-type=\"embed\" href="http://wired.com">Link</a> exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur <a data-type=\"embed\" href="http://tested.com">Link</a> sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.';
const regex = /<a data-type="embed" href="(http:\/\/[^"]+)">Link<\/a>/g;

// $1 in the replacement string refers to the first capture group (the URL).
const result = data.replace(regex, '<div data-href="$1"></div>');
console.log(result);

Resulting output:

<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod <div data-href="http://google.com"></div> tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud <div data-href="http://wired.com"></div> exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur <div data-href="http://tested.com"></div> sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
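If the replacement ever needs logic beyond a `$1` substitution, `String.prototype.replace` also accepts a callback. The following is only a sketch under assumptions: the helper name `replaceEmbeds` and the shortened sample string are hypothetical, and the pattern is loosened to accept `https` as well, which the question does not ask for.

```javascript
// Assumption: https links should be handled too; drop the "s?" to match the answer above exactly.
const embedLink = /<a data-type="embed" href="(https?:\/\/[^"]+)">Link<\/a>/g;

// Hypothetical helper: returns a new string with every embed link replaced.
function replaceEmbeds(html) {
  // The callback receives the full match first, then each capture group.
  return html.replace(embedLink, (match, href) => `<div data-href="${href}"></div>`);
}

const sample =
  '<p>One <a data-type="embed" href="http://google.com">Link</a> two ' +
  '<a href="http://wired.com">Link</a> three</p>';

console.log(replaceEmbeds(sample));
// <p>One <div data-href="http://google.com"></div> two <a href="http://wired.com">Link</a> three</p>
```

The link without `data-type="embed"` is left alone, because the whole anchor has to match before anything is replaced.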

Old answer before clarification of desired output:

Here is a working solution:

const data = '<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod <a data-type=\"embed\" href="http://google.com">Link</a> tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud <a data-type=\"embed\" href="http://wired.com">Link</a> exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur <a data-type=\"embed\" href="http://tested.com">Link</a> sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.';

const regex = /<a data-type="embed" href="(http:\/\/[^"]+)">Link<\/a>/g;

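function extractLinks(data) {
    const links = [];
    // replace() is used here only to visit each match; its return value is ignored.
    data.replace(regex, (m, p1) => {
        links.push(p1); // p1 is the URL captured by the first group
    });
    // Wrap each captured URL in the replacement markup.
    return links.map((link) => {
        return '<div data-href="' + link + '"></div>';
    });
}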

console.log(extractLinks(data));
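The same extraction can probably be written with `matchAll`, which the question already uses, so that `replace` is not called just for its side effects. This is a sketch only; the name `extractEmbedDivs` and the short sample string are hypothetical.

```javascript
const embedRegex = /<a data-type="embed" href="(http:\/\/[^"]+)">Link<\/a>/g;

// Hypothetical helper: returns one <div> string per embed link found.
function extractEmbedDivs(html) {
  // Each match is array-like: index 0 is the full match, index 1 the captured URL.
  return Array.from(html.matchAll(embedRegex), (m) => `<div data-href="${m[1]}"></div>`);
}

const shortSample =
  'a <a data-type="embed" href="http://google.com">Link</a> b ' +
  '<a data-type="embed" href="http://wired.com">Link</a> c';

console.log(extractEmbedDivs(shortSample));
// [ '<div data-href="http://google.com"></div>', '<div data-href="http://wired.com"></div>' ]
```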

Upvotes: 1

mplungjan

Reputation: 178061

This answer was written before I was told the string lives on a server, so it relies on the DOM.

Don't use regex to parse HTML

const data = '<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod <a data-type=\"embed\" href="http://google.com">Link</a> tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud <a data-type=\"embed\" href="http://wired.com">Link</a> exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur <a data-type=\"embed\" href="http://tested.com">Link</a> sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>';

const container = document.createElement("div");
container.innerHTML = data;
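// Select every element marked as an embed and swap it for an empty div.
[...container.querySelectorAll("[data-type=embed]")].forEach(link => {
  console.log(link);
  const div = document.createElement("div");
  // getAttribute returns the href as written; link.href would return the
  // resolved URL, e.g. with a trailing slash added.
  div.dataset.href = link.getAttribute("href");
  link.parentNode.replaceChild(div, link);
});
console.log(container.innerHTML);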

Upvotes: 2
