Kevin Walton

Reputation: 21

How to iterate/loop over a text string that contains HTML tags while ignoring those HTML tags in JavaScript?

I have some dynamic copy text, a string, which contains some HTML tags used for line breaks and styling. E.g.: "This is a sample <b>piece of text</b> string".

I want to iterate over the text so I can wrap each character in <span></span> tags in order to animate each character. However, I need to ignore the HTML tags, like the <b></b> tags above, so that any line breaks or styles associated with those tags remain in place.

I can run a simple for...of loop over the string to wrap every character in a span and append the result to the div I want, see below:

const textString = "This is a sample <b>piece of text</b> string.";
const root = document.querySelector("#root"); // look up the target once, not per character

for (const char of textString) {
    const span = document.createElement('span');
    span.textContent = char;
    root.appendChild(span);
}

This wraps EVERYTHING in that string in a span tag, so the HTML tags themselves are shown on screen as text, which is not what I need!

I know this regex: str.replace(/(<([^>]+)>)/ig, ''); will remove all the HTML tags from the string but, as I said, I need the tags to remain in place for styling purposes.

Any help on this would be most appreciated...🙏🏼

P.S.: The HTML tags must stay where they are, and I cannot just style the text afterwards, because the text strings come from a feed that already contains the tags. They are used to create line breaks and colour styles for portions of the text that are fed into a creative HTML5 banner ad.

Upvotes: 1

Views: 1357

Answers (2)

Kevin Walton

Reputation: 21

This is how I fixed it:

const isElement = node => node.nodeType === 1; // element
const isTextNode = node => node.nodeType === 3; // text 
const wrap = node => (node || '').split('').map(c => `<span>${c}</span>`).join('');

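function wrapNodes(nodeList) {
  // Text nodes get each character wrapped; elements are processed recursively.
  // Other node types (e.g. comments) map to undefined, which join() renders as ''.
  nodeList = Array.from(nodeList).map(node => {
    if (isTextNode(node))  {
      return wrap(node.textContent);
    }
    if (isElement(node)) {
      node.innerHTML = wrapNodes(node.childNodes);
      return node.outerHTML;
    }
  });
  
  return nodeList.join('');
}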

let textSample = "This is some copy text with a <b>Bold</b> sample to <em>test</em>."
const div = document.createElement('div');
div.innerHTML = textSample; 
div.innerHTML = wrapNodes(div.childNodes);

Upvotes: 0

tao

Reputation: 90227

Here's a helper function:

const letterize = el => el instanceof Text
  ? el.data
      .split('')
      .map(l => l === ' ' ? l : '<letter>' + l + '</letter>')
      .join('')
  : (
      el.innerHTML = ([...el.childNodes] || []).map(letterize).join(''),
      el.outerHTML
    );

which is called recursively while walking the DOM tree.

The assumption is: walking the DOM tree recursively will ultimately boil down to Text node instances (or to empty elements, like <br>). I haven't tested it thoroughly (not sure how it would work with <script>, <style> or <iframe> tags, or entire documents even, but those are edge cases). This should be enough for your case.

If dealing with a Text node, return the concatenated result of all contained chars (except spaces, which are returned verbatim), each wrapped in our tag wrapper.
If not a Text node, run letterize on all child nodes, assign their concatenated results to the current element's innerHTML, and return its outerHTML. If you also want the space characters wrapped, remove l === ' ' ? l : from the first condition.
Change the wrapper (<letter> above) to whatever you want, e.g. <span>.
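
One caveat with the Text node branch described above: el.data holds the decoded text, so a literal < or & in the copy is written back unescaped when the result is assigned to innerHTML, and split('') cuts emoji and other astral characters into two halves. Below is a minimal sketch of a more defensive wrapper. The names escapeHtml and wrapChars are made up for illustration and are not part of the code above:

```javascript
// Hypothetical helpers for illustration; these names are assumptions.
// Re-escape the characters that would otherwise be parsed as markup.
const escapeHtml = ch =>
  ch === '&' ? '&amp;' : ch === '<' ? '&lt;' : ch === '>' ? '&gt;' : ch;

// Wrap every non-space character of a plain-text string in <tag>...</tag>.
// Array.from iterates by code point, so surrogate pairs (e.g. emoji) stay whole.
const wrapChars = (text, tag = 'letter') =>
  Array.from(text)
    .map(ch => ch === ' ' ? ch : `<${tag}>${escapeHtml(ch)}</${tag}>`)
    .join('');
```

With this, wrapChars('a < b') gives <letter>a</letter> <letter>&lt;</letter> <letter>b</letter>. It still splits by code point, so multi-code-point sequences such as flag emoji would be broken apart.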

Test:

const doneClass = '2.1',
      tag = 'l',
      tagClass = '',
      letterize = (el) => el instanceof Text
  ? el.data
      .split('')
      .map(l => l === ' ' ? l : `<${tag}${tagClass ? ' class="' + tagClass + '"' : ''}>${l}</${tag}>`)
      .join('')
  : (!el.classList.contains(doneClass)
       ? el.innerHTML = (el.classList.add(doneClass), ([...el.childNodes] || []).map(letterize).join(''))
       : void 0,
      el.outerHTML
    );


[...document.querySelectorAll('.test')]
  .map(letterize);

// double parsing test:
[...document.querySelectorAll('.test')]
  .map(letterize);
l:hover {
  background-color: red;
  color: white;
  cursor: none;
}
body {
 font-family: monospace;
 font-size: 20px;
}
div > span { color: red; }
code {
  padding: 1rem;
  background-color: #272727;
  color: white;
  display: block;
  margin-bottom: 1rem;
  font-size: 14px;
}
<p class="test">This is a "<i>.test</i>"</p>
<code class="another test">And this is another one...</code>
<h1 class="test">I'll be parsed</h1>
<h4>I won't be parsed</h4>
<div style="border: 1px solid red;" class="test">
  I am a div. <span>And I am a <b>span</b>.</span>
</div>

The function in the test above is slightly different. It allows multiple runs without having to worry about what was letterized and what wasn't, in case you ever need such a version.

Big warning: the above (like anything replacing your DOM elements, e.g. setting innerHTML or outerHTML values) will remove all event listeners from the parsed DOM. If keeping event listeners is a requirement... it's a different level of complexity.
I won't cover it here.
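
That said, here is a rough sketch of the direction such a version could take: rather than re-assigning innerHTML, replace each text node in place and leave the element nodes untouched, so listeners attached to them survive. letterizeInPlace is a made-up name, whitespace is left unwrapped by assumption, and this is a sketch under those assumptions, not a hardened implementation:

```javascript
// nodeType values as defined by the DOM (Node.ELEMENT_NODE, Node.TEXT_NODE).
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

// Hypothetical helper: wraps letters by replacing text nodes only.
// Existing elements are never re-created, so their listeners stay attached.
function letterizeInPlace(root, tagName = 'span') {
  const doc = root.ownerDocument;
  // Snapshot the children first, because the live list is mutated below.
  for (const node of Array.from(root.childNodes)) {
    if (node.nodeType === TEXT_NODE) {
      const fragment = doc.createDocumentFragment();
      for (const ch of node.data) { // for...of iterates by code point
        if (/\s/.test(ch)) {
          fragment.appendChild(doc.createTextNode(ch)); // keep whitespace as text
        } else {
          const el = doc.createElement(tagName);
          el.textContent = ch;
          fragment.appendChild(el);
        }
      }
      root.replaceChild(fragment, node);
    } else if (node.nodeType === ELEMENT_NODE) {
      letterizeInPlace(node, tagName); // recurse, keeping the element itself
    }
  }
}
```

Calling letterizeInPlace(document.querySelector('.test')) would wrap the letters while a click handler previously attached to a nested <b> keeps firing, since that <b> is the same element as before. Running it twice would wrap the already wrapped letters again, so it would need a guard like the doneClass check above.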


Note: It doesn't have to be part of the page's DOM, you can do it all in memory (but it does use DOM API; the assumption is you'll run it in a browser):

const letterize = el => el instanceof Text
  ? el.data
      .split('')
      .map(l => l === ' ' ? l : '<span class="letter">' + l + '</span>')
      .join('')
  : (
      el.innerHTML = ([...el.childNodes] || []).map(letterize).join(''),
      el.outerHTML
    );

const letterizeHTML = input => {
  const d = document.createElement('div');
  d.innerHTML = input;
  letterize(d);
  return d.innerHTML;
}

console.log(
  letterizeHTML(`<p class="test">This is a <b>test</b>.</p>
<code class="another test">And this is another one...</code>`)
);
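
If there is no DOM at all (say, the feed is processed in a build step), a string-only variant is possible for simple, well-formed markup. The sketch below swaps in a different technique, regex tokenizing instead of DOM walking. The name letterizeString and both regexes are assumptions for illustration, and it will not cope with a > inside attribute values, comments, or <script>/<style> contents:

```javascript
// Hypothetical DOM-free variant; the name and regexes are illustrative assumptions.
const letterizeString = (html, tag = 'span') =>
  html
    // A capturing group in split() keeps the tags in the result:
    // even indexes hold text, odd indexes hold tags.
    .split(/(<[^>]*>)/)
    .map((part, i) => i % 2 === 1
      ? part // a tag: pass it through untouched
      // Text: wrap each entity (e.g. &amp;) or non-whitespace code point.
      : part.replace(/&#?\w+;|\S/gu, m => `<${tag}>${m}</${tag}>`))
    .join('');
```

For example, letterizeString('a <b>b&amp;</b>') gives <span>a</span> <b><span>b</span><span>&amp;</span></b>. Unlike letterize, this leaves every kind of whitespace unwrapped, not just spaces.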

And here's an animation test (couldn't help it :):

const initState = document.querySelector('.animation-container').innerHTML;

const letterize = el => el instanceof Text
  ? el.data
    .split('')
    .map(l => l === ' ' ? l : '<letter>' + l + '</letter>')
    .join('')
  : (
    el.innerHTML = ([...el.childNodes] || []).map(letterize).join(''),
    el.outerHTML
  );

const animate = () => {
  [...document.querySelectorAll('.test')].map(letterize);
  [...document.querySelectorAll('letter')].forEach((l, k) => {
    setTimeout(() => l.classList.add('hiccup'), 12 * k + 4.2e3);
    setTimeout(() => l.classList.add('on'), Math.random() * 2e3 + 10 * k)
  });
}

animate();

function replay() {
  document.querySelector('.animation-container').innerHTML = initState;
  requestAnimationFrame(animate);
}
* {
  box-sizing: border-box;
}

p {
  margin-bottom: 0;
}

.animation-container {
  perspective-origin: center;
  position: relative;
  min-height: 100vh;
  overflow: hidden;
  perspective: 0;
  font-family: sans-serif;
  padding: 2rem 2rem 10rem;
  font-family: monospace;
  font-size: 20px;
  overflow: hidden;
  max-width: 800px;
  margin: 0 auto;
  text-rendering: geometricPrecision;
  -webkit-font-smoothing: subpixel-antialiased;
}

letter {
  transform-style: preserve-3d;
  perspective-origin: center;
  display: inline-block;
  position: relative;
  opacity: .21;
  transform: perspective(240px) translate3d(20px, -20px, 240px);
  transition: transform 1.8s cubic-bezier(.5, 0, .3, 1), opacity .7s ease-out;
}

letter.on {
  opacity: .5;
  transform: perspective(240px) translate3d(0, 0, 0);
}

letter.hiccup {
  opacity: 1;
  animation: hiccup 8s cubic-bezier(.4, 0, .2, 1) -6s infinite;
}

div>span {
  color: red;
}

code {
  padding: 1rem;
  background-color: #272727;
  color: white;
  display: block;
  margin-bottom: 1rem;
  font-size: 14px;
}

@keyframes hiccup {
  0% {
    transform: perspective(240px) translate3d(0, 0, 0) scale(1);
  }
  96% {
    transform: perspective(240px) translate3d(0, 0, 0) scale(1);
  }
  98% {
    transform: perspective(240px) translate3d(0, 0, 0) scale(1.5);
  }
  100% {
    transform: perspective(240px) translate3d(0, 0, 0) scale(1);
  }
}
<div class="animation-container">
  <p class="test">This is a "<i>.test</i>"</p>
  <code class="another test">And this is another one...</code>
  <h1 class="test">I'll be parsed</h1>
  <h4>I won't be parsed</h4>
  <div style="border: 1px solid red; padding: 0 1rem;" class="test">
    I am a div. <span>And I am a <b>span</b>.</span>
    <p>Lorem ipsum dolor sit amet. The cat is sleeping. <br>Over.</p>
    <p>Sometimes a financial Bacardi Silver flies into a rage, but a line dancer always makes a pact with the Keystone light! When a jersey cow about some Heineken is familiar, a tipsy jersey cow makes a pact with the cantankerous Coors. <br><br> Hold my
      beer!</p>
  </div>
  <button onclick="replay()">Replay animation</button>
</div>

Final note: while it's interesting and useful to know how these things work and how you can code them from scratch, if you're serious about animations, I suggest you take a look at more serious tools, providing handy methods of tweening, chaining, pausing and/or reverting animations which, in the not so distant past, took a lot longer to code.
Check out the video about the API changes in GSAP 3 (and what it's capable of, with almost no code!), or get your mind blown on their showcase page.

P.S. I'm not affiliated with greensock in any way. I'm just passionate about animations and, for obvious reasons, I've used their lib quite a bit.


Oh, and while I enjoyed the challenge, I still think you should check out letterize.js. It's been around for a while, so it was tested in a lot of scenarios, unlike what I wrote here.

Upvotes: 1
