Datacrawler

Reputation: 2876

Replace HTML tags or retrieve text within them

I am using a platform that generates HTML output. I cannot change its format, because it is a private platform that produces the data in a fixed way. I create a query (say, Month and Number), and the platform returns the data as HTML like this:

<table>
<tbody>
    <tr>
        <td>
    <table>
    <tbody>
      <tr><td>Jan</td></tr>
      <tr><td>Feb</td></tr>
      <tr><td>Mar</td></tr>
    </tbody>
    </table>
    </td>
        <td>
    <table>
    <tbody>
      <tr><td>1</td></tr>
      <tr><td>2</td></tr>
      <tr><td>3</td></tr>
    </tbody>
    </table>
    </td>
    </tr>
</tbody>
</table>

and I want to convert it to:

<table>
    <tr>
        <td>Jan</td>
        <td>Feb</td>
        <td>Mar</td>
    </tr>
    <tr>
        <td>1</td>
        <td>2</td>
        <td>3</td>
    </tr>
</table>

I want to use a replace function, but one that operates on tags rather than on text, unlike the one below:

$(document).ready(function(){
    $("button").click(function(){
        $("p:first").replaceWith("I HAVE REPLACED IT");
    });
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.2.0/jquery.min.js"></script>

<p>REPLACE THAT.</p>
<p>Leave it as is.</p>

<button>Replace the first sentence</button>

So I then tried replacing the tags:

$('li').replaceWith(function () {
    return $('<td/>', {
        html: $(this).html()
    });
});
td {
    text-align: center;
    vertical-align: middle;
    line-height: 90px;
    background-color: #ccc;
}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<table>
    <tbody>
      <tr><li>Jan 2017</li></tr>
      <tr><li>Feb 2017</li></tr>
      <tr><li>Jun 2017</li></tr>
    </tbody>
    </table>

Then I tried stripping out the tags from scratch, but I do not know how to redistribute the values into the new table tags:

var regex = /<\/?([a-z])+\>/g;
    
var str =  '<table><tr><td>Jan</td><td>Feb</td><td>Mar</td></tr><tr><td>1</td><td>2</td><td>3</td></tr></table>';
var result = str.replace( regex, "");

document.write(result);

I think the ideal would be to edit the last snippet so that it replaces the tags and then outputs the result on the page (not in the console). But I am not sure how to make this work.

Another idea is to extract the text between the HTML tags using a regex and, for each match, place the result in a new table.

Please give me your thoughts. I might update the question. I have spent hours on this with no success.

UPD

I updated the snippet according to the first answer:

function myFunction() {
let htmlFromService = '<table><tbody><tr><td><table><tbody><tr><td>Jan</td></tr><tr><td>Feb</td></tr><tr><td>Mar</td></tr></tbody></table></td><td><table><tbody><tr><td>1</td></tr><tr><td>2</td></tr><tr><td>3</td></tr></tbody></table></td></tr></tbody></table>'

let fragment = window.HtmlFragment(htmlFromService);

let newTable = document.createElement('table');

Array
  .from(fragment.querySelectorAll('tbody tbody'))
  .forEach(tbody => {
    let tr = newTable.appendChild(document.createElement('tr'));
    Array
      .from(tbody.querySelectorAll('td'))
      .forEach(td => {
        tr.appendChild(td);
      });

  });

document.write(newTable.outerHTML);
}
<script src="https://npmcdn.com/[email protected]/lib/html-fragment.min.js"></script>
<head>

</head>

<body>

<script>

</script>

</body>

Upvotes: 0

Views: 65

Answers (1)

KevBot

Reputation: 18888

When doing DOM manipulations, always try to use the DOM query APIs instead of regular expressions. In this solution, I use a DocumentFragment to get a queryable DOM.

I did this by converting the string of HTML returned from the service into a DocumentFragment. From there, I was able to query into the fragment, get the elements I wanted, and add them to a new table.

let htmlFromService = `<table><tbody><tr><td><table><tbody><tr><td>Jan</td></tr><tr><td>Feb</td></tr><tr><td>Mar</td></tr></tbody></table></td><td><table><tbody><tr><td>1</td></tr><tr><td>2</td></tr><tr><td>3</td></tr></tbody></table></td></tr></tbody></table>`

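// Parse the HTML string into a DocumentFragment that can be queried.
let fragment = window.HtmlFragment(htmlFromService);

// The flattened table that will receive one row per inner table.
let newTable = document.createElement('table');

// 'tbody tbody' matches only the tbody elements of the nested (inner) tables.
Array
  .from(fragment.querySelectorAll('tbody tbody'))
  .forEach(tbody => {
    // One new row per inner table (that is, per original column).
    let tr = newTable.appendChild(document.createElement('tr'));
    Array
      .from(tbody.querySelectorAll('td'))
      .forEach(td => {
        // appendChild moves the existing cell out of the fragment into the new row.
        tr.appendChild(td);
      });

  });

console.log(newTable.outerHTML);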
<script src="https://npmcdn.com/[email protected]/lib/html-fragment.min.js"></script>

Disclaimer: to convert the HTML to a document fragment, I used html-fragment, an npm package I wrote myself a while back.
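
If you would rather not load an extra package, the same idea can probably be sketched with the browser's built-in DOMParser. This is only a sketch under some assumptions: the function names columnsToRowsHtml and flattenNestedTable are made up for illustration, each inner table is assumed to hold one column of plain-text cells, and flattenNestedTable only works where DOMParser exists (a browser, not plain Node).

```javascript
// Hypothetical helper (name is illustrative): takes an array of columns,
// each an array of cell texts, and returns the HTML of a table that has
// one row per column.
function columnsToRowsHtml(columns) {
  // Escape characters that would otherwise be read as markup.
  const escapeHtml = (s) =>
    s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');

  const rows = columns.map(
    (cells) =>
      '<tr>' + cells.map((c) => `<td>${escapeHtml(c)}</td>`).join('') + '</tr>'
  );
  return `<table>${rows.join('')}</table>`;
}

// Hypothetical browser-only wrapper: parses the service's HTML with the
// standard DOMParser and collects the cell texts of every nested table.
function flattenNestedTable(html) {
  const doc = new DOMParser().parseFromString(html, 'text/html');
  // 'table table' matches only tables nested inside another table.
  const columns = Array.from(doc.querySelectorAll('table table'), (inner) =>
    Array.from(inner.querySelectorAll('td'), (td) => td.textContent.trim())
  );
  return columnsToRowsHtml(columns);
}
```

To show the result on the page rather than in the console, something like document.body.insertAdjacentHTML('beforeend', flattenNestedTable(htmlFromService)) should work once the body exists, and it avoids document.write. Reading textContent means any markup inside the cells is dropped; if the cells can contain markup, moving the td elements as in the snippet above is the safer choice.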

Upvotes: 2
