demouser123

Reputation: 4264

Scraping table to get specific data using puppeteer

<tbody class="ant-table-tbody">
  <tr class="ant-table-row ant-table-row-level-0">
    <td class>
      <span class="ant-table-row-indent indent-level-0" style="padding-left: 0px;"></span>
      "Bombay"
      </td>
    <td class>
       <label class="ant-checkbox-wrapper">
         <span class="ant-checkbox ant-checkbox-checked">
           <input type="checkbox" class="ant-checkbox-input" value="on">
         </span>
       </label>
     </td>
    <td class>
       <div>
         <i class="anticon anticon-delete">
           ::before
         </i>
       </div>
     </td>
  </tr>
  <tr class="ant-table-row ant-table-row-level-0">...</tr>
  <tr class="ant-table-row ant-table-row-level-0">...</tr>
  <tr class="ant-table-row ant-table-row-level-0">...</tr>
  <tr class="ant-table-row ant-table-row-level-0">...</tr>
  <tr class="ant-table-row ant-table-row-level-0">...</tr>
  <tr class="ant-table-row ant-table-row-level-0">...</tr>
</tbody>

I have this table structure, where each <tr> row contains three separate <td> cells. I am trying to extract only specific data from it using puppeteer.

Right now I can get all the text (inside every tr and td) using this:

    const data = await page.evaluate(() => {
        const tds = Array.from(document.querySelectorAll('tbody tr td'));
        return tds.map(td => td.innerText);
    });
    console.log(data);

But this returns all of the text, most of which I don't need; I only want specific data. How can I drill down into specific tags using puppeteer?

Upvotes: 2

Views: 7303

Answers (1)

Grant Miller

Reputation: 29047

You can use page.evaluate() to obtain the text content of the first column, and then you can use page.$$() to count the number of span elements in the second column containing the class ant-checkbox-checked:

let first_column_text = await page.evaluate(() => Array.from(document.querySelectorAll('.ant-table-tbody > .ant-table-row > td:first-child'), element => element.textContent.trim()));
let second_column_checked_count = (await page.$$('.ant-table-tbody > .ant-table-row > td:nth-child(2) > .ant-checkbox-wrapper > span.ant-checkbox-checked')).length;
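
If you would rather have the two values paired up per row, something along these lines might work. This is only a sketch: the helper names extractRows and scrapeTable and the name/checked keys are made up for illustration, and page is assumed to be an already-open Puppeteer Page showing the table from the question. It relies on page.$$eval(), which passes the array of matched elements to a function that runs in the browser context.

```javascript
// Runs in the browser context: converts an array of <tr> elements into plain objects.
// It must not use variables from the Node.js scope, because Puppeteer serializes
// the function before sending it to the page.
function extractRows(rows) {
  return rows.map((row) => {
    // First column: the visible text of the cell.
    const firstCell = row.querySelector('td:first-child');
    // Second column: Ant Design marks a ticked box with the ant-checkbox-checked class.
    const checkedBox = row.querySelector('td:nth-child(2) .ant-checkbox-checked');
    return {
      name: firstCell ? firstCell.textContent.trim() : '',
      checked: checkedBox !== null,
    };
  });
}

// `page` is assumed to be a Puppeteer Page that has already navigated to the table.
async function scrapeTable(page) {
  return page.$$eval('.ant-table-tbody > .ant-table-row', extractRows);
}
```

Calling const rows = await scrapeTable(page); should then give one object per table row, so the checked count can be derived with rows.filter(r => r.checked).length instead of a second query.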

Upvotes: 5
