Reputation: 11

Scrape a dynamically changing table row using Python, Selenium and XPath

I am trying to use Selenium and XPath in Python to scrape the "SIRET" row from a table. I have tried several XPath expressions without success. One problem is that the position of the class="reportRow" elements changes dynamically, so the row can't be scraped by its position number. Can the "SIRET" row and the values of its td sub-elements be located by the "SIRET" text, or in some other way?

These are the manual steps I follow when I access the site:

The site consists only of the root domain. After logging in, I enter a search criterion, which opens a page where I click a link that opens a popup window with a table. The table has 4 rows and 8 columns: the first row contains the column names, and the other 3 rows contain data like the "SIRET" row. The position of those 3 rows changes regularly, depending on the data received from a specific server. That is why I want to scrape that row and its values by the "SIRET" text.

My final scraped data should look like this: SIRET 646 90 0.2% $2.94 1.03 0.07 4.52.

Thank you very much for your input.

<div class="table_container">
<table>
    <tbody>
        <tr class="reportHead">.....</tr></tbody>
    <tbody>
        <tr class="reportRow  ">....</tr> 
        <tr class="reportRow  ">....</tr>
        <tr class="reportRow  ">
            <td data-actual="SIRET" class="reportKeyword">SIRET</td>
            <td class="td2">646</td>
            <td class="td1">90</td>
            <td class="rcr">0.2%</td>
            <td class="td1">$2.94</td>
            <td class="td1">1.03</td>
            <td class="td1">0.07</td>
            <td class="td1 rctl">4.52</td>
        </tr>
    </tbody>
    <tfoot style="display: none;">....</tfoot>
</table>
</div>

Upvotes: 1

Answers (3)

fpsthirty

Reputation: 185

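Strange. As a matter of fact, the solution is not that intricate. Select the "SIRET" cell, step up to its row, and take all of that row's cells (note find_elements, plural, since find_element would return only the first cell):

driver.find_elements_by_xpath("//td[@data-actual='SIRET']/../td")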
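
The XPath above can be checked without a browser. The following is a minimal sketch, assuming the popup's markup matches the snippet in the question; it uses the standard library's ElementTree rather than Selenium, and the `OTHER` row, the `SAMPLE` string and the `row_cells` helper are hypothetical names introduced only for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's table. The first data row is a hypothetical
# placeholder, since the original markup elides the other rows.
SAMPLE = """
<table>
  <tbody>
    <tr class="reportRow">
      <td data-actual="OTHER" class="reportKeyword">OTHER</td>
      <td class="td2">1</td>
    </tr>
    <tr class="reportRow">
      <td data-actual="SIRET" class="reportKeyword">SIRET</td>
      <td class="td2">646</td>
      <td class="td1">90</td>
      <td class="rcr">0.2%</td>
      <td class="td1">$2.94</td>
      <td class="td1">1.03</td>
      <td class="td1">0.07</td>
      <td class="td1 rctl">4.52</td>
    </tr>
  </tbody>
</table>
"""


def row_cells(markup, keyword):
    """Return the cell texts of the row whose keyword cell matches `keyword`."""
    root = ET.fromstring(markup)
    # Same shape as the XPath above; ElementTree needs the leading "." because
    # it only accepts paths relative to the element being searched.
    xpath = f".//td[@data-actual='{keyword}']/../td"
    return [(td.text or "").strip() for td in root.findall(xpath)]


cells = row_cells(SAMPLE, "SIRET")
print(" ".join(cells))  # SIRET 646 90 0.2% $2.94 1.03 0.07 4.52
```

Because the row is found through its data-actual attribute, the result does not depend on where the row sits in the table.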

Upvotes: 0

undetected Selenium

Reputation: 193338

If I have understood the question correctly, you are trying to get the string "SIRET" from the <td> node, which changes dynamically. To do that, you can use the following line of code:

print(driver.find_element_by_xpath("//td[@class='reportKeyword']").get_attribute("innerHTML"))
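
One caveat may be worth checking: if every data row starts with a reportKeyword cell (an assumption, since the question elides the other rows), a single-element lookup on the class alone returns whichever row happens to come first. The sketch below illustrates this offline with the standard library's ElementTree instead of Selenium; the `FIRST` row and the `ROWS` string are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: assumes each data row begins with a reportKeyword cell.
ROWS = """
<tbody>
  <tr class="reportRow">
    <td data-actual="FIRST" class="reportKeyword">FIRST</td>
  </tr>
  <tr class="reportRow">
    <td data-actual="SIRET" class="reportKeyword">SIRET</td>
  </tr>
</tbody>
"""

root = ET.fromstring(ROWS)

# Matching on the class alone picks the first keyword cell in document order.
by_class = root.find(".//td[@class='reportKeyword']")

# Matching on the data-actual attribute picks the SIRET cell wherever it sits.
by_attribute = root.find(".//td[@data-actual='SIRET']")

print(by_class.text)      # FIRST
print(by_attribute.text)  # SIRET
```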

Upvotes: 0

iamsankalp89

Reputation: 4749

You can use an XPath like this:

SIRET = driver.find_element_by_xpath("//td[@data-actual='SIRET']")

Then you can use the .text property to get its text.

If the data changes dynamically, you have to use:

SIRET = driver.find_element_by_xpath("//td[@class='reportKeyword']")
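
To get the whole row as the question asks, one option is to match the row by the visible text of its keyword cell. This is only a sketch, assuming a Selenium 4 style driver whose find_elements(by, value) accepts the string "xpath" (the value of By.XPATH); `scrape_row` is a hypothetical helper name.

```python
def scrape_row(driver, keyword):
    """Return the cell texts of the row containing a cell that reads `keyword`."""
    # "xpath" is the string value of selenium's By.XPATH, so no import is
    # needed here; pass By.XPATH instead if you prefer the named constant.
    xpath = f"//tr[td[normalize-space()='{keyword}']]/td"
    cells = driver.find_elements("xpath", xpath)
    # .text is the rendered text of each <td>; strip stray whitespace.
    return [cell.text.strip() for cell in cells]
```

With a driver that is already focused on the popup (which may first require driver.switch_to.window with the popup's handle), print(" ".join(scrape_row(driver, "SIRET"))) should produce a line in the format the question asks for.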

Upvotes: 2
