user9102437

Reputation: 742

Selenium page source doesn't match the actual one

I was trying to parse tweets (say, from https://twitter.com/Tesla), but I ran into a problem: the source I download with html = browser.page_source does not match what I see when inspecting the page in DevTools (Ctrl+Shift+I). It contains some of the tweets, but not nearly all of them. Moreover, when I save the source to a file and open it in Chrome, I get something incomprehensible. I have worked with Selenium before and have never run into such a problem. Is there some other function for getting the source?
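
A minimal sketch of the approach described above, assuming a Selenium-style driver object that exposes page_source and execute_script. The helper name collect_page_snapshots, the scroll count, and the pause are hypothetical and not taken from the original code. It only illustrates that page_source reflects the DOM at the moment it is read, so a page that keeps loading items while scrolling may need to be read more than once:

```python
import time


def collect_page_snapshots(driver, scrolls=3, pause=1.0):
    """Return one page_source snapshot per scroll step.

    `driver` is assumed to behave like a Selenium WebDriver: it must
    provide a `page_source` attribute and an `execute_script` method.
    """
    # DOM as it is right after the page has been opened.
    snapshots = [driver.page_source]
    for _ in range(scrolls):
        # Scroll to the bottom so the page gets a chance to load more items.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        # Crude fixed wait; the value is an assumption, not a recommendation.
        time.sleep(pause)
        # Read the DOM again; it may differ from the previous snapshot.
        snapshots.append(driver.page_source)
    return snapshots
```

With a real browser this would be called as collect_page_snapshots(browser) after browser.get(...). Comparing the snapshots shows whether the page content changes as it is scrolled.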

By the way, I know that Twitter provides an API, but they declined my request without giving any reasons even though I do not plan to do anything against their terms.

Upvotes: 0

Views: 341

Answers (1)

Justin Lambert

Reputation: 978

Hey, this is one of the worst practices in Selenium.

For multiple reasons, logging into sites like Gmail and Facebook using WebDriver is not recommended. Aside from being against the usage terms for these sites (where you risk having the account shut down), it is slow and unreliable.

The ideal practice is to use the APIs that email providers offer, or in the case of Facebook the developer tools service which exposes an API for creating test accounts, friends and so forth. Although using an API might seem like a bit of extra hard work, you will be paid back in speed, reliability, and stability. The API is also unlikely to change, whereas webpages and HTML locators change often and require you to update your test framework.

Logging in to third-party sites using WebDriver at any point in your test increases the risk of the test failing, because it makes the test longer. As a general rule of thumb, longer tests are more fragile and unreliable.

Upvotes: 1
