Ebad Ali

Reputation: 600

Web Scraping: request not returning complete content of the webpage

I am writing a web scraper that saves the names of all cryptocurrencies listed in a table on a website. My script fetches the page with requests and then parses the response into an HTML object with the BeautifulSoup library. The problem is that the response does not appear to contain the complete content of the page: the output starts partway through the table and skips the rows above that point.

When I debug the code, the response object contains all the data from the page, but when I print it, the output only shows data from a certain point in the page onward.

Here is the code:

import requests
from bs4 import BeautifulSoup

response = requests.get("https://coinmarketcap.com/all/views/all", headers={'User-Agent': 'Mozilla/5.0'})
print(response.text)

soup = BeautifulSoup(response.text, 'html.parser')

results = soup.find_all('table', attrs={'id': 'currencies-all'})

It would be really helpful if someone could tell me what I am doing wrong, because I have not been able to find the issue.

Upvotes: 2

Views: 1060

Answers (2)

bigbounty

Reputation: 17368

You are missing one step. The table rows are nested inside the table's tbody element, so you need to extract the table body first and then the cells from its rows. I use the 'lxml' parser here.

import requests
from bs4 import BeautifulSoup

response = requests.get("https://coinmarketcap.com/all/views/all", headers={'User-Agent': 'Mozilla/5.0'})
response.raise_for_status()  # stop early if the request failed

# The 'lxml' parser needs the lxml package installed
soup = BeautifulSoup(response.text, 'lxml')
# Take the table body first, then the symbol cell of every row
results = soup.find('tbody')
curr_symbols = [x.text for x in results.find_all('td', attrs={'class': 'text-left col-symbol'})]
print(curr_symbols)
print(len(curr_symbols))  # 1878 when this answer was written
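
The same extraction can be tried offline on a small HTML snippet, which makes it easier to see that parsing works regardless of how much the console displays. This is only a sketch: SAMPLE_HTML and extract_symbols are made-up names for illustration, the snippet merely imitates the table id and cell class used above, and the live page's markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page; only the id and class names
# are taken from the code above.
SAMPLE_HTML = """
<table id="currencies-all">
  <tbody>
    <tr><td class="text-left col-symbol">BTC</td><td>Bitcoin</td></tr>
    <tr><td class="text-left col-symbol">ETH</td><td>Ethereum</td></tr>
  </tbody>
</table>
"""


def extract_symbols(html):
    """Return the text of every symbol cell in the currencies table."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="currencies-all")
    if table is None:
        # The table is missing, e.g. the markup changed or the request was blocked
        return []
    return [td.get_text(strip=True) for td in table.select("td.col-symbol")]


print(extract_symbols(SAMPLE_HTML))
```

Returning an empty list when the table is missing avoids the AttributeError that a chained find_all call on None would raise.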

Upvotes: 0

Austin Mackillop

Reputation: 1245

Is it possible that you are hitting the buffer limit of your IDE's console?

In Spyder, the default is 500 lines, so you will only see 500 lines of the page source in the console. Try increasing this limit to see if that solves your issue.

In Spyder (Windows), the setting is under Tools > Preferences > IPython Console > Buffer (at the bottom).

I increased my buffer to 4000, and it still wasn't enough to fit the entire page, but it did reveal more lines.
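
One way to check whether the console buffer is the cause is to write the response to a file instead of printing it. The sketch below is a minimal illustration under that assumption: dump_page, console_would_truncate, and the file name page.html are hypothetical names, not part of requests or Spyder, and the 500-line figure is the default quoted above.

```python
from pathlib import Path


def dump_page(text, path="page.html"):
    """Write the full page text to a file and return how many lines it has."""
    Path(path).write_text(text, encoding="utf-8")
    return len(text.splitlines())


def console_would_truncate(text, buffer_lines=500):
    """Return True if the text has more lines than the console buffer holds."""
    return len(text.splitlines()) > buffer_lines
```

In the question's script, this would mean replacing print(response.text) with dump_page(response.text) and opening the saved file in an editor to confirm that the whole table is there.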

Upvotes: 2
