Reputation: 11
So I'm trying to scrape the company names and stock prices from the "Most Active" section of the URL in the code below and store them all in a list. I was thinking of using a loop so it grabs all of them, but for some reason it's only getting the first company and its stock price. Regardless of whether I search for 'tbody' or not, it returns the same single company and price instead of looping through that entire section of the webpage. Any help would be greatly appreciated.
import requests
from bs4 import BeautifulSoup

stock_list = []
url = 'https://markets.on.nytimes.com/research/markets/overview/overview.asp'
response = requests.get(url)
if response.status_code != 200:
    print(response.status_code)
results_page = BeautifulSoup(response.content, 'lxml')
key_data = results_page.find('table', class_="stock-spotlight-table", id="summ_vol+")
key_data2 = key_data.find_all('tbody')

def pull_active(url):
    for i in key_data2:
        label = i.find('a', class_='truncateMeTo1').get_text()
        value = i.find('td', class_='colPrimary').get_text()
        stock_list.append((label, value))
    print(stock_list)

pull_active(url)
Upvotes: 1
Views: 212
Reputation: 11
Building off of the answer by @Barmar, I was able to get a slightly different solution to this as well.
def pull_active(url):
    for i in key_data2:
        for td in i.find_all('td', class_='colText'):
            label = td.find('a', class_='truncateMeTo1').get_text()
            value = i.find('td', class_='colPrimary').get_text()
            stock_list.append((label, value))
    print(stock_list)
Upvotes: 0
Reputation: 696
import requests
from bs4 import BeautifulSoup

stock_list = []
url = 'https://markets.on.nytimes.com/research/markets/overview/overview.asp'
response = requests.get(url)
if response.status_code != 200:
    print(response.status_code)
results_page = BeautifulSoup(response.content, 'lxml')
key_data = results_page.find('table', class_="stock-spotlight-table", id="summ_vol+")
# Grab one element per row of the table, not the single <tbody> wrapper
key_data2 = key_data.find('tbody').find_all('tr')

def pull_active(url):
    for i in key_data2:
        label = i.find('a', class_='truncateMeTo1').get_text()
        value = i.find('td', class_='colPrimary').get_text()
        stock_list.append((label, value))
    print(stock_list)

pull_active(url)
key_data2=key_data.find_all('tbody')
This is the line that is causing issues in your solution. Each table row (tr) represents one item, so you need to find all the rows and iterate through them.
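The difference is easy to see on a toy snippet. This is made-up markup (not the real NYT page), kept just close enough to the class names in the question to show why find_all('tbody') gives one element while find('tbody').find_all('tr') gives one per row:

from bs4 import BeautifulSoup

html = """
<table class="stock-spotlight-table">
  <tbody>
    <tr><td class="colText"><a class="truncateMeTo1">Acme Corp</a></td>
        <td class="colPrimary">12.34</td></tr>
    <tr><td class="colText"><a class="truncateMeTo1">Globex Inc</a></td>
        <td class="colPrimary">56.78</td></tr>
  </tbody>
</table>
"""
soup = BeautifulSoup(html, 'html.parser')
table = soup.find('table', class_='stock-spotlight-table')

tbodies = table.find_all('tbody')          # the whole table body: one element
rows = table.find('tbody').find_all('tr')  # one element per <tr>

print(len(tbodies))  # 1 -> the original loop body only ran once
print(len(rows))     # 2 -> this loop body runs once per company

stock_list = [(tr.find('a', class_='truncateMeTo1').get_text(),
               tr.find('td', class_='colPrimary').get_text())
              for tr in rows]
print(stock_list)  # [('Acme Corp', '12.34'), ('Globex Inc', '56.78')]

Since each tr is scoped to a single company, find() inside the row can't accidentally pick up another row's price.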
Upvotes: 0
Reputation: 780724
You're looping over all the tables, but not looping over all the items in each table.
def pull_active(url):
    for i in key_data2:
        for td in i.find_all('td', class_='colText'):
            label = td.find('a', class_='truncateMeTo1')
            value = td.find('td', class_='colPrimary')
            if label and value:
                stock_list.append((label.get_text(), value.get_text()))
    print(stock_list)
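The guard before appending matters because find() returns None when nothing matches, and calling .get_text() on None raises an AttributeError. A minimal sketch with made-up markup:

from bs4 import BeautifulSoup

# A cell with no link inside it, like a row the selector doesn't fully match
row = BeautifulSoup('<tr><td class="colText">no link here</td></tr>',
                    'html.parser')
label = row.find('a', class_='truncateMeTo1')  # no <a> -> None
print(label)  # None

# Guarded access only appends when the lookup succeeded,
# instead of crashing with AttributeError on label.get_text()
stock_list = []
if label is not None:
    stock_list.append(label.get_text())
print(stock_list)  # []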
Upvotes: 1