Reputation: 27
Why do I only get the stats from the last player in PLAYER_NAME?
I would like to get the stats from all the players in PLAYER_NAME.
import csv
import requests
from bs4 import BeautifulSoup
import urllib

PLAYER_NAME = ["andy-murray/mc10", "rafael-nadal/n409"]
URL_PATTERN = 'http://www.atpworldtour.com/en/players/{}/player-stats?year=0&surfaceType=clay'

for item in zip (PLAYER_NAME):
    url = URL_PATTERN.format(item)
    response = requests.get(url)
    html = response.content
    soup = BeautifulSoup(html)
    table = soup.find('div', attrs={'class': 'mega-table-wrapper'})
    list_of_rows = []
    for row in table.findAll('tr'):
        list_of_cells = []
        for cell in row.findAll('td'):
            text = (cell.text.encode("utf-8").strip())
            list_of_cells.append(text)
        list_of_rows.append(list_of_cells)

outfile = open("./tennis.csv", "wb")
writer = csv.writer(outfile)
writer.writerow(["Name", "Stat"])
writer.writerows(list_of_rows)
Upvotes: 0
Views: 132
Reputation: 4862
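As mentioned in the comments, you're recreating list_of_rows on every iteration of the loop, so only the last player's rows are left by the time the file is written. To fix that, create it once, before the for loop, so that every player's rows end up in the same list. You can keep appending one row at a time, or extend the list with all of a player's rows at once.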
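To see the effect in isolation, here is a minimal sketch that uses made-up stand-in data instead of the scraped table. The names fake_rows, collect_inside and collect_outside are hypothetical and exist only for this illustration.

```python
# Stand-in data: one list of rows per player (made-up values, not real stats).
fake_rows = {
    "player-a": [["Aces", "1"], ["Double Faults", "2"]],
    "player-b": [["Aces", "3"], ["Double Faults", "4"]],
}


def collect_inside(players):
    """Reset the list on every iteration, as the question's code does."""
    for name in players:
        rows = []  # recreated for each player
        for row in fake_rows[name]:
            rows.append(row)
    return rows  # only the last player's rows survive


def collect_outside(players):
    """Create the list once, so rows from every player accumulate."""
    rows = []
    for name in players:
        for row in fake_rows[name]:
            rows.append(row)
    return rows


players = ["player-a", "player-b"]
print(len(collect_inside(players)))   # 2
print(len(collect_outside(players)))  # 4
```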
On a side note, you have a few other issues with your code:

zip is redundant, and it actually wraps each name in a one-element tuple, which then gets formatted incorrectly into the URL. You just want to iterate over PLAYER_NAME directly, and while you're at it, maybe rename that to PLAYER_NAMES (since it's a list of names).

The placeholder should name the positional argument passed to format, in this case {0}. Python 2.6 requires the explicit index; from 2.7 on, a bare {} also works.
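A quick way to check the zip problem without hitting the site: the sketch below uses a shortened, hypothetical pattern string rather than the real URL, and the variable names are only for illustration.

```python
names = ["andy-murray/mc10", "rafael-nadal/n409"]
pattern = "players/{0}/player-stats"  # shortened stand-in for URL_PATTERN

# zip() over a single list yields one-element tuples, not the strings themselves.
zipped = list(zip(names))

bad_url = pattern.format(zipped[0])  # formats the tuple, parentheses and all
good_url = pattern.format(names[0])  # formats the plain string

print(bad_url)   # players/('andy-murray/mc10',)/player-stats
print(good_url)  # players/andy-murray/mc10/player-stats
```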
PLAYER_NAMES = ["andy-murray/mc10", "rafael-nadal/n409"]
URL_PATTERN = 'http://www.atpworldtour.com/en/players/{0}/player-stats?year=0&surfaceType=clay'

list_of_rows = []  # created once, so rows from every player accumulate
for item in PLAYER_NAMES:
    url = URL_PATTERN.format(item)
    response = requests.get(url)
    html = response.content
    soup = BeautifulSoup(html)
    table = soup.find('div', attrs={'class': 'mega-table-wrapper'})
    # for row in table.findAll('tr'):
    #     list_of_cells = []
    #     for cell in row.findAll('td'):
    #         text = (cell.text.encode("utf-8").strip())
    #         list_of_cells.append(text)
    #     list_of_rows.append(list_of_cells)  # one inner list per table row
    # Incidentally, the for loop above could also be written as:
    list_of_rows += [
        [cell.text.encode("utf-8").strip() for cell in row.findAll('td')]
        for row in table.findAll('tr')
    ]
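The snippet above stops before the CSV is written. Writing once, after the loop, might look roughly like the sketch below. It assumes Python 3, where the file would be opened in text mode with newline="" instead of the question's "wb". The helper write_rows and the sample rows are hypothetical, and an in-memory buffer stands in for the real file so the sketch runs without touching the disk.

```python
import csv
import io


def write_rows(outfile, rows):
    """Write the header plus all accumulated rows to an open text file."""
    writer = csv.writer(outfile)
    writer.writerow(["Name", "Stat"])
    writer.writerows(rows)


# Made-up rows standing in for the scraped cells of two players.
list_of_rows = []
list_of_rows += [["Aces", "10"], ["Double Faults", "3"]]  # first player
list_of_rows += [["Aces", "7"], ["Double Faults", "5"]]   # second player

# In-memory stand-in for: open("./tennis.csv", "w", newline="")
buffer = io.StringIO()
write_rows(buffer, list_of_rows)
print(buffer.getvalue())
```

With a real file, wrapping the open call in a with block also makes sure the file is closed after writing.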
Upvotes: 2