Reputation: 5
I'm trying to scrape a site, but I don't understand why col[5] raises an "index out of range" error.
Any idea how I'm meant to add more columns to the DataFrame? At the moment I'm stuck with only one.
import requests
import pandas as pd
from bs4 import BeautifulSoup

url = "https://www.metasrc.com/5v5/6.24/stats"
html_data = requests.get(url).text
soup = BeautifulSoup(html_data, "html5lib")
tables = soup.find_all("tbody")
df = pd.DataFrame(columns=["champ", "winrate"])
for k in tables[0].find_all("tr"):
    col = k.find_all("td")
    champ = col[0]
    winrate = col[5]  # this is the line that fails with the index error
    df = df.append({"champ": champ, "winrate": winrate}, ignore_index=True)
Upvotes: 0
Views: 81
Reputation: 153460
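On the index error itself: I haven't inspected the page's markup, so this is an assumption, but the usual cause is that some tr rows inside the tbody hold fewer than six td cells (spacer or ad rows, for example), so col[5] does not exist for them. A minimal sketch of guarding against that, using a made-up HTML snippet in place of the real page and the built-in "html.parser" so it runs without a network call. It also collects plain dicts and builds the DataFrame once, because DataFrame.append was removed in pandas 2.0, and it calls get_text() so the frame stores text rather than Tag objects.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical stand-in for the real page: the middle row has only one cell.
html = """
<table><tbody>
<tr><td>Ahri</td><td>MID</td><td>S</td><td>78.27</td><td>NEW</td><td>53.22%</td></tr>
<tr><td colspan="6">advertisement</td></tr>
<tr><td>Akali</td><td>TOP</td><td>B</td><td>63.56</td><td>NEW</td><td>48.43%</td></tr>
</tbody></table>
"""

soup = BeautifulSoup(html, "html.parser")
rows = []
for tr in soup.find("tbody").find_all("tr"):
    cells = tr.find_all("td")
    if len(cells) < 6:
        # Not a data row: cells[5] would raise IndexError here.
        continue
    rows.append({
        "champ": cells[0].get_text(strip=True),
        "winrate": cells[5].get_text(strip=True),
    })

# Build the frame once from the collected rows.
df = pd.DataFrame(rows, columns=["champ", "winrate"])
print(df)
```

Printing len(col) for each row of the real page is a quick way to check whether short rows are actually the problem.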
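A simpler approach is pd.read_html, which parses every table on the page and returns a list of DataFrames, so there is no need to walk the rows by hand.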
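import requests
import pandas as pd
from io import StringIO

url = "https://www.metasrc.com/5v5/6.24/stats"
html_data = requests.get(url).text
# read_html returns one DataFrame per table; wrapping the string in StringIO
# avoids the literal-HTML deprecation warning in pandas 2.1+.
dfs = pd.read_html(StringIO(html_data))
print(dfs[0].head())
print(dfs[1].head())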
Output:
           Name    Role        Tier  Score   Trend   Win %  Role %  Pick %  Ban %   KDA
0  AatroxAatrox     TOP    Weak / C  61.49  999NEW  46.93%  64.39%   0.62%  0.71%  1.97
1  AatroxAatrox  JUNGLE    Weak / C  59.49  999NEW  44.92%  30.61%   0.29%  0.71%  2.19
2      AhriAhri     MID  Strong / S  78.27  999NEW  53.22%  97.08%  13.53%  1.79%  2.67
3    AkaliAkali     TOP    Fair / B  63.56  999NEW  48.43%  35.00%   1.21%  1.83%  1.95
4    AkaliAkali     MID    Fair / B  63.19  999NEW  47.50%  56.18%   1.95%  1.83%  2.06

         0      1
0  Lee Sin  91.63
1    Vayne  87.10
2  Caitlyn  84.73
3   Ezreal  83.54
4   Thresh  82.65
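
If you only need the two columns from the question, the first table can be trimmed down. A minimal sketch, assuming the column labels printed above ("Name" and "Win %"). The helper names champ_winrate and undouble are made up for illustration, and the sample rows are copied from the output instead of being fetched. The names come through doubled (AhriAhri), so the helper keeps one half when both halves match.

```python
import pandas as pd


def undouble(name: str) -> str:
    """Return half of the string when it is the same text repeated twice."""
    half = len(name) // 2
    if len(name) % 2 == 0 and name[:half] == name[half:]:
        return name[:half]
    return name


def champ_winrate(stats: pd.DataFrame) -> pd.DataFrame:
    """Reduce the full stats table to champion name and numeric win rate."""
    out = pd.DataFrame()
    out["champ"] = stats["Name"].map(undouble)
    # "53.22%" -> 53.22, so the column can be sorted and averaged.
    out["winrate"] = stats["Win %"].str.rstrip("%").astype(float)
    return out


# Sample rows copied from the printed output above.
sample = pd.DataFrame({
    "Name": ["AatroxAatrox", "AhriAhri", "AkaliAkali"],
    "Role": ["TOP", "MID", "TOP"],
    "Win %": ["46.93%", "53.22%", "48.43%"],
})
result = champ_winrate(sample)
print(result)
```

With the real data you would call champ_winrate(dfs[0]) instead of passing the sample frame.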
Upvotes: 1