Matt S

Reputation: 5

IndexError: list index out of range - Beautiful Soup

I'm trying to scrape a site, but I don't understand why col[5] is out of range. Also, how can I add more columns to the DataFrame? At the moment I'm stuck with only one.

import requests
import pandas as pd
from bs4 import BeautifulSoup

url = "https://www.metasrc.com/5v5/6.24/stats"
html_data = requests.get(url).text

soup = BeautifulSoup(html_data, "html5lib")
tables = soup.find_all("tbody")

df = pd.DataFrame(columns=["champ", "winrate"])

for k in tables[0].find_all("tr"):
    col = k.find_all("td")
    champ = col[0]
    winrate = col[5]  # this line raises IndexError: list index out of range
    df = df.append({"champ": champ, "winrate": winrate}, ignore_index=True)

Upvotes: 0

Views: 81

Answers (1)

Scott Boston

Reputation: 153460

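Let's try a different approach using pd.read_html, which parses every table on the page into its own DataFrame, so you get all of the columns without indexing into the td cells yourself.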

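from io import StringIO

import requests
import pandas as pd

url = "https://www.metasrc.com/5v5/6.24/stats"
html_data = requests.get(url).text

# read_html returns a list of DataFrames, one per table found.
# Wrapping the text in StringIO avoids the deprecation warning that
# newer pandas versions emit for a raw HTML string.
dfs = pd.read_html(StringIO(html_data))

print(dfs[0].head())
print(dfs[1].head())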

Output:

           Name    Role        Tier  Score   Trend   Win %  Role %  Pick %  Ban %   KDA
0  AatroxAatrox     TOP    Weak / C  61.49  999NEW  46.93%  64.39%   0.62%  0.71%  1.97
1  AatroxAatrox  JUNGLE    Weak / C  59.49  999NEW  44.92%  30.61%   0.29%  0.71%  2.19
2      AhriAhri     MID  Strong / S  78.27  999NEW  53.22%  97.08%  13.53%  1.79%  2.67
3    AkaliAkali     TOP    Fair / B  63.56  999NEW  48.43%  35.00%   1.21%  1.83%  1.95
4    AkaliAkali     MID    Fair / B  63.19  999NEW  47.50%  56.18%   1.95%  1.83%  2.06
         0      1
0  Lee Sin  91.63
1    Vayne  87.10
2  Caitlyn  84.73
3   Ezreal  83.54
4   Thresh  82.65
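
To get back to the two columns from the question, you could then select and rename them. The following is only a sketch: the small stats frame is a hand-built stand-in for dfs[0] (two rows copied from the output above), and the names champ and winrate are taken from the question, so adjust them if your table differs.

```python
import pandas as pd

# Stand-in for dfs[0]: two rows copied from the printed output, three columns only.
stats = pd.DataFrame({
    "Name": ["AatroxAatrox", "AhriAhri"],
    "Role": ["TOP", "MID"],
    "Win %": ["46.93%", "53.22%"],
})

# Keep only the columns of interest and give them the names used in the question.
df = stats[["Name", "Win %"]].rename(columns={"Name": "champ", "Win %": "winrate"})

# Turn strings such as "46.93%" into floats so the column can be sorted or averaged.
df["winrate"] = df["winrate"].str.rstrip("%").astype(float)

print(df)
```

With the real data you would replace stats with dfs[0]. The doubled names (AatroxAatrox) are left exactly as they appear in the scraped table.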

Upvotes: 1
