Cherie Wilson
Cherie Wilson

Reputation: 13

keep getting error with multiple pages when using df to excel works fine with 1 page

Here is my code everything works pretty good until I try and send to excel. I have a script that works fine for one web page but not multiple pages.

Working code and what I want:

import pandas as pd
from pandas import ExcelWriter


dfs = pd.read_html('https://www.teamrankings.com/nfl/stat/yards-per-play/',header=0)
for df in dfs:
    print(df)

writer = pd.ExcelWriter('nfl.xlsx')


df.to_excel('nflypp.xlsx', sheet_name='yppo', index=False, engine='xlsxwriter')

writer.save()

none working code:

import pandas as pd
from pandas import ExcelWriter


oyyp_df = pd.read_html('https://www.teamrankings.com/nfl/stat/yards-per-play.html',header=0)
dyyp_df = pd.read_html('https://www.teamrankings.com/nfl/stat/opponent-yards-per-play',header=0)
for df in (oyyp_df, dyyp_df):
    print(df)


writer = pd.ExcelWriter('nfl.xlsx') 


df.to_excel('nflypp.xlsx', sheet_name='yppo', index=False, engine='xlsxwriter')
df.to_excel('nflypp.xlsx', sheet_name='yppd', index=False, engine='xlsxwriter')


writer.save()

Working until it gets to the df.to_excel

error: AttributeError: 'list' object has no attribute 'to_excel'

Here is the out put

C:\Cabs\projects>nflstatsypp.py
[ Rank Team 2018 Last 3 Last 1 Home Away 2017
0 1 Kansas City 7.0 7.0 6.9 6.4 7.5 6.1
1 2 LA Chargers 6.8 6.4 6.2 6.6 6.9 5.9
2 3 LA Rams 6.7 6.2 5.4 7.0 6.4 5.8
3 4 Tampa Bay 6.5 6.3 5.3 6.3 6.8 5.6
4 5 New Orleans 6.2 6.0 3.6 6.7 5.7 6.3
5 6 Pittsburgh 6.2 6.0 5.3 6.2 6.2 5.8
6 7 Carolina 6.2 7.3 6.8 6.1 6.2 5.1
7 8 Atlanta 6.0 5.0 2.9 6.5 5.5 5.8
8 9 Green Bay 6.0 5.4 4.4 5.9 6.1 4.9
9 10 Denver 5.9 6.1 6.3 6.1 5.8 4.8
10 11 New England 5.9 6.2 6.6 6.2 5.5 6.0
11 12 NY Giants 5.8 6.2 5.0 5.4 6.1 4.9
12 13 Houston 5.7 6.0 5.2 6.2 5.3 5.0
13 14 Seattle 5.7 6.2 6.8 5.5 5.9 5.2
14 15 San Francisco 5.7 5.8 6.1 5.4 5.9 5.3
15 16 Indianapolis 5.7 5.7 3.7 6.2 5.1 4.6
16 17 Cincinnati 5.6 5.1 4.8 5.5 5.7 4.8
17 18 Minnesota 5.6 5.1 4.7 5.6 5.6 5.4
18 19 Oakland 5.5 5.3 6.4 6.2 5.0 5.4
19 20 Philadelphia 5.5 5.4 6.1 5.5 5.5 5.6
20 21 Chicago 5.5 4.6 4.9 6.0 5.0 4.9
21 22 Cleveland 5.4 7.3 8.2 5.1 5.8 4.9
22 23 Tennessee 5.4 7.1 7.5 5.8 5.0 5.2
23 24 Miami 5.4 4.7 3.5 5.8 4.9 4.9
24 25 Dallas 5.3 5.2 4.7 5.6 5.1 5.3
25 26 Detroit 5.3 5.0 4.8 5.2 5.5 5.5
26 27 Baltimore 5.2 5.4 4.8 5.3 5.2 4.6
27 28 Washington 5.2 4.8 5.6 5.0 5.4 5.3
28 29 Jacksonville 5.0 4.3 3.8 5.0 5.1 5.4
29 30 NY Jets 4.9 4.5 4.3 5.4 4.4 5.0
30 31 Buffalo 4.5 6.2 6.3 4.5 4.6 4.7
31 32 Arizona 4.4 4.8 5.5 4.5 4.2 4.7]
[ Rank Team 2018 Last 3 Last 1 Home Away 2017
0 1 Baltimore 4.6 4.1 2.9 4.5 4.8 5.0
1 2 Buffalo 4.9 4.2 3.5 5.1 4.7 5.3
2 3 Chicago 4.9 4.8 5.0 4.6 5.2 5.1
3 4 Pittsburgh 5.2 5.1 6.2 5.6 4.8 5.3
4 5 Dallas 5.3 5.2 3.6 4.9 5.6 5.1
5 6 Minnesota 5.3 5.4 6.6 4.6 5.9 4.8
6 7 Arizona 5.3 5.1 4.4 5.0 5.6 4.9
7 8 Jacksonville 5.3 5.6 7.5 4.3 6.2 4.8
8 9 Houston 5.4 6.1 8.2 5.9 4.9 5.7
9 10 Tennessee 5.4 5.1 3.8 5.0 5.7 5.1
10 11 LA Chargers 5.5 5.1 5.3 5.7 5.4 5.3
11 12 Indianapolis 5.5 4.8 3.9 5.6 5.4 5.7
12 13 Green Bay 5.5 5.7 5.5 5.2 5.8 5.5
13 14 San Francisco 5.6 5.9 6.8 5.1 5.8 5.3
14 15 New England 5.7 5.4 4.7 5.4 5.9 5.7
15 16 NY Jets 5.7 6.8 6.7 6.0 5.4 5.4
16 17 Cleveland 5.7 5.3 5.2 6.0 5.5 5.1
17 18 Carolina 5.8 5.5 5.3 5.8 5.8 5.4
18 19 Washington 5.8 5.8 6.1 5.7 5.9 5.3
19 20 NY Giants 5.8 6.0 4.9 5.7 6.0 5.7
20 21 Denver 5.9 6.2 4.8 6.0 5.7 4.9
21 22 New Orleans 5.9 4.8 4.7 6.1 5.8 5.4
22 23 Kansas City 6.0 5.4 6.4 5.4 6.4 5.6
23 24 Philadelphia 6.1 7.0 5.6 5.7 6.6 5.2
24 25 Detroit 6.1 5.6 5.4 5.9 6.4 5.5
25 26 LA Rams 6.1 6.4 4.8 6.4 5.8 5.3
26 27 Seattle 6.1 7.2 6.1 6.7 5.8 4.9
27 28 Atlanta 6.2 5.1 4.8 6.4 5.9 5.2
28 29 Cincinnati 6.2 5.7 6.3 6.2 6.2 5.0
29 30 Miami 6.3 6.7 6.3 6.1 6.5 5.4
30 31 Tampa Bay 6.4 6.4 6.8 5.8 7.1 6.0
31 32 Oakland 6.6 6.2 6.9 6.5 6.6 5.6]
Traceback (most recent call last):
File "C:\Cabs\projects\nflstatsypp.py", line 14, in
df.to_excel('nflypp.xlsx', sheet_name='yppo', index=False, engine='xlsxwriter')
AttributeError: 'list' object has no attribute 'to_excel'

One last ? how do you clean up the above second table so the headers are lined up like the first table? if it has been answered add link please. Thanks. Note when printed out in python the first table headers are correct just for clarification. thanks again. no more edits. hope all this helps.

I'm brand new, just having fun. Have been researching for months with all different codes. have about 15.py trying to get this to work.

Thanks for any help. if the answer is out there I can't find or understand it. :-) finally. again sorry for being such a newbe. LOL

Upvotes: 1

Views: 82

Answers (1)

chitown88
chitown88

Reputation: 28650

There's a few ways you can do this. I would probably loop it condense the code a bit, save each dataframe as you iterate in your for loop. But it also looks like you want different names for your sheets, which would involve creating a variable in same way to associate each of your pd.read_htmls, and it appears you're a beginner, so we'll just try to keep this as simple as possible, and we'll do it another way which is just straight away save the data.

First off, when you do oyyp_df = pd.read_html('https://www.teamrankings.com/nfl/stat/yards-per-play.html',header=0), it's storing it as dataframe, BUT packaging it into a list (see here).

Also, it would be benificial to go back and read up about lists in Python. So your for loop iterates through those items in each of your lists (oyyp_df, dyyp_df).

If you want to call a specific item in a list, you call it by it's index/position. The key to note though, is that the index starts at 0. So the first item in a list is at position 0, the 2nd item is at position 1, etc.

a_list = ['first item', 'sencond item, 'third item']

to call that first item, you'd type a_list[0] and you would see the output 'first item'.

Now a list can be of many data types. It could be strings, like above, it can integers, it can be dictionaries, or in your case here, it's dataframes.

so oyyp_df is really = [<your DATFRAME>, <maybe a 2nd dataframe>, etc.]. yours only contains 1 item, in the first position. So you get that error. lists can't do .to_excel, but dataframes can.

What we can do is store that 1st item dataframe though by setting that equal to another name (or you could actually use the same name...but be careful as if your list has other items in it, you'd lose those); oyyp_df = oyyp_df[0]

I changed a couple things to hopefully make it clearer in your code below.

import pandas as pd


html_data1 = pd.read_html('https://www.teamrankings.com/nfl/stat/yards-per-play.html',header=0)
html_data2 = pd.read_html('https://www.teamrankings.com/nfl/stat/opponent-yards-per-play',header=0)


for df in (html_data1, html_data2):
    print(df)


oyyp_df = html_data1[0]
dyyp_df = html_data2[0]


writer = pd.ExcelWriter('nflypp.xlsx')

oyyp_df.to_excel(writer, sheet_name='yppo', index=False)
dyyp_df.to_excel(writer, sheet_name='yppd', index=False)

writer.save()
writer.close()

Upvotes: 0

Related Questions