Pandas Dataframe Not Working As Expected

Question

I'm using Pandas on a table at this link:

http://sports.yahoo.com/nfl/stats/byposition?pos=QB&conference=NFL&year=season_2014&sort=49&timeframe=All

I'm trying to create player objects out of each (relevant) row. So I want the 3rd row through the end, and I'm using a bunch of different fields to construct a player object, including name, team, passing yards, etc.

Here's my attempt:

def getAllQBs():
    QBs = []
    table = pd.read_html(requests.get(QB_LINK).content)[5]
    finalTable = table[2 : ]
    print(finalTable)

    for row in finalTable.iterrows():
        print(row)
        name = row[0]
        team = row[1]
        passingYards = row[7]
        passingTouchdowns = row[10]
        interceptions = row[11]
        rushingYards = row[13]
        rushingTouchdowns = row[16]
        rushingFumbles = row[19]
        newQB = QB(name, team, rushingYards, rushingTouchdowns, rushingFumbles, passingYards, passingTouchdowns, interceptions)
        QBs.append(newQB)
        print(newQB.toString())
    return QBs

Passing yards is the 8th element from the left in the row, so I thought I'd access it using row[7]. However, when I run this function, I get:

Traceback (most recent call last):
  File "main.py", line 66, in <module>
    main()
  File "main.py", line 64, in main
    getAllQBs()
  File "main.py", line 27, in getAllQBs
    passingYards = row[7]
IndexError: tuple index out of range

It looks like I'm inadvertently using columns. However, I used DataFrame.iterrows(), which I thought would take care of this...

Any ideas?

Thanks, bclayman

chrisb · Accepted Answer

iterrows() generates tuples of the form (index, Series), where the Series is the row data you're trying to access. That is why row[7] raises an IndexError: each row here is a 2-tuple, not the row itself. In this case, where your index isn't meaningful, you can unpack it into a dummy variable, like this:

for (_, row) in finalTable.iterrows():
    .....
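
To make the tuple shape concrete, here is a minimal sketch. It does not use the Yahoo table: the small DataFrame and its values are made up for illustration, and `final_table` and `players` are hypothetical names. It also uses `.iloc` for positional access, which is my own suggestion rather than part of the original answer, so that the lookup does not depend on how the columns happen to be labelled.

```python
import pandas as pd

# Hypothetical stand-in for the scraped stats table (made-up values).
table = pd.DataFrame(
    [
        ["Name", "Team", "Yds"],
        ["header", "header", "header"],
        ["Player A", "AAA", 4000],
        ["Player B", "BBB", 3500],
    ]
)
final_table = table[2:]  # skip the two header rows, as in the question

# Each item yielded by iterrows() is a 2-tuple: (index label, row as a Series).
first_item = next(final_table.iterrows())
print(type(first_item), len(first_item))

players = []
for _, row in final_table.iterrows():
    # .iloc selects by position, whatever the column labels are.
    players.append((row.iloc[0], row.iloc[1], row.iloc[2]))

print(players)
```

Because the tuple only ever has two elements, any index above 1 on the un-unpacked value fails, which matches the traceback in the question.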
