remington howell

Reputation: 170

Most efficient way to access Pandas dataframes?

I have a dataframe in Python that I want to traverse in a very specific way. I'm very new to Pandas, so I need some advice on how best to do this. The dataframe has information on a large number of video games released over the years. Each row is an entry for a particular video game, and each column contains info such as the game's name, release year, sales numbers, and console platform (the same game appears multiple times if it was released on multiple platforms).

I want to do some calculations on sales figures based on release consoles over particular dates. The most obvious way of doing this is, of course, manually looping over every row in the dataframe checking to see if entries match my particular requirements for a calculation.

This is how I plan to do my traversals:

for s in frame.iterrows():
    if s[1][1] == "Wii":
        print (s[1][1]) ##As a test, I can print out the names of Wii games

My question is whether this is the "correct" or most efficient way to do this, which I assume it is not. Pandas seems to have a TON of useful methods for dataframes, and I would like to know if it has a more efficient way to look up only the rows that meet certain conditions.

Upvotes: 0

Views: 340

Answers (1)

Peter Dolan

Reputation: 1423

Assuming you want the Wii games, an easy way to do this is the following. Let's take a toy dataframe as an example:

# Dataframe 'games':

  console       title
0    Xbox        Halo
1     Wii  Smash Bros

To get all the rows with Wii games, you can run

games[games["console"] == "Wii"]

# returns
  console       title
1     Wii  Smash Bros
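The same filtering idea extends to the sales calculations mentioned in the question. Here is a minimal sketch, assuming the dataframe also has "year" and "sales" columns (those column names and the numbers are made up for illustration, they are not from the original data):

```python
import pandas as pd

# Toy data only: the "year" and "sales" columns and their values are
# assumptions for illustration, not real figures.
games = pd.DataFrame(
    {
        "console": ["Xbox", "Wii", "Wii", "Wii"],
        "title": ["Halo", "Smash Bros", "Wii Sports", "Mario Kart"],
        "year": [2001, 2008, 2006, 2008],
        "sales": [5.0, 13.0, 82.0, 35.0],
    }
)

# Combine conditions with & to build a single boolean mask.
# between() is inclusive on both ends by default.
mask = (games["console"] == "Wii") & games["year"].between(2007, 2009)

# Select the "sales" column for the matching rows and add it up.
wii_sales = games.loc[mask, "sales"].sum()

# Total sales per console, with no explicit loop.
sales_by_console = games.groupby("console")["sales"].sum()

print(wii_sales)
print(sales_by_console)
```

The comparison inside the brackets produces a column of True/False values, and Pandas evaluates it for the whole column at once, which is usually much faster than checking each row with iterrows().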

Hope this helps! Let me know if you have any follow-up questions or want more detail.

Upvotes: 1
