Reputation: 185
Instructions given by Professor:
1. Using the list of countries by continent from World Atlas data, load in the countries.csv file into a pandas DataFrame and name this data set as countries.
2. Using the data available on Gapminder, load in the Income per person (GDP/capita, PPP$ inflation-adjusted) as a pandas DataFrame and name this data set as income.
3. Transform the data set to have years as the rows and countries as the columns. Show the head of this data set when it is loaded.
4. Graphically display the distribution of income per person across all countries in the world for any given year (e.g. 2000). What kind of plot would be best?
In the code below, I have some of these tasks completed, but I'm having a hard time understanding how to acquire data from a DataFrame row. I want to be able to acquire data from a row and then plot it. It may seem like a trivial concept, but I've been at it for a while and need assistance please.
%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
countries = pd.read_csv('2014_data/countries.csv')
countries.head(n=3)
income = pd.read_excel('indicator gapminder gdp_per_capita_ppp.xlsx')
income = income.T
def graph_per_year(year):
    stryear = str(year)
    dfList = income[stryear].tolist()

graph_per_year(1801)
Upvotes: 0
Views: 356
Reputation: 116
To answer your first question, a bar graph with a year selector would be best. Put the countries on the x axis and per capita income on the y axis, and perhaps add a dropdown to select the year for which the graph is drawn.
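As a rough sketch of the bar-graph idea, assuming vertical bars and a table with years as the index and countries as the columns: the frame `income_demo`, its country names, and its numbers are made up for illustration, and `bar_for_year` is a hypothetical helper, not part of pandas or of the original code.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up numbers purely for illustration; this is NOT Gapminder data.
income_demo = pd.DataFrame(
    {
        "Country A": [900.0, 1200.0],
        "Country B": [1500.0, 2100.0],
        "Country C": [700.0, 800.0],
    },
    index=[1800, 1801],  # years as row labels, as in the transposed table
)


def bar_for_year(frame, year):
    """Draw one bar per country for the given year and return the Axes."""
    # Selecting one row by label gives a Series indexed by country name.
    row = frame.loc[year].dropna().sort_values()
    ax = row.plot(kind="bar")
    ax.set_xlabel("Country")
    ax.set_ylabel("Income per person")
    ax.set_title(f"Income per person, {year}")
    return ax


ax = bar_for_year(income_demo, 1801)
```

With roughly two hundred real countries the x axis gets crowded, so sorting the values first (as above) or plotting only a subset may make the chart easier to read.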
Upvotes: 0
Reputation: 738
Pandas has three indexers: .loc (label-based), .iloc (integer position), and the older .ix (mixed).
If you want integer-position indexing, use .iloc:
df_1
Out[5]:
          consId  fan-cnt
0  1155696024483     34.0
1  1155699007557     34.0
2  1155694005571     34.0
3  1155691016680     12.0
4  1155697016945     34.0
df_1.iloc[1, :]  # go to the row at position 1 and select all the columns
Out[8]:
consId     1.155699e+12
fan-cnt    3.400000e+01
Name: 1, dtype: float64
And to go to a particular cell, pass both the row and the column position:
df_1.iloc[1, 1]
Out[9]: 34.0
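You need to go through the documentation for the other types of indexing, namely .ix and .loc, as suggested by sohier-dane. Note that .ix was deprecated in pandas 0.20 and removed in pandas 1.0, so .loc is the one to learn for label-based access.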
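Applied to the layout in the question (years as rows, countries as columns), the difference between label and position lookups might look like the sketch below. The frame `income_demo` and its values are made up, and it assumes the years end up as integer index labels; if the real index holds strings, the label would be '1801' instead.

```python
import pandas as pd

# Hypothetical stand-in for the transposed income table (values are made up).
income_demo = pd.DataFrame(
    {"Country A": [900.0, 1200.0], "Country B": [1500.0, 2100.0]},
    index=[1800, 1801],  # years as row labels
)

# Label-based: the row whose index label is 1801.
by_label = income_demo.loc[1801]

# Position-based: the second row, counting from 0.
by_position = income_demo.iloc[1]

# A single cell: row label first, then column label.
cell = income_demo.loc[1801, "Country B"]

# A row is a Series, so it converts to a plain list if one is needed.
values = by_label.tolist()
```

Note that `income_demo[1801]` would look for a column named 1801 rather than a row, which is why selecting a year with plain square brackets fails once the years are in the index.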
Upvotes: 1