Reputation: 5282

I cannot get the data from a dataframe

I am trying the following code:

import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt

df_canada = pd.read_excel(
    "./Canada.xlsx",
    sheet_name = "Canada by Citizenship",
    skiprows= range(20),
    skipfooter=2)

years = list(map(str, range(1980, 2014)))
serie = df_canada.loc['Haiti', years].plot(kind='line')

But I get the following error:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index_class_helper.pxi in pandas._libs.index.Int64Engine._check_type()

KeyError: 'Haiti'

To fix this, I set the country column as the index:

...
years = list(map(str, range(1980, 2014)))
df_canada.set_index('Country', inplace=True)
serie = df_canada.loc['Haiti', years].plot(kind='line')
...

But now I get the following error:

KeyError: "None of [Index(['1980', '1981', '1982', '1983', '1984', '1985', '1986', '1987', '1988',\n '1989', '1990', '1991', '1992', '1993', '1994', '1995', '1996', '1997',\n '1998', '1999', '2000', '2001', '2002', '2003', '2004', '2005', '2006',\n
'2007', '2008', '2009', '2010', '2011', '2012', '2013'],\n
dtype='object')] are in the [index]"

df_canada.columns:

Index([ 'Type', 'Coverage', 'AREA', 'AreaName', 'REG', 'RegName', 'DEV', 'DevName', 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013], dtype='object')

And of course these columns exist in the xlsx file.

Any ideas?

Thanks

Upvotes: 0

Answers (2)

Nasimi Taghizade

Reputation: 1

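This works. The year columns in the sheet are integers, not strings, so build the list of years from integers and set the index to the country-name column (OdName):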

import matplotlib as mpl
import matplotlib.pyplot as plt
import pandas as pd  # For reading the data

# Read the dataset, skipping the top 20 rows (irrelevant) and the last 2 rows
df_canada = pd.read_excel("./canada.xlsx", skiprows=range(20), skipfooter=2)
#print(df_canada.head().to_string())  # View top 5 rows
df_canada.set_index('OdName', inplace = True)

# Change years to be integers instead of strings
years = list(range(1980, 2014))

# Simply sum across the years using our list of years instead of manually typing each year
df_canada['Total'] = df_canada[years].sum(axis=1)

df_canada.loc['Haiti', years].plot(kind='line')

plt.title("Immigration from Haiti")
plt.ylabel('Number of immigrants')
plt.xlabel('Years')

plt.show()
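
To see why the string years fail, here is a minimal sketch on a small made-up DataFrame (the numbers are placeholders, not values from Canada.xlsx). It assumes, as the columns listing in the question suggests, that the year columns are integers:

```python
import pandas as pd

# Hypothetical stand-in for the spreadsheet: year column labels are integers.
df = pd.DataFrame(
    {"OdName": ["Haiti", "Albania"], 1980: [10, 1], 1981: [20, 2], 1982: [30, 3]}
).set_index("OdName")

str_years = list(map(str, range(1980, 1983)))  # ['1980', '1981', '1982']
int_years = list(range(1980, 1983))            # [1980, 1981, 1982]

# String labels do not match integer column labels, so this raises KeyError.
try:
    df.loc["Haiti", str_years]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Integer labels match the columns, so the row is selected as expected.
haiti = df.loc["Haiti", int_years]
```

Printing df.columns, as in the question, is a quick way to check whether the labels are integers or strings.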

Upvotes: 0

user14405165

Reputation:

import pandas as pd

df_canada = pd.read_excel('Canada.xlsx', sheet_name = 'Canada by Citizenship (2)') 


df_canada.set_index('OdName', inplace = True)

import matplotlib.pyplot as plt

# Change years to be integers instead of strings
years = list(range(1980, 2014)) 

# Simply sum across the years using our list of years instead of manually typing each year
df_canada['Total'] = df_canada[years].sum(axis=1) 


df_canada.loc['Haiti', years].plot(kind = 'line')

plt.title('Immigration from Haiti')
plt.ylabel('Number of Immigrants')
plt.xlabel('Years')

plt.show()
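
If you would rather keep the string years from the question, one possible alternative is to convert the column labels to strings after loading. This is a sketch on a made-up DataFrame (placeholder numbers, not the real data), assuming the year columns were read as integers:

```python
import pandas as pd

# Hypothetical stand-in for the spreadsheet: year column labels are integers.
df = pd.DataFrame(
    {"OdName": ["Haiti", "Albania"], 1980: [10, 1], 1981: [20, 2]}
).set_index("OdName")

# Convert every column label to a string so that string year keys match.
df.columns = df.columns.map(str)

years = list(map(str, range(1980, 1982)))  # ['1980', '1981']
haiti = df.loc["Haiti", years]
```

With this approach every later lookup has to use string labels, so choose one convention and stick to it.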

Upvotes: 1
