Reputation: 5282
I am trying the following code:
import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
df_canada = pd.read_excel(
    "./Canada.xlsx",
    sheet_name="Canada by Citizenship",
    skiprows=range(20),
    skipfooter=2)
years = list(map(str, range(1980, 2014)))
serie = df_canada.loc['Haiti', years].plot(kind='line')
But I get the following error:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index_class_helper.pxi in pandas._libs.index.Int64Engine._check_type()
KeyError: 'Haiti'
To solve this, I changed the code as follows:
...
years = list(map(str, range(1980, 2014)))
df_canada.set_index('Country', inplace=True)
serie = df_canada.loc['Haiti', years].plot(kind='line')
...
But now I get the following error:
KeyError: "None of [Index(['1980', '1981', '1982', '1983', '1984', '1985', '1986', '1987', '1988',\n '1989', '1990', '1991', '1992', '1993', '1994', '1995', '1996', '1997',\n '1998', '1999', '2000', '2001', '2002', '2003', '2004', '2005', '2006',\n
'2007', '2008', '2009', '2010', '2011', '2012', '2013'],\n
dtype='object')] are in the [index]"
df_canada.columns:
Index([ 'Type', 'Coverage', 'AREA', 'AreaName', 'REG', 'RegName', 'DEV', 'DevName', 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013], dtype='object')
And of course these year columns exist in the xlsx file.
Any idea?
Thanks
Upvotes: 0
Views: 1321
Reputation: 1
This works. The year columns are integers, not strings, so select them with a list of integers:
import matplotlib as mpl
import matplotlib.pyplot as plt
import pandas as pd # For reading the data
# Read the dataset, skipping the top 20 rows (irrelevant) and the last 2 rows
df_canada = pd.read_excel("./canada.xlsx", skiprows=range(20), skipfooter=2)
#print(df_canada.head().to_string()) # View top 5 rows
df_canada.set_index('OdName', inplace = True)
# Change years to be integers instead of strings
years = list(range(1980, 2014))
# Simply sum across the years using our list of years instead of manually typing each year
df_canada['Total'] = df_canada[years].sum(axis=1)
df_canada.loc['Haiti', years].plot(kind='line')
plt.title("Immigration from Haiti")
plt.ylabel('Number of immigrants')
plt.xlabel('Years')
plt.show()
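To see why the string years fail, here is a minimal sketch on a small made-up frame (the numbers and the 'Country' column are placeholders for illustration, not data from Canada.xlsx). It assumes the year column labels are integers, as the df_canada.columns output in the question shows:

```python
import pandas as pd

# Synthetic stand-in for the spreadsheet; the values are invented for illustration.
df = pd.DataFrame(
    {"Country": ["Haiti", "Albania"], 1980: [10, 1], 1981: [20, 2], 1982: [30, 3]}
)
df = df.set_index("Country")

int_years = list(range(1980, 1983))            # [1980, 1981, 1982]
str_years = list(map(str, range(1980, 1983)))  # ['1980', '1981', '1982']

# Integer labels match the integer column labels, so this selection works.
haiti = df.loc["Haiti", int_years]

# String labels do not match any column, so pandas raises KeyError.
try:
    df.loc["Haiti", str_years]
    string_lookup_failed = False
except KeyError:
    string_lookup_failed = True

print(haiti.tolist(), string_lookup_failed)
```

The label 1980 and the label '1980' are different keys to pandas, which is why the row lookup succeeds but the column lookup does not.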
Upvotes: 0
Reputation:
import pandas as pd
import matplotlib.pyplot as plt
df_canada = pd.read_excel('Canada.xlsx', sheet_name = 'Canada by Citizenship (2)')
df_canada.set_index('OdName', inplace = True)
# Change years to be integers instead of strings
years = list(range(1980, 2014))
# Simply sum across the years using our list of years instead of manually typing each year
df_canada['Total'] = df_canada[years].sum(axis=1)
df_canada.loc['Haiti', years].plot(kind = 'line')
# The title must match the country being plotted
plt.title('Immigration from Haiti')
plt.ylabel('Number of Immigrants')
plt.xlabel('Years')
plt.show()
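If you would rather keep the years as strings, one possible alternative is to convert the column labels to strings once after loading. This is only a sketch on a made-up frame (the values and the 'Country' column are placeholders), not something tested against the actual workbook:

```python
import pandas as pd

# Synthetic stand-in for the spreadsheet; the values are invented for illustration.
df = pd.DataFrame(
    {"Country": ["Haiti", "Albania"], 1980: [10, 1], 1981: [20, 2], 1982: [30, 3]}
)
df = df.set_index("Country")

# Convert every column label to a string so that '1980' matches.
df.columns = df.columns.map(str)

str_years = list(map(str, range(1980, 1983)))  # ['1980', '1981', '1982']
haiti = df.loc["Haiti", str_years]

print(haiti.tolist())
```

Either approach works as long as the list of years and the column labels have the same type.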
Upvotes: 1