After transposing my pandas DataFrame, I could not access columns by name to plot a graph. I want to select two columns, but it keeps saying there are no such column names. I am pretty new to Python, DataFrames and transposing. Could someone help, please?
Below is my input file; I want to transpose its rows into columns. The transpose itself succeeded, but I could not select "Canada" and "Cameroon" to plot a graph.
country 1990 1991 1992 1993 1994 1995
0 Cambodia 65.4 65.7 66.2 66.7 67.1 68.4
1 Cameroon 63.9 63.7 64.7 65.6 66.6 67.6
2 Canada 98.6 99.6 99.6 99.8 99.9 99.9
3 Cape Verde 77.7 77.0 76.6 89.0 79.0 78.0
import pandas as pd
import numpy as np
import re
import math
import matplotlib.pyplot as plt
missing_values=["n/a","na","-","-","N/A"]
df = pd.read_csv('StackoverflowGap.csv', na_values = missing_values)
# Transpose
df = df.transpose()
plt.figure(figsize=(12,8))
plt.plot(df['Canada','Cameroon'], linewidth = 0.5)
plt.title("Time Series for Canada")
plt.show()
It produces a long list of error messages, but the final message is:
KeyError: ('Canada', 'Cameroon')
Upvotes: 4
Views: 2836
Reputation: 662
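There are a few things you might need to do when working with the data.
Read the file without treating the first row as a header:
df = pd.read_csv('StackoverflowGap.csv', na_values = missing_values, header = None)
After transposing, take the column names from the first row:
df.columns = df.iloc[0]
Then drop that row, since it now serves as the header:
df = df.reindex(df.index.drop(0))
To select more than one column (for example in the plt.plot() command) you need to use df[] on a list of columns, i.e. df[['Canada', 'Cameroon']].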
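To illustrate the column-selection point, here is a minimal sketch. It assumes a small frame typed in by hand from the question's sample rows (the name `subset` and the flag `tuple_lookup_failed` are only for illustration). It shows that `df['Canada', 'Cameroon']` is looked up as a single tuple label, whereas a list inside the brackets selects several columns.

```python
import pandas as pd

# Hand-built frame from the question's sample data, already shaped with
# years as rows and countries as columns (an assumption for illustration).
df = pd.DataFrame(
    {"Cameroon": [63.9, 63.7, 64.7], "Canada": [98.6, 99.6, 99.6]},
    index=["1990", "1991", "1992"],
)

# df['Canada', 'Cameroon'] is the same as df[('Canada', 'Cameroon')]:
# pandas looks for ONE column whose label is that tuple and fails.
try:
    df["Canada", "Cameroon"]
    tuple_lookup_failed = False
except KeyError:
    tuple_lookup_failed = True

# A list inside the brackets selects both columns and returns a DataFrame.
subset = df[["Canada", "Cameroon"]]

print(tuple_lookup_failed)
print(subset)
```

The resulting `subset` can be passed to `plt.plot()` as in the question.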
EDIT: The code that works for me is as follows:
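# Uses the imports and the missing_values list from the question.
# Read with the default header so the years become column labels; with
# header = None the transposed index would be 0, 1, 2, ... and
# drop('country') below would raise a KeyError (use drop(0) in that case).
df = pd.read_csv('StackoverflowGap.csv', na_values = missing_values)
df = df.T
# The first row of the transposed frame holds the country names.
df.columns = df.iloc[0]
# Remove that row (index label 'country') now that it is the header.
df = df.reindex(df.index.drop('country'))
df.index.name = 'Year'
plt.figure(figsize=(12,8))
plt.plot(df[['Canada','Cameroon']], linewidth = 0.5)
plt.title("Time Series for Canada and Cameroon")
plt.show()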
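As an alternative sketch (not the code from this answer), the country names can be moved into the index before transposing, so they become column labels directly. This example assumes the CSV is comma-separated with a `country` header column and uses an in-memory string as a stand-in for 'StackoverflowGap.csv'; the names `raw`, `by_year` and `selected` are illustrative.

```python
import io
import pandas as pd

# Stand-in for 'StackoverflowGap.csv': the sample rows from the question,
# assumed to be comma-separated with a 'country' header column.
csv_text = """country,1990,1991,1992,1993,1994,1995
Cambodia,65.4,65.7,66.2,66.7,67.1,68.4
Cameroon,63.9,63.7,64.7,65.6,66.6,67.6
Canada,98.6,99.6,99.6,99.8,99.9,99.9
Cape Verde,77.7,77.0,76.6,89.0,79.0,78.0
"""

missing_values = ["n/a", "na", "-", "N/A"]
raw = pd.read_csv(io.StringIO(csv_text), na_values=missing_values)

# Move the country names into the index first, so that transposing turns
# them into column labels and leaves only numeric values in the body.
by_year = raw.set_index("country").T
by_year.index.name = "Year"

# Select the two countries with a list of column names.
selected = by_year[["Canada", "Cameroon"]]
print(selected)
```

Because no text row is mixed into the values, the columns stay numeric, and `selected` can be handed to `plt.plot()` or plotted with `selected.plot()`.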
Upvotes: 5