Reputation: 43
I am working with a CSV data set called newdata.csv that has the Corruption Perceptions Index ratings for 186 countries from the years 1997 to 2015. Here is a small snapshot of it:
Here, the years stretch out to the right. I want to put the data into long format that I can work with in python. to do this, I tried to use the melt function of pandas.This code:
import pandas as pd
data = pd.read_csv("newdata.csv", encoding = "ISO-8859-1")
a = pd.melt(data, id_vars='Country', value_vars=[1997, 1998, 1999,
2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011,
2012, 2013, 2014, 2015])
Then I export it as a new CSV that i can look at in excel, with this code:
a.to_csv('Pythonoutput.csv', sep=',')
When I do this, I get the country column and the year column as I want them, but not any of the data. Here is my result:
How do I get my CPI values to display downward as well as the years?
Upvotes: 2
Views: 211
Reputation: 6213
First, read your data
df = pd.read_csv("newdata.csv", encoding = "ISO-8859-1")
The index of your data is the Country
, so let the dataframe know it via set_index()
df = df.set_index('Country')
To stack the colums all together, use stack()
df = df.stack()
By stacking, the DataFrame automatically got a MultiIndex with 2 levels. Finally rename the index. The DataFrame could infer Country
as name for the first level, but the second level doesn't have a label yet:
df.index.names = ['Country', 'Year']
And now you're ready to export it via to_csv()
. Give the series a name with the header
-argument
df.to_csv("Pythonoutput.csv", header=['CPI'])
Putting it all together:
df = pd.read_csv("newdata.csv", encoding = "ISO-8859-1").set_index('Country').stack()
df.index.names = ['Country', 'Year']
df.to_csv("Pythonoutput.csv", header=['CPI'])
Upvotes: 1