Converting a year and month table into a pandas Series

Question

I have a number of files that look like this.

Year    Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
1997    1.840%  -0.680% 0.480%  1.550%  1.510%  1.750%  2.630%  -0.190% 2.960%  2.180%  0.610%  0.710%
1998    -0.470% 1.270%  2.130%  1.200%  0.880%  1.790%  -0.800% -1.000% 1.080%  0.480%  0.710%  2.930%

Is there any way to convert files like this cleanly into a pandas Series?

ncocacola · Accepted Answer

I'm not sure whether your question includes parsing the files or not, so here it goes:

First, we parse the file with read_csv, making sure to specify that it is whitespace-delimited:

import pandas as pd

df = pd.read_csv('data.csv', sep=r'\s+')

sep=r'\s+' is nicer than sep=" " because it treats any run of consecutive whitespace as a single delimiter. Older pandas versions offered delim_whitespace=True for the same purpose, but that argument is deprecated as of pandas 2.2.
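To try the parsing step without a file on disk, here is a minimal sketch that reads the two sample rows from the question out of an in-memory string. The variable name raw and the use of io.StringIO are just for illustration; the separator sep=r"\s+" is a regular expression that matches any run of spaces or tabs.

```python
import io

import pandas as pd

# Sample rows copied from the question; columns are separated by runs of spaces.
raw = """Year    Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
1997    1.840%  -0.680% 0.480%  1.550%  1.510%  1.750%  2.630%  -0.190% 2.960%  2.180%  0.610%  0.710%
1998    -0.470% 1.270%  2.130%  1.200%  0.880%  1.790%  -0.800% -1.000% 1.080%  0.480%  0.710%  2.930%
"""

# sep=r"\s+" treats any run of whitespace as one delimiter.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

print(df.shape)             # one row per year, one column per month plus "Year"
print(df.columns.tolist())
```

Note that the percentage columns come back as strings (because of the trailing "%"), while Year is parsed as an integer.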

Then, we melt the DataFrame from wide to long format, so that every (year, month) cell becomes its own row (e.g. the 'Jan' column and the 1997 row become a single row holding the January 1997 percentage).

df = pd.melt(df, id_vars=["Year"], var_name="Month", value_name="Percentage")
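To see what melt does to the layout, here is a small sketch on a cut-down table with only the Jan and Feb columns (values taken from the question). The names wide and long_df are made up for this illustration.

```python
import pandas as pd

# Cut-down version of the question's table: two years, two months.
wide = pd.DataFrame({
    "Year": [1997, 1998],
    "Jan": ["1.840%", "-0.470%"],
    "Feb": ["-0.680%", "1.270%"],
})

# Each month column is unpivoted into (Year, Month, Percentage) rows.
long_df = pd.melt(wide, id_vars=["Year"], var_name="Month", value_name="Percentage")
print(long_df)
```

The result has one row per year and month combination, grouped by month column first, which is why a sort by date is still needed afterwards.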

Now, we do some cleaning up: merging the 'Month' and 'Year' columns together, dropping the 'Year' column, parsing the strings as datetimes, sorting by date and using the date as the index.

df['Month'] = df.Month + " " + df.Year.map(str)
df = df.drop('Year', axis=1)
df["Month"] = pd.to_datetime(df.Month, format="%b %Y")
df = df.sort_values("Month")
df = df.set_index("Month")
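As a quick check of the date-parsing step in isolation, the sketch below parses a few "Mon YYYY" labels and sorts them chronologically. The labels are made up for illustration, and it assumes an English locale so that %b matches abbreviations like "Jan".

```python
import pandas as pd

# Labels in the same "Mon YYYY" shape produced by joining Month and Year.
labels = pd.Series(["Mar 1998", "Jan 1997", "Dec 1997"])

# %b = abbreviated month name, %Y = four-digit year; the day defaults to 1.
dates = pd.to_datetime(labels, format="%b %Y")
ordered = dates.sort_values()

print(ordered.dt.strftime("%Y-%m-%d").tolist())
# ['1997-01-01', '1997-12-01', '1998-03-01']
```

Sorting the parsed datetimes rather than the raw strings matters: sorted alphabetically, "Dec 1997" would come before "Jan 1997".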

Finally, we can convert our DataFrame into a Series:

series = df.iloc[:, 0]

The final result gives us the following Series:

Month
1997-01-01     1.840%
1997-02-01    -0.680%
1997-03-01     0.480%
...
1998-10-01     0.480%
1998-11-01     0.710%
1998-12-01     2.930%
Name: Percentage, dtype: object
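The dtype: object above means the values are still strings. If numbers are needed, one possible follow-up (not part of the steps above, just a sketch) is to strip the "%" sign and convert to floats. The three values here are taken from the output shown; dividing by 100 to get fractions is an assumption about what is wanted.

```python
import pandas as pd

# First three entries of the resulting Series, still stored as strings.
series = pd.Series(
    ["1.840%", "-0.680%", "0.480%"],
    index=pd.to_datetime(["1997-01-01", "1997-02-01", "1997-03-01"]),
    name="Percentage",
)

# Strip the trailing "%" and convert to a fraction (1.840% -> 0.0184).
numeric = series.str.rstrip("%").astype(float) / 100

print(numeric.dtype)  # float64
```

Drop the division by 100 if you would rather keep the values in percentage points.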

Hope this helps!
