Reputation: 2605
Given a Pandas dataframe of the form
January-2021,0.294
February-2021,0.252
March-2021,0.199
...
January-2022,0.384
February-2022,0.333
March-2022,0.271
...
how do I transform it to a dataframe with 12 columns, one for each month, so it looks like
year,January,February,March,...
2021,0.294,0.252,0.199,...
2022,0.384,0.333,0.271,...
Upvotes: 0
Views: 198
Reputation: 800
I would use pivot
to get the desired output.
```
# prepare dataframe
data = [('January-2021',0.294), ('February-2021',0.252), ('March-2021',0.199),
('January-2022',0.384), ('February-2022',0.333), ('March-2022',0.271) ]
df = pd.DataFrame(data, columns=['date', 'val'])
# extract month and year
df['month'] = df['date'].apply(lambda x: x.split('-')[0])
df['year'] = df['date'].apply(lambda x: x.split('-')[1])
df.drop(['date'], axis=1, inplace=True)
# pivoting the dataframe
df = df.pivot(index='year', columns='month')
# rename the columns
df.columns = [c[1] for c in df.columns]
df = df.reset_index()
```
At the end you will the the following result:
Upvotes: 0
Reputation: 24049
You can use str.split
then use pd.pivot
.
# January-2021, ... exists in 'Date' column.
df[['Month', 'Year']] = pd.DataFrame(df['Date'].str.split('-').to_list())
# 0.294, ... exists in 'Value' column
df_new = df.pivot(index='Year', columns='Month', values='Value')
print(df_new)
Output:
Month February January March
Year
2021 0.252 0.294 0.199
2022 0.333 0.384 0.271
Upvotes: 1
Reputation: 150725
You can do:
# `month-year` is name of date column
dates = df['month-year'].str.extract('(?P<month>\w+)-(?P<year>\d+)')
# `data` is name of data column
pd.crosstab(dates['year'], dates['month'], df['data'], aggfunc='first')
Output:
month February January March
year
2021 0.252 0.294 0.199
2022 0.333 0.384 0.271
Upvotes: 3