Steffen H
Steffen H

Reputation: 23

How to get Series (column) from a dataframe with (column_name/row_name/1) as the new (date) index

Suppose, you have a dataframe

month  1  2  3
year          
2019   a  b  c
2020   d  e  f

what I want is a transformed dataframe where the row names (year) and column names (month) serve as the index:

           data
2019-01-01    a
2019-02-01    b
2019-03-01    c
2020-01-01    d
2020-02-01    e
2020-03-01    f

Is there an elegant way to do this in Pandas?

Minimal example to construct the dfs

# this builds the dataframe
import numpy as np
import pandas as pd
from datetime import date
df = pd.DataFrame(np.array([['a', 'b', 'c'], ['d', 'e', 'f']]), columns = [1, 2, 3], index = [2019, 2020])
df.columns.name = "month"
df.index.name = "year"

desired dataframe:

# this builds the desire dataframe
desired_index = [date(2019,1,1), date(2019,2,1), date(2019,3,1), date(2020,1,1), date(2020,2,1), date(2020,3,1)]
desired_df = pd.DataFrame(np.array(['a', 'b', 'c', 'd', 'e', 'f']), columns = ['data'], index = desired_index)

Upvotes: 2

Views: 40

Answers (1)

jezrael
jezrael

Reputation: 863301

Use DataFrame.stack with Series.to_frame to one column DataFrame and then convert flattened MultiIndex to datetimes in list comprehension with f-strings:

df = df.stack().to_frame('data')
df.index = pd.to_datetime([f'{y}-{m}-1' for y, m in df.index])
print (df)
           data
2019-01-01    a
2019-02-01    b
2019-03-01    c
2020-01-01    d
2020-02-01    e
2020-03-01    f

Alternative solution with DataFrame.melt and for correct order DataFrame.sort_index:

df = df.reset_index().melt('year', value_name='data')
df.index = pd.to_datetime(df[['year', 'month']].assign(day=1))
df = df[['data']].sort_index()
print (df)
           data
2019-01-01    a
2019-02-01    b
2019-03-01    c
2020-01-01    d
2020-02-01    e
2020-03-01    f

Upvotes: 2

Related Questions