Newlyn Erratt
Newlyn Erratt

Reputation: 107

Is it possible to reshape pandas DataFrame using parts of column names?

I am just getting started with working in pandas and with dataframes. I'd like to reshape some data but I'm not sure the best approach to do so. My instinct says to iterate over the frame but I'm hoping there is some better way.

So, I have an initial dataframe that looks like this:

vendor_state client_state date total_widget_a_purchases total_widget_b_purchases
CA WA 2021-02-01 10 5
CA OR 2021-02-01 8 7
NY NJ 2021-03-07 15 9
NY NJ 2021-02-08 7 25
NY NY 2021-02-08 24 3

I would like to get it to the following state:

vendor_state client_state widget type 2021-02-01 2021-02-08 2021-03-07
CA WA widget_a 10 0 0
CA WA widget_b 5 0 0
NY NJ widget_a 0 7 15
NY NJ widget_b 0 25 9
NY NY widget_a 0 24 0
NY NY widget_b 0 3 0

There are two areas I'm really struggling with here.

CA/WA/2021-02-01 CA/WA/2021-02-08 CA/WA/2021-03-07

I am hoping that I am just missing something elementary due to not having worked with pandas in the past.

Upvotes: 3

Views: 166

Answers (1)

Nk03
Nk03

Reputation: 14949

via stack unstack:

df = (df.set_index(['vendor_state','client_state','date'])
 .stack()
 .unstack(2)
 .reset_index()
 .rename(columns={'level_2': 'widget type'})
 .fillna(0)
 )
df['widget type'] = df['widget type'].str.extract(pat = ("(widget_[a|b])"))

Output:

vendor_state client_state widget type 2021-02-01 2021-02-08 2021-03-07
0 CA OR widget_a 8.0 0.0 0.0
1 CA OR widget_b 7.0 0.0 0.0
2 CA WA widget_a 10.0 0.0 0.0
3 CA WA widget_b 5.0 0.0 0.0
4 NY NJ widget_a 0.0 7.0 15.0
5 NY NJ widget_b 0.0 25.0 9.0
6 NY NY widget_a 0.0 24.0 0.0
7 NY NY widget_b 0.0 3.0 0.0

Upvotes: 4

Related Questions