Reputation: 1998
I need some help to rearrange a dataframe. this is what the data looks like.
Year item 1 item 2 item 3
2001 22 54 33
2002 77 54 33
2003 22 NaN 33
2004 22 54 NaN
The layout I want is:
Items Year Value
item 1 2001 22
item 1 2002 77
...
And so on...
Upvotes: 2
Views: 234
Reputation: 294478
Using comprehensions and pd.DataFrame.itertuples
pd.DataFrame(
[[y, i, v]
for y, *vals in df.itertuples(index=False)
for i, v in zip(df.columns[1:], vals)
if pd.notnull(v)],
columns=['Year', 'Item', 'Value']
)
Year Item Value
0 2001 item 1 22.0
1 2001 item 2 54.0
2 2001 item 3 33.0
3 2002 item 1 77.0
4 2002 item 2 54.0
5 2002 item 3 33.0
6 2003 item 1 22.0
7 2003 item 3 33.0
8 2004 item 1 22.0
9 2004 item 2 54.0
Upvotes: 0
Reputation: 863116
Use melt
if not necessary remove NaN
s:
df = df.melt('Year', var_name='Items', value_name='Value')
print (df)
Year Items Value
0 2001 item 1 22.0
1 2002 item 1 77.0
2 2003 item 1 22.0
3 2004 item 1 22.0
4 2001 item 2 54.0
5 2002 item 2 54.0
6 2003 item 2 NaN
7 2004 item 2 54.0
8 2001 item 3 33.0
9 2002 item 3 33.0
10 2003 item 3 33.0
11 2004 item 3 NaN
For remove NaN
s add dropna
:
df = df.melt('Year', var_name='Items', value_name='Value').dropna(subset=['Value'])
print (df)
Year Items Value
0 2001 item 1 22.0
1 2002 item 1 77.0
2 2003 item 1 22.0
3 2004 item 1 22.0
4 2001 item 2 54.0
5 2002 item 2 54.0
7 2004 item 2 54.0
8 2001 item 3 33.0
9 2002 item 3 33.0
10 2003 item 3 33.0
For a bit different ordering with removing NaN
s is possible use set_index
+ stack
+ rename_axis
+ reset_index
:
df = df.set_index('Year').stack().rename_axis(['Year','Items']).reset_index(name='Value')
print (df)
Year Items Value
0 2001 item 1 22.0
1 2001 item 2 54.0
2 2001 item 3 33.0
3 2002 item 1 77.0
4 2002 item 2 54.0
5 2002 item 3 33.0
6 2003 item 1 22.0
7 2003 item 3 33.0
8 2004 item 1 22.0
9 2004 item 2 54.0
Upvotes: 2