Reputation: 820
I have a dataframe like so:
number type X Y Z
1 red 101 NaN 101,NaN
2 blue 101 40,50 101,40,50
3 green 401 70,NaN 101,70,NaN
Is there any way that I can flag each type
element based on the column Z
? Z
is just a combined version of X
and Y
.
So something like:
Number Type Z X_or_Y
1 red 101 X
1 red NaN Y
2 blue 101 X
2 blue 40 Y
2 blue 50 Y
3 green 101 X
3 green 70 Y
3 green NaN Y
I'm thinking of the melt function but not sure how.
reshaped_data = pd.melt(df, id_vars =['number', 'type',
'Z'], value_vars =['X', 'Y'])
**Edit: ** The values in the X_or_Y column should come from the X ans Y columns in the initial dataframe
Upvotes: 1
Views: 124
Reputation: 862591
Solution for pandas under 0.24+ with DataFrame.melt
first, then Series.str.split
and DataFrame.stack
for explode splitted values by ,
:
df1 = df.melt(id_vars =['number', 'type'], value_vars =['X', 'Y'], var_name='Flag')
s = (df1.pop('value').astype(str).str.split(',', expand=True)
.stack()
.reset_index(level=1, drop=True)
.rename('Z'))
df1 = df1.join(s).sort_values('number').reset_index(drop=True)
print (df1)
number type Flag Z
0 1 red X 101
1 1 red Y nan
2 2 blue X 101
3 2 blue Y 40
4 2 blue Y 50
5 3 green X 401
6 3 green Y 70
7 3 green Y NaN
Solution above pandas 0.24 with DataFrame.explode
:
df1 = df.melt(id_vars =['number', 'type'], value_vars =['X', 'Y'], var_name='Flag')
df1 = (df1.assign(value = df1['value'].astype(str).str.split(','))
.explode('value')
.sort_values('number')
.reset_index(drop=True))
print (df1)
number type Flag value
0 1 red X 101
1 1 red Y nan
2 2 blue X 101
3 2 blue Y 40
4 2 blue Y 50
5 3 green X 401
6 3 green Y 70
7 3 green Y NaN
Upvotes: 2