Reputation: 176
I would like to add new column to a DataFrame based on a condition.
df = pd.DataFrame({'category' : ['A', 'A', 'B', 'C'],
'value': [10,10,20,20]
})
On DataFrame above, I would like to add separated columns ['Value A'], ['Value B'], ['Value C'] to make calculations easier. Desired output would be like
df = pd.DataFrame({'category' : ['A', 'A', 'B', 'C'],
'value': [10,10,20,20]
'Value A' : [10, 10, 0, 0]
'Value B' : [0, 0, 20, 0]
'Value C' : [0, 0, 0, 20]
})
I thought I could just do it by
df['Value A'] = df['value'](add some condition here).fillna(0)
But seems like it can be done this simple. Is there any way to do that?
Upvotes: 2
Views: 227
Reputation: 150805
Let's try:
df.join(df.reset_index()
.pivot_table(index='index', columns='category',
values='value', fill_value=0)
.reindex(['A','B','C'], axis=1, fill_value=0) # from comment
.add_prefix('value')
)
Output:
category value valueA valueB valueC
0 A 10 10 0 0
1 A 10 10 0 0
2 B 20 0 20 0
3 C 20 0 0 20
Upvotes: 1
Reputation: 28729
You can create a dictionary pairing the new column names with the conditions:
condition = {f"Value {letter}": df.loc[df.category.eq(letter), "value"]
for letter in df.category.unique()}
condition
{'Value A': 0 10
1 10
Name: value, dtype: int64,
'Value B': 2 20
Name: value, dtype: int64,
'Value C': 3 20
Name: value, dtype: int64}
Update: Lovely suggestion from @QuangHoang - you can build the condition
dictionary with a groupby :
condition = {f"Value {k}": d for k, d in df.groupby("category")["value"]}
Now unpack the condition
via assign:
df.assign(**condition).fillna(0)
category value Value A Value B Value C
0 A 10 10.0 0.0 0.0
1 A 10 10.0 0.0 0.0
2 B 20 0.0 20.0 0.0
3 C 20 0.0 0.0 20.0
Upvotes: 2