Reputation: 5
I have a DataFrame like this:
  category  uid sales_1 sales_2
0  Grocery    1      XX      XX
1  Grocery    2      XX      ZZ
2   Sports    3      XX      ZZ
3  Grocery    4      ZZ      XX
4   Beauty    5      ZZ      ZZ
5   Beauty    6      ZZ      ZZ
6   Sports    7      ZZ      XX
7  Grocery    8      ZZ      XX
...
I need to compare the sales_1 column with the sales_2 column. The result of the comparison should go into two new columns, first and second. If sales_1 == sales_2, the values in these two new columns should be 'no changes' and 'OK'. If sales_1 != sales_2, the values should be 'changed' and 'gap'. In the end I would like to have the following DataFrame:
  category  uid sales_1 sales_2       first second
0  Grocery    1      XX      XX  no changes     OK
1  Grocery    2      XX      ZZ     changed    gap
2   Sports    3      XX      ZZ     changed    gap
3  Grocery    4      ZZ      XX     changed    gap
4   Beauty    5      ZZ      ZZ  no changes     OK
5   Beauty    6      ZZ      ZZ  no changes     OK
6   Sports    7      ZZ      XX     changed    gap
7  Grocery    8      ZZ      XX     changed    gap
...
I would really appreciate any suggestion.
Upvotes: 0
Views: 51
Reputation: 326
You can use the where() function from NumPy (imported here as np):
import numpy as np
df['first'] = np.where(df.sales_1 == df.sales_2, 'no changes', 'changed')
df['second'] = np.where(df.sales_1 == df.sales_2, 'OK', 'gap')
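For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the first few rows of the question's DataFrame and applies the same np.where() calls. The sample data and the intermediate variable name same are assumptions for illustration; the comparison is computed once and reused for both columns.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the first rows of the question's DataFrame (assumed)
df = pd.DataFrame(
    {
        "category": ["Grocery", "Grocery", "Sports", "Grocery", "Beauty"],
        "uid": [1, 2, 3, 4, 5],
        "sales_1": ["XX", "XX", "XX", "ZZ", "ZZ"],
        "sales_2": ["XX", "ZZ", "ZZ", "XX", "ZZ"],
    }
)

# Element-wise comparison, computed once and reused for both new columns
same = df["sales_1"] == df["sales_2"]

# np.where(condition, value_if_true, value_if_false)
df["first"] = np.where(same, "no changes", "changed")
df["second"] = np.where(same, "OK", "gap")

print(df)
```

Note that a missing value (NaN) never compares equal to anything, including another NaN, so rows with missing sales would come out as 'changed' and 'gap' with this approach.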
Upvotes: 1
Reputation: 448
You can first assign a default value to the first and second columns, and then overwrite it with a boolean mask in the rows where the sales values differ.
import pandas as pd

df = pd.DataFrame(
    {
        'category': ['Grocery', 'Sports', 'Beauty'],
        'sales_1': ['XX', 'ZZ', 'XX'],
        'sales_2': ['XX', 'XY', 'ZZ'],
    }
)

changed_sales = df['sales_1'] != df['sales_2']

df['first'] = 'no changes'
df.loc[changed_sales, 'first'] = 'changed'

df['second'] = 'OK'
df.loc[changed_sales, 'second'] = 'gap'

print(df)
Output
  category sales_1 sales_2       first second
0  Grocery      XX      XX  no changes     OK
1   Sports      ZZ      XY     changed    gap
2   Beauty      XX      ZZ     changed    gap
Upvotes: 1
Reputation: 1788
You can use a list comprehension for each column:
df['first'] = ["no changes" if s1 == s2 else "changed" for s1, s2 in zip(df['sales_1'], df['sales_2'])]
df['second'] = ["OK" if s1 == s2 else "gap" for s1, s2 in zip(df['sales_1'], df['sales_2'])]
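The two comprehensions above walk the same pair of columns twice. As a possible variation, here is a sketch that does a single pass and looks up both labels at once. The labels dictionary and the sample data are hypothetical, introduced only for illustration.

```python
import pandas as pd

# Small sample frame (assumed data, same column names as the question)
df = pd.DataFrame(
    {
        "sales_1": ["XX", "XX", "ZZ"],
        "sales_2": ["XX", "ZZ", "ZZ"],
    }
)

# Hypothetical lookup: comparison result -> (first, second) labels
labels = {True: ("no changes", "OK"), False: ("changed", "gap")}

# One pass over both columns, producing a (first, second) tuple per row
pairs = [labels[s1 == s2] for s1, s2 in zip(df["sales_1"], df["sales_2"])]

# Split the list of tuples into two parallel sequences, one per new column
first, second = zip(*pairs)
df["first"] = list(first)
df["second"] = list(second)

print(df)
```

This keeps the mapping between the comparison result and the two labels in one place, which may be easier to maintain if more output columns are added later. The unpacking of zip(*pairs) assumes the DataFrame has at least one row.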
Upvotes: 0