SK13
SK13

Reputation: 13

Python conditional transform


companies_id     transaction_month     count    
    0                2020-10-01          3
    1                2020-10-01          5
    1                2020-11-01          5
    1                2020-12-01         18
    1                2021-01-01          8

I want the result to be like

companies_id      transaction_month     count       first_month
     0                2020-10-01          3           
     1                2020-10-01          5          2020-10-01 
     1                2020-11-01          5          2020-10-01 
     1                2020-12-01         18          2020-10-01 
     1                2021-01-01          8          2020-10-01 

This is my data set I want to add a new column called "first month" that should contain the value from transaction month column where the corresponding count is >=5.

for example : In case of companies_id 1:

first 5 or more transaction occurred on 2020-10-01 therefore "first month" column should contain 2020-10-01 throughout i.e to all rows with companies_id as 1.

Upvotes: 0

Views: 134

Answers (1)

jezrael
jezrael

Reputation: 863521

Use Series.where for replace transaction_month to NaN if not >=5 count and then use GroupBy.transform with GroupBy.first for first non missing values per groups to new column:

df['transaction_month'] = pd.to_datetime(df['transaction_month'])

print (df['transaction_month'].where(df['count'] >= 5))
0          NaT
1   2020-10-01
2   2020-11-01
3   2020-12-01
4   2021-01-01
Name: transaction_month, dtype: datetime64[ns]

df['first_month'] = (df['transaction_month'].where(df['count'] >= 5)
                                            .groupby(df['companies_id'])
                                            .transform('first'))

print (df)
   companies_id transaction_month  count first_month
0             0        2020-10-01      3         NaT
1             1        2020-10-01      5  2020-10-01
2             1        2020-11-01      5  2020-10-01
3             1        2020-12-01     18  2020-10-01
4             1        2021-01-01      8  2020-10-01

Upvotes: 1

Related Questions