Reputation: 3161
I have a column in a dataframe that I need to update whenever another column is empty. The column is 'subscriberkey' and it already has values in it. I need to replace these values with a string + number. I do not want to create a duplicate column.
The new value needs to be unique, which is why I initially thought that appending a number to the string would be the way to go.
Age  Email              Subscriberkey
10   [email protected]  giririfndfieir
23                      kfkkfkfffrrrc
64   [email protected]  ifiririieiriei
For the second row, where the email is missing, I want the subscriberkey to be string + number + string. So far, I have tried the following:
df.loc[df.Email == NULL, 'subscriberkey']= 'string'+.cumcount()+1+'string'
I would appreciate pointers on how best to achieve this.
Upvotes: 4
Views: 2179
Reputation: 294328
Consider this df:
import pandas as pd
df = pd.DataFrame(dict(EMAIL_ACQ_DT=['key1', None, 'key2', None, 'ke3', 'key4', None, None]))
print(df)
  EMAIL_ACQ_DT
0         key1
1         None
2         key2
3         None
4          ke3
5         key4
6         None
7         None
fill_keys = df.groupby(df.EMAIL_ACQ_DT.isnull()).cumcount().apply('key{}_'.format)
df['subscriberkey'] = df.EMAIL_ACQ_DT.fillna(fill_keys)
print(df)
  EMAIL_ACQ_DT subscriberkey
0         key1          key1
1         None         key0_
2         key2          key2
3         None         key1_
4          ke3           ke3
5         key4          key4
6         None         key2_
7         None         key3_
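The same cumcount idea can probably be applied to the question's layout, overwriting the existing Subscriberkey column rather than adding a new one. This is only a sketch: the sample frame, the example.com addresses, and the names missing, counter, and new_keys are assumptions made up for illustration, and the 'key{}_' pattern is just the one used above.

```python
import pandas as pd

# Hypothetical sample data mirroring the question's layout (addresses are made up).
df = pd.DataFrame({
    "Age": [10, 23, 64],
    "Email": ["a@example.com", None, "c@example.com"],
    "Subscriberkey": ["giririfndfieir", "kfkkfkfffrrrc", "ifiririieiriei"],
})

# True where the email is missing.
missing = df["Email"].isnull()

# cumcount numbers the rows inside each group (0, 1, 2, ...).
# Both groups get numbered, but only the missing-email rows are used below.
counter = df.groupby(missing).cumcount()
new_keys = "key" + counter.astype(str) + "_"

# Overwrite the key in place only where the email is missing.
df["Subscriberkey"] = df["Subscriberkey"].mask(missing, new_keys)
print(df)
```

Using mask keeps every existing key untouched and only swaps in the generated value where the condition holds, so no second column is needed.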
Upvotes: 0
Reputation: 214967
You may try something like this:
nullCond = df.Email.isnull()
# or nullCond = (df.Email == "") if those are empty strings
df.loc[nullCond, 'Subscriberkey'] = "string" + nullCond[nullCond].cumsum().astype(str) + "string"
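For reference, here is a self-contained sketch of the same approach that treats both missing values and empty strings as "empty". The sample frame, the example.com addresses, the extra fourth row, and the names null_cond and numbers are assumptions added for illustration; "string" is the placeholder from the question.

```python
import pandas as pd

# Hypothetical sample data based on the question, plus one row with an empty-string email.
df = pd.DataFrame({
    "Age": [10, 23, 64, 31],
    "Email": ["a@example.com", None, "c@example.com", ""],
    "Subscriberkey": ["giririfndfieir", "kfkkfkfffrrrc", "ifiririieiriei", "zzzz"],
})

# Treat both NaN/None and "" as an empty email.
null_cond = df["Email"].isnull() | (df["Email"] == "")

# Keep only the True entries, then a running sum gives 1, 2, 3, ...
numbers = null_cond[null_cond].cumsum().astype(str)

# Assignment aligns on the index, so only the rows with an empty email change.
df.loc[null_cond, "Subscriberkey"] = "string" + numbers + "string"
print(df)
```

Because the counter is built from the filtered condition, the numbers are consecutive across the affected rows, which keeps the generated keys distinct from one another.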
Upvotes: 4