0xsegfault

Reputation: 3161

Adding String + Auto Increment - pandas, python

I have a column in a dataframe that I need to update whenever another column is empty. The column is 'Subscriberkey' and it already has values in it. For the affected rows, I need to replace the existing value with a string + number. I do not want to create a duplicate column.

The value needs to be unique, which is why I initially thought that combining the string with a number would be the way to go.

Age Email            Subscriberkey
10  [email protected]  giririfndfieir
23                   kfkkfkfffrrrc
64  [email protected]   ifiririieiriei    

For the second row, where Email is empty, I want the Subscriberkey to be string+number+string. So far, I have tried the following:

 df.loc[df.Email == NULL, 'subscriberkey']= 'string'+.cumcount()+1+'string'

I would appreciate pointers on how best to achieve this.

Upvotes: 4

Views: 2179

Answers (2)

piRSquared

Reputation: 294328

Consider the following df:

import pandas as pd

df = pd.DataFrame(dict(EMAIL_ACQ_DT=['key1', None, 'key2', None, 'ke3', 'key4', None, None]))
print(df)

  EMAIL_ACQ_DT
0         key1
1         None
2         key2
3         None
4          ke3
5         key4
6         None
7         None

fill_keys = df.groupby(df.EMAIL_ACQ_DT.isnull()).cumcount().apply('key{}_'.format)
df['subscriberkey'] = df.EMAIL_ACQ_DT.fillna(fill_keys)
print(df)

  EMAIL_ACQ_DT subscriberkey
0         key1          key1
1         None         key0_
2         key2          key2
3         None         key1_
4          ke3           ke3
5         key4          key4
6         None         key2_
7         None         key3_
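
The example above writes to a new column. Since the question asks for the existing key column to be updated in place, here is a minimal sketch of the same cumcount idea combined with Series.mask. The column names follow the question; the email addresses and the short keys k1 to k4 are made-up placeholders, and it assumes empty emails are stored as None/NaN.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Email": ["a@example.com", None, "b@example.com", None],
    "Subscriberkey": ["k1", "k2", "k3", "k4"],
})

is_missing = df["Email"].isnull()

# cumcount() numbers rows within each group starting at 0,
# so the rows with a missing email get 0, 1, 2, ...
fill_keys = df.groupby(is_missing).cumcount().apply("key{}_".format)

# mask() replaces values where the condition is True and keeps the rest.
df["Subscriberkey"] = df["Subscriberkey"].mask(is_missing, fill_keys)
print(df)
```

If uniqueness matters, checking df["Subscriberkey"].is_unique afterwards is a cheap safeguard, since a generated key could in principle collide with an existing one.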

Upvotes: 0

akuiper

Reputation: 214967

You may try something like this:

nullCond = df.Email.isnull()    
# or nullCond = (df.Email == "") if those are empty strings

df.loc[nullCond, 'Subscriberkey'] = "string" + nullCond[nullCond].cumsum().astype(str) + "string"
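
For anyone who wants to run this end to end, here is a self-contained sketch on a frame modelled on the question. The email addresses are placeholders, "string" is the question's own stand-in for the real prefix and suffix, and treating whitespace-only emails as empty is an extra assumption on my part.

```python
import pandas as pd

# Sample frame modelled on the question; the email addresses are placeholders.
df = pd.DataFrame({
    "Age": [10, 23, 64],
    "Email": ["a@example.com", None, "b@example.com"],
    "Subscriberkey": ["giririfndfieir", "kfkkfkfffrrrc", "ifiririieiriei"],
})

# Flag rows with no email: missing values and empty/whitespace-only strings.
null_cond = df["Email"].fillna("").str.strip().eq("")

# Running count over the flagged rows only: 1, 2, 3, ...
counter = null_cond[null_cond].astype(int).cumsum().astype(str)

# Assignment through .loc aligns on the index, so only flagged rows change.
df.loc[null_cond, "Subscriberkey"] = "string" + counter + "string"
print(df)
```

Only the row with the missing email is rewritten; the other keys are left untouched, and no extra column is created.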


Upvotes: 4
