Reputation: 91
I have a data frame like this:
       sku     new-sku
0  FAT-001     FAT-001
1  FAT-001  FAT-001-01
2  FAT-001  FAT-001-02
3  FAT-002     FAT-002
4  FAT-002  FAT-002-01
5
6
7  FAT-003     FAT-003
8
9
Here is my code:
groups = df.groupby('sku').cumcount()
df['new'] = df['sku'] + ('-' + groups.astype('string').str.zfill(2)).mask(groups.eq(0), '')
My expected result looks like this:
       sku      new-sku
0  FAT-001      FAT-001
1  FAT-001   FAT-001-01
2  FAT-001   FAT-001-02
3  FAT-002      FAT-002
4  FAT-002   FAT-002-01
5           FAT-null-01
6           FAT-null-02
7  FAT-003      FAT-003
8           FAT-null-03
9           FAT-null-04
The suffix should increment by 1 for every new null row.
The constructor:
{'sku': {0: 'FAT-001', 1: ' ', 2: ' ', 3: 'FAT-002', 4: 'FAT-002', 5: ' ', 6: ' ', 7: 'FAT-003', 8: 'FAT-003', 9: 'FAT-004'}}
Upvotes: 0
Views: 92
Reputation:
Building on my answer to your previous question, we could mask the
whitespace rows when we create groups using groupby.cumcount
and shift their counts up by one, so that they are numbered from 01 instead of getting an empty suffix:
groups = df.groupby('sku').cumcount()
groups = groups.mask(df['sku'].eq(' '), groups+1)
df['new-sku'] = df['sku'].replace(' ', 'FAT-null') + ('-' + groups.astype('string').str.zfill(2)).mask(groups.eq(0), '')
Output:
   ID      sku      new-sku
0   1  FAT-001      FAT-001
1   2           FAT-null-01
2   3           FAT-null-02
3   4  FAT-002      FAT-002
4   5  FAT-002   FAT-002-01
5   6           FAT-null-03
6   7           FAT-null-04
7   8  FAT-003      FAT-003
8   9  FAT-003   FAT-003-01
9  10  FAT-004      FAT-004
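For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch that builds the frame from the constructor in the question and applies the same three steps. The intermediate names `is_blank` and `suffix` are only there for readability and are not part of the original snippet; the sketch also assumes the empty rows hold a single space `' '`, as in the constructor, and has no `ID` column.

```python
import pandas as pd

# Frame built from the constructor posted in the question.
df = pd.DataFrame({'sku': {0: 'FAT-001', 1: ' ', 2: ' ', 3: 'FAT-002', 4: 'FAT-002',
                           5: ' ', 6: ' ', 7: 'FAT-003', 8: 'FAT-003', 9: 'FAT-004'}})

# Running count of each sku value: 0 for the first occurrence, 1 for the second, ...
groups = df.groupby('sku').cumcount()

# Blank rows all fall into the ' ' group; shift their count by one so they start at 1.
is_blank = df['sku'].eq(' ')  # hypothetical helper name
groups = groups.mask(is_blank, groups + 1)

# Zero-padded suffix such as '-01'; rows with a count of 0 get no suffix.
suffix = ('-' + groups.astype('string').str.zfill(2)).mask(groups.eq(0), '')

df['new-sku'] = df['sku'].replace(' ', 'FAT-null') + suffix
print(df)
```

If the empty rows were real missing values (NaN) rather than `' '`, one option would be to call `df['sku'].fillna(' ')` first so that the same logic applies.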
Upvotes: 2