Nicolas Gervais

Reputation: 36624

Append string in Pandas DataFrame with cumulative count

I have a pd.DataFrame of image names. The names are often repeated, but the repeats are always adjacent. This is what it looks like:

import pandas as pd
from numpy.random import randint

df = pd.DataFrame(sorted(['image_{}'.format(randint(4)) for i in range(10)]),
     columns=['Image Name'])

print(df)
Out[6]: 
  Image Name
0    image_0
1    image_0
2    image_0
3    image_1
4    image_1
5    image_2
6    image_2
7    image_2
8    image_3
9    image_3

Because I will save the images under these names, I want to append a cumulative count to each string, like so:

Out[7]: 
  Image Name
0    image_0_1
1    image_0_2
2    image_0_3
3    image_1_1
4    image_1_2
5    image_2_1
6    image_2_2
7    image_2_3
8    image_3_1
9    image_3_2

How can I proceed? I'm guessing some combination of groupby and cumcount?

Upvotes: 2

Views: 351

Answers (2)

Alexander

Reputation: 109546

# Within each group of identical names, append a 1-based position to the name.
df['new_name'] = (
    df
    .groupby('Image Name')['Image Name']
    .transform(lambda images: [image + f'_{n + 1}' for n, image in enumerate(images)])
)
>>> df
  Image Name   new_name
0    image_0  image_0_1
1    image_0  image_0_2
2    image_0  image_0_3
3    image_1  image_1_1
4    image_1  image_1_2
5    image_2  image_2_1
6    image_2  image_2_2
7    image_2  image_2_3
8    image_3  image_3_1
9    image_3  image_3_2

Upvotes: 3

Parfait

Reputation: 107652

Consider groupby().cumcount(), concatenated to the original string. With this approach the row order does not matter, so duplicates need not be adjacent:

df['Image Name'] = (df['Image Name'] + '_' + 
                      (df.groupby('Image Name').cumcount() + 1).astype(str)
                   )
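
To illustrate the point about row order, here is a minimal sketch on a small hand-written frame in which the duplicates are deliberately not adjacent. The sample values and the names `counter` and `new_name` are assumptions for illustration, not taken from the question. The result is written to a separate column so the original names stay visible.

```python
import pandas as pd

# Hypothetical sample data: repeated names are interleaved, not adjacent.
df = pd.DataFrame(
    {'Image Name': ['image_0', 'image_1', 'image_0',
                    'image_2', 'image_1', 'image_0']}
)

# cumcount() numbers the rows of each group from 0, in order of appearance,
# so adding 1 gives a 1-based counter per distinct name.
counter = df.groupby('Image Name').cumcount() + 1

# Build the suffixed names with vectorized string concatenation.
df['new_name'] = df['Image Name'] + '_' + counter.astype(str)

print(df)
```

Each name gets its own running counter regardless of where its rows sit in the frame, so sorting beforehand should not be required.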

Upvotes: 5
