Martim Passos

Reputation: 317

Add index to duplicated items in Pandas Series

I wrote the following function to add indexes to duplicates in a series:

(["foo", "foo", "foo", "bar", "bar"] becomes ["foo 1", "foo 2", "foo 3", "bar 1", "bar 2"])

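def indexer(series):
  # Collect one list per distinct title, holding every occurrence of it.
  all_labels = []
  for title in set(series):  # set order is arbitrary, so the original order is lost
    label = []
    i = 0
    while i < len(series): 
      if title == series.iloc[i]:
        label.append(title)
      i += 1
    all_labels.append(label)
  # Number the titles that occur more than once; leave single titles as they are.
  final = []
  for item in all_labels:
    if len(item) > 1:
      for i, label in enumerate(item):
        final.append(label + " " + str(i+1))
    else:
      final.append(item[0])
  return final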

There is obviously a better and cleaner way to do this, probably using Pandas groupby and agg (although I'm not sure how they behave with a single series instead of df). Would someone please shed some light on how to do it? Thanks

Upvotes: 1

Views: 741

Answers (2)

AtanuCSE

Reputation: 8940

If values that appear only once need to be left unchanged, for example with this input:

['foo', 'foo', 'foo', 'bar', 'bar', 'John']

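import pandas as pd

df = pd.Series(['foo', 'foo', 'foo', 'bar', 'bar', 'John'])  # a Series, despite the name
mylist = list(df)
# Number each value by how often it has appeared so far; leave unique values alone.
m = map(lambda x: x[1]+ " " + str(mylist[:x[0]].count(x[1]) + 1) if mylist.count(x[1]) > 1 else x[1], enumerate(mylist))
m = list(m)
df = pd.Series(m)
df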

Output:

0    foo 1
1    foo 2
2    foo 3
3    bar 1
4    bar 2
5     John
dtype: object

John appears only once, so it didn't get a number. Hurray!
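The lambda above calls count on the list for every element, so the work grows quadratically with the length of the list. For longer inputs, the same result can probably be had in a single pass. This is only a rough sketch using collections.Counter; the function name number_repeats is made up for illustration, and it assumes all values are strings.

```python
from collections import Counter

def number_repeats(values):
    """Append a running index to values that occur more than once."""
    totals = Counter(values)   # total occurrences of each value
    seen = Counter()           # occurrences encountered so far
    out = []
    for v in values:
        if totals[v] > 1:
            seen[v] += 1
            out.append(f"{v} {seen[v]}")
        else:
            out.append(v)      # unique values are left unchanged
    return out

print(number_repeats(['foo', 'foo', 'foo', 'bar', 'bar', 'John']))
```

The returned list can be wrapped in pd.Series(...) in the same way as the list built from the map above.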

Upvotes: 2

Dan

Reputation: 45752

If it's a DataFrame, you can use groupby with cumcount to get a cumulative count within each group, which is the label you want to append to each string. Note that the groups don't have to be contiguous:

import pandas as pd

df = pd.DataFrame(["foo", "foo", "bar", "bar", "foo"], columns=["baz"])
# cumcount() numbers the rows within each group starting at 0, so add 1
labels = df.groupby("baz").cumcount() + 1
df["baz"] + " " + labels.astype(str)

which results in

0    foo 1
1    foo 2
2    bar 1
3    bar 2
4    foo 3
dtype: object
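Since the question asks about a single Series rather than a DataFrame, the same idea should carry over by grouping the Series by its own values. A minimal sketch, assuming the Series holds only strings with no missing values (the variable names s and result are just for illustration):

```python
import pandas as pd

s = pd.Series(["foo", "foo", "bar", "bar", "foo"])

# Group the Series by its own values; cumcount() numbers the rows
# within each group starting at 0, so add 1 to start at 1.
labels = s.groupby(s).cumcount() + 1
result = s + " " + labels.astype(str)

print(result.tolist())
```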

However, this will also add the 1 label to any unique values. Did you want those to remain unchanged? I assumed not, since you start the others at 1 instead of leaving the first in each group unchanged.
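If unique values should stay unchanged after all, one possible way is to number everything with cumcount and then keep the numbered string only where the value is actually duplicated. This is a sketch rather than a definitive solution: the helper name number_duplicates is hypothetical, and it assumes a Series of strings with no missing values.

```python
import pandas as pd

def number_duplicates(s):
    """Append a 1-based running index to values that occur more than once."""
    counts = s.groupby(s).cumcount() + 1
    numbered = s + " " + counts.astype(str)
    # duplicated(keep=False) marks every occurrence of a repeated value
    is_dup = s.duplicated(keep=False)
    # keep the numbered string where duplicated, otherwise the original value
    return numbered.where(is_dup, s)

s = pd.Series(["foo", "foo", "foo", "bar", "bar", "John"])
print(number_duplicates(s).tolist())
```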

Upvotes: 3
