Ryan Davies

Reputation: 616

Getting the max index value for each group within a list

I've been trying to figure out the most elegant way to find the maximum index for each ID within a list. The data I receive sometimes contains duplicate IDs, and I've been told to take the most recent occurrence of each ID, since that is the most up-to-date value in the list.

I've managed to implement this using Pandas, which works, but I feel like there must be a better way that doesn't require Pandas.

import pandas as pd

list = ['A', 'A', 'B', 'C', 'C']
df = pd.DataFrame({'id': list})
df['idx'] = df.index
df = df.groupby('id').agg({'idx':'max'})
df = df.reset_index()['idx'].to_list()

print(df)

I was thinking perhaps I could do a lead/lag type function which would look at the previous ID value and if the current ID doesn't match the previous value then store the index of the previous ID.
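
A minimal sketch of that lead/lag idea, assuming duplicate IDs always sit next to each other (the names `ids` and `last_indices` are illustrative only). If the same ID can reappear later in the list, this would report more than one index for it:

```python
ids = ['A', 'A', 'B', 'C', 'C']

last_indices = []
for i in range(1, len(ids)):
    # A change of ID means the previous position closed a run of equal IDs
    if ids[i] != ids[i - 1]:
        last_indices.append(i - 1)

# The final run always ends at the last position (if the list is not empty)
if ids:
    last_indices.append(len(ids) - 1)

print(last_indices)
```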

Upvotes: 1

Views: 218

Answers (4)

RoadRunner

Reputation: 26315

Adding to other solutions:

>>> lst = ['A', 'A', 'B', 'C', 'C']
>>> dict(map(reversed, enumerate(lst)))
{'A': 1, 'B': 2, 'C': 4}

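This uses map to apply the reversed function to each (idx, element) pair produced by enumerate, flipping it to (element, idx). Because dict keeps the last value it sees for a repeated key, the result is an {element: idx} dictionary holding the highest index for each element.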
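
To see why the highest index wins, the one-liner can be unrolled into its intermediate steps. This is only an illustrative sketch; the names `pairs` and `last_index` are not part of the original answer:

```python
lst = ['A', 'A', 'B', 'C', 'C']

# enumerate yields (index, element); reversed flips each pair to (element, index)
pairs = [tuple(reversed(p)) for p in enumerate(lst)]
print(pairs)

# dict() keeps the last value given for a repeated key,
# so for each element only its highest index survives
last_index = dict(pairs)
print(last_index)
```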

Upvotes: 2

Serge Ballesta

Reputation: 148890

You can use a simple comprehension here:

lst = ['A', 'A', 'B', 'C', 'C']
{j: i for i,j in enumerate(lst)}

gives:

{'A': 1, 'B': 2, 'C': 4}

NB: please never use list as a variable name, because it shadows the built-in list type.

Upvotes: 2

mark pedersen

Reputation: 255

def maxIndex(l):
    rDict=dict()
    for x in range(len(l)):
        rDict[l[x]]=x
    return rDict

This takes a list and returns a dictionary in which each key is an entry from the list and each value is the greatest index at which that entry appeared.

You can then query the dictionary to get the largest index for any given entry.

Output of maxIndex(list):

{'A': 1, 'C': 4, 'B': 2}
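
Querying the result might look like the sketch below. It restates the function with enumerate under a hypothetical name, `max_index`, and uses `dict.get` so that an ID missing from the data returns None instead of raising KeyError:

```python
def max_index(items):
    """Map each entry to the greatest index at which it appears."""
    result = {}
    for i, item in enumerate(items):
        result[item] = i  # later occurrences overwrite earlier ones
    return result

data = ['A', 'A', 'B', 'C', 'C']
lookup = max_index(data)

print(lookup['C'])      # largest index of 'C'
print(lookup.get('Z'))  # 'Z' never appears, so this is None
```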

Upvotes: 2

Błotosmętek

Reputation: 12927

max_index = {}
data = ['A', 'A', 'B', 'C', 'C'] # don't use name "list" for variables
for i, e in enumerate(data):
    max_index[e] = i
print(max_index)

Output:

{'A': 1, 'B': 2, 'C': 4}
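
If a plain list of indices is wanted, as in the Pandas version from the question, one possible follow-up is to read the values back out ordered by ID. This is a sketch that assumes the IDs are sortable; `indices` is an illustrative name:

```python
max_index = {}
data = ['A', 'A', 'B', 'C', 'C']
for i, e in enumerate(data):
    max_index[e] = i

# Order by ID, mirroring the sorted group keys that groupby produces by default
indices = [max_index[key] for key in sorted(max_index)]
print(indices)
```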

Upvotes: 2
