Reputation: 185
I have a dict that contains keys like this:
237870a/
237870b/
237870c/
115460a/
115460b/
115460c/
115460d/
229898/
212365a/
109678/
I need to iterate over this list of keys and pull out certain items:
For items that share the same numeric prefix and have an alphabetic character at the end, I need the item with the highest character, i.e. in this case 237870c, 115460d, and 212365a.
Any other item with a unique number and no trailing alphabetic character, i.e. 229898 and 109678.
So, my result should be:
237870c/
115460d/
229898/
212365a/
109678/
Sorry, I don't have any code to show, as I'm really not sure how to even start writing this...
Upvotes: 1
Views: 75
Reputation: 20718
First of all, this has nothing to do with dictionaries: as you said yourself, you’re operating on a list of keys. The origin of the list isn’t important.
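You can use itertools.groupby for this, with a clever key function. Because itertools.groupby only merges consecutive items that have the same key, we first need to sort the keys so that items with the same prefix end up next to each other: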
import itertools

keys = sorted(keys)
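To see why the sort matters, here is a minimal sketch with made-up values (not the keys from the question; first_char is just an illustrative helper). Without sorting, equal keys that are not adjacent end up in separate groups:

```python
import itertools

# Hypothetical sample data, chosen only to illustrate the grouping behaviour.
values = ["a1", "b1", "a2"]

def first_char(s):
    # Group by the first character of each string.
    return s[0]

# groupby only merges *consecutive* items with the same key.
unsorted_groups = [(k, list(g)) for k, g in itertools.groupby(values, first_char)]
sorted_groups = [(k, list(g)) for k, g in itertools.groupby(sorted(values), first_char)]

print(unsorted_groups)  # [('a', ['a1']), ('b', ['b1']), ('a', ['a2'])]
print(sorted_groups)    # [('a', ['a1', 'a2']), ('b', ['b1'])]
```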
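Then we have to think about a key function. It must be designed so that only the numeric prefix is used for grouping. Note that every key ends in a slash, so the last character is never alphabetic; the slash has to be ignored before checking for a trailing letter:
def keyfunc(item):
    stem = item.rstrip('/')      # ignore the trailing slash
    if stem[-1:].isalpha():
        return stem[:-1]         # drop the letter suffix
    return stem
This ignores the trailing slash and then strips the last character if it is alphabetic, so that itertools.groupby won’t take the letter into account when grouping. We’ll then take the last element of each sorted group, which is the one with the highest alphabetic character.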
Now we can apply groupby to obtain a list of items as you need:
items = [sorted(subitems)[-1]
         for _, subitems
         in itertools.groupby(keys, keyfunc)]
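See it in action (the keys in this session are left unsorted; that works here only because keys with the same prefix are already adjacent, and it keeps the original order):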
>>> # output formatting and indentation by me
...
>>> keys
['237870a/', '237870b/', '237870c/', '115460a/',
'115460b/', '115460c/', '115460d/', '229898/',
'212365a/', '109678/']
>>> import itertools
>>> def keyfunc(item):
...     stem = item.rstrip('/')
...     if stem[-1:].isalpha():
...         return stem[:-1]
...     return stem
...
>>> items = [sorted(subitems)[-1]
...          for _, subitems
...          in itertools.groupby(keys, keyfunc)]
>>> items
['237870c/', '115460d/', '229898/', '212365a/', '109678/']
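If you want to keep the original order and not depend on sorting at all, a plain dict works too. This is only a sketch under a few assumptions: every key looks like the ones in the question (digits, at most one trailing letter, then a slash), and you are on Python 3.7+ where dicts preserve insertion order. The names group_prefix and latest_per_prefix are made up for this example:

```python
def group_prefix(key):
    # Ignore the trailing slash, then drop a single trailing letter if present.
    stem = key.rstrip("/")
    return stem[:-1] if stem[-1:].isalpha() else stem

def latest_per_prefix(keys):
    # Maps each numeric prefix to the highest key seen so far for that prefix.
    best = {}
    for key in keys:
        prefix = group_prefix(key)
        if prefix not in best or key > best[prefix]:
            best[prefix] = key
    # Insertion order follows the first appearance of each prefix.
    return list(best.values())

keys = ['237870a/', '237870b/', '237870c/', '115460a/',
        '115460b/', '115460c/', '115460d/', '229898/',
        '212365a/', '109678/']

result = latest_per_prefix(keys)
print(result)
# ['237870c/', '115460d/', '229898/', '212365a/', '109678/']
```

This makes a single pass over the keys and does not care whether keys with the same prefix are adjacent.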
Upvotes: 2