Reputation: 312
I have a JSON-like dictionary like the one below:
a = {"infinity_war":["Stark", "Hulk", "Rogers", "Thanos"],
"end_game":["Stark", "Dr.Strange", "Peter"]}
Since the name "Stark" appears more than once across the whole JSON, I need to keep only its first occurrence and remove the others. I tried using pandas, but it requires all the lists to have the same length. Is there another way? The result I need is:
a = {"infinity_war":["Stark", "Hulk", "Rogers", "Thanos"],
"end_game":["Dr.Strange", "Peter"]}
Upvotes: 1
Views: 37
Reputation: 260965
You can use a simple loop and a set to keep track of the seen elements:
seen = set()
b = {}
for k, l in a.items():
    b[k] = [x for x in l if not (x in seen or seen.add(x))]
output:
{'infinity_war': ['Stark', 'Hulk', 'Rogers', 'Thanos'],
'end_game': ['Dr.Strange', 'Peter']}
How it works:
For each key/list pair, iterate over the elements of the list. If an element is already in the seen set, skip it; otherwise add it to the seen set and include it in the new list.
seen.add(x) is always falsy because set.add returns None, so (x in seen or seen.add(x)) has the same truth value as x in seen (evaluated before the element is added), which we invert with not.
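If the `or seen.add(x)` trick feels too terse, the same logic can be written as an explicit loop. This is only a minimal sketch: the function name `dedupe_across_lists` is hypothetical and not part of the answer above, and it assumes Python 3.7+, where dictionaries keep insertion order, so "first occurrence" means first in the order the keys were defined.

```python
def dedupe_across_lists(data):
    """Keep only the first occurrence of each value across all lists.

    Hypothetical helper name, used here for illustration only.
    """
    seen = set()   # values already kept in some earlier position
    result = {}
    for key, values in data.items():
        kept = []
        for value in values:
            if value not in seen:
                seen.add(value)      # remember it for later lists
                kept.append(value)   # keep this first occurrence
        result[key] = kept
    return result


a = {"infinity_war": ["Stark", "Hulk", "Rogers", "Thanos"],
     "end_game": ["Stark", "Dr.Strange", "Peter"]}
b = dedupe_across_lists(a)
```

Like the comprehension version, this also drops duplicates within a single list, not only across lists, and it leaves the original dictionary `a` unchanged.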
Upvotes: 1