Reputation: 821
I have a list of dictionaries with a nested structure, and I need to remove all duplicate values. I'm new to Python and can't solve this. Can anyone help?
My list looks like this:
[
{
"task_id":123,
"results":[
{
"url":"site.com",
"date":"04.18.2019"
},
{
"url":"another_site.com",
"date":"04.18.2019"
},
{
"url":"site1.com",
"date":"04.18.2019"
}
]
},
{
"task_id":456,
"results":[
{
"url":"site3.com",
"date":"04.18.2019"
},
{
"url":"site.com",
"date":"04.18.2019"
}
]
},
{
"task_id":789,
"results":[
{
"url":"site7.com",
"date":"04.18.2019"
},
{
"url":"site9.com",
"date":"04.18.2019"
},
{
"url":"site.com",
"date":"04.18.2019"
}
]
}
]
I need site.com to appear only once. If a url value is duplicated, the later occurrence should be excluded from its dict.
As a result: task 123 keeps 3 dicts in results, task 456 keeps 1 dict in results (site.com excluded), and task 789 keeps 2 dicts in results (site.com excluded).
The desired output should look like this:
[
{
"task_id":123,
"results":[
{
"url":"site.com",
"date":"04.18.2019"
},
{
"url":"another_site.com",
"date":"04.18.2019"
},
{
"url":"site1.com",
"date":"04.18.2019"
}
]
},
{
"task_id":456,
"results":[
{
"url":"site3.com",
"date":"04.18.2019"
}
]
},
{
"task_id":789,
"results":[
{
"url":"site7.com",
"date":"04.18.2019"
},
{
"url":"site9.com",
"date":"04.18.2019"
}
]
}
]
Upvotes: 1
Views: 605
Reputation: 71451
You can use a list comprehension:
d = [{'task_id': 123, 'results': [{'url': 'site.com', 'date': '04.18.2019'}, {'url': 'another_site.com', 'date': '04.18.2019'}, {'url': 'site1.com', 'date': '04.18.2019'}]}, {'task_id': 456, 'results': [{'url': 'site3.com', 'date': '04.18.2019'}, {'url': 'site.com', 'date': '04.18.2019'}]}, {'task_id': 789, 'results': [{'url': 'site7.com', 'date': '04.18.2019'}, {'url': 'site9.com', 'date': '04.18.2019'}, {'url': 'site.com', 'date': '04.18.2019'}]}]
# Keep a result only if it does not appear in the results of any earlier task.
new_d = [
    {**a, 'results': [c for c in a['results']
                      if all(c not in b['results'] for b in d[:i])]}
    for i, a in enumerate(d)
]
Output:
[
{
"task_id": 123,
"results": [
{
"url": "site.com",
"date": "04.18.2019"
},
{
"url": "another_site.com",
"date": "04.18.2019"
},
{
"url": "site1.com",
"date": "04.18.2019"
}
]
},
{
"task_id": 456,
"results": [
{
"url": "site3.com",
"date": "04.18.2019"
}
]
},
{
"task_id": 789,
"results": [
{
"url": "site7.com",
"date": "04.18.2019"
},
{
"url": "site9.com",
"date": "04.18.2019"
}
]
}
]
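One caveat, offered as a hedged aside: the comprehension compares whole result dicts, so two entries with the same url but different date values would both be kept. Below is a minimal sketch of a variant that compares by url only. The function name drop_urls_seen_in_earlier_tasks and the sample data are hypothetical, and it assumes the first task that mentions a url should keep it.

```python
def drop_urls_seen_in_earlier_tasks(tasks):
    """Return a new list; a result is dropped if its url occurs in any earlier task."""
    out = []
    for i, task in enumerate(tasks):
        # Collect every url that appears in the tasks before this one.
        earlier_urls = {r["url"] for prev in tasks[:i] for r in prev["results"]}
        kept = [r for r in task["results"] if r["url"] not in earlier_urls]
        # Copy the task so the input list is left untouched.
        out.append({**task, "results": kept})
    return out


# Hypothetical sample: same url, different dates.
sample = [
    {"task_id": 1, "results": [{"url": "site.com", "date": "04.18.2019"}]},
    {"task_id": 2, "results": [{"url": "site.com", "date": "04.19.2019"},
                               {"url": "other.com", "date": "04.19.2019"}]},
]
cleaned = drop_urls_seen_in_earlier_tasks(sample)
print(cleaned)
```

Like the comprehension, this only looks at earlier tasks, so a url repeated inside a single task's results is not removed.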
Upvotes: 0
Reputation: 1
Try this:
people = {
    1: {'name': 'John'},
    2: {'name': 'Marie'},
    3: {'name': 'Ann'},
    4: {'name': 'John'},
}
print(people)
# Copy an entry only if an equal value has not been kept already.
unique = {}
for key, value in people.items():
    if value not in unique.values():
        unique[key] = value
print(unique)
Upvotes: -1
Reputation: 1236
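Let results be your list.
seen = set()  # urls encountered so far
final = []    # flat list of unique result dicts (task_id grouping is not kept)
for task in results:  # renamed from "dict" to avoid shadowing the built-in
    for res in task["results"]:
        if res["url"] not in seen:
            seen.add(res["url"])
            final.append(res)
print(final)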
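The loop above collects the unique results into one flat list. If the task_id grouping from the question should be kept, the same seen-set idea can rebuild each task with a filtered results list. This is a minimal sketch, not a definitive implementation: the function name dedupe_results is hypothetical, the data is copied from the question, and it assumes the first occurrence of a url wins.

```python
def dedupe_results(tasks):
    """Return new task dicts whose results contain each url at most once overall."""
    seen = set()   # urls already kept in this or an earlier task
    cleaned = []
    for task in tasks:
        kept = []
        for res in task["results"]:
            if res["url"] not in seen:
                seen.add(res["url"])
                kept.append(res)
        # Copy the task and swap in the filtered results list.
        cleaned.append({**task, "results": kept})
    return cleaned


data = [
    {"task_id": 123, "results": [
        {"url": "site.com", "date": "04.18.2019"},
        {"url": "another_site.com", "date": "04.18.2019"},
        {"url": "site1.com", "date": "04.18.2019"},
    ]},
    {"task_id": 456, "results": [
        {"url": "site3.com", "date": "04.18.2019"},
        {"url": "site.com", "date": "04.18.2019"},
    ]},
    {"task_id": 789, "results": [
        {"url": "site7.com", "date": "04.18.2019"},
        {"url": "site9.com", "date": "04.18.2019"},
        {"url": "site.com", "date": "04.18.2019"},
    ]},
]

new_data = dedupe_results(data)
for task in new_data:
    print(task["task_id"], [r["url"] for r in task["results"]])
```

Because membership tests on a set take constant time on average, this makes a single pass over all results, and it also drops a url that is repeated within the same task.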
Upvotes: 3