Paul Fitzgerald

Reputation: 12129

Removing JSON items from an array if a value is duplicated (Python)

I am incredibly new to Python.

I have an array full of json objects. Some of the json objects contain duplicated values. The array looks like this:

 [{"id":"1","name":"Paul","age":"21"},
  {"id":"2","name":"Peter","age":"22"},
  {"id":"3","name":"Paul","age":"23"}]

What I am trying to do is remove an item if its name is the same as that of another JSON object, and leave the first one in the array.

So in this case I should be left with

  [{"id":"1"."name":"Paul","age":"21"},
  {"id":"2","name":"Peter","age":"22"}]

The code I currently have can be seen below and is largely based on this answer:

import json
ds = json.loads('python.json') #this file contains the json
unique_stuff = { each['name'] : each for each in ds }.values()

all_ids = [ each['name'] for each in ds ]
unique_stuff = [ ds[ all_ids.index(text) ] for text in set(texts) ]

print unique_stuff

I am not even sure that the line ds = json.loads('python.json') is working, because when I try to print ds nothing shows up in the console.

Upvotes: 0

Views: 5319

Answers (3)

pkacprzak

Reputation: 5629

First of all, your JSON snippet is invalid: a dot is used instead of a comma to separate some keys.

You can solve your problem using a dictionary with names as keys:

import json

with open('python.json') as fp:
    ds = json.load(fp)  # parse the file contents into a list of dicts

mem = {}  # maps each name to the first record seen with that name

for record in ds:
    name = record["name"]
    if name not in mem:
        mem[name] = record

print(list(mem.values()))
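
The code above uses json.load rather than the json.loads call from the question. As a rough illustration of the difference, here is a minimal sketch; the sample string raw and the use of io.StringIO as a stand-in for the opened file are assumptions made so the snippet runs without python.json on disk:

```python
import io
import json

# Sample JSON text (an assumption; stands in for the contents of python.json).
raw = '[{"id":"1","name":"Paul","age":"21"}, {"id":"2","name":"Peter","age":"22"}]'

# json.loads parses a string that already contains JSON text.
from_string = json.loads(raw)

# json.load reads JSON from a file-like object; StringIO stands in for open('python.json').
from_file = json.load(io.StringIO(raw))

# Passing a file name to json.loads fails, because 'python.json' is not JSON text.
try:
    json.loads('python.json')
    filename_parsed = True
except ValueError:
    filename_parsed = False
```

Both calls give the same list of dictionaries, while the file name passed as a string is rejected with a ValueError.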

Upvotes: 1

gboffi

Reputation: 25033

If you need to keep the first instance of "Paul" in your data, a dictionary comprehension gives you the opposite result: later records overwrite earlier ones, so the last "Paul" is kept.

A simple solution could be as follows:

new = []
seen = set()  # names already encountered
for record in old:  # old is the original list of records
    name = record['name']
    if name not in seen:
        seen.add(name)
        new.append(record)
del seen
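
Wrapped into a function and applied to the sample data from the question, the loop above might look like the sketch below. The function name dedupe_by_name is hypothetical, chosen only for illustration:

```python
def dedupe_by_name(records):
    """Return the records with only the first occurrence of each name, in original order."""
    seen = set()   # names already encountered
    result = []
    for record in records:
        name = record["name"]
        if name not in seen:
            seen.add(name)
            result.append(record)
    return result


# Sample data taken from the question.
old = [{"id": "1", "name": "Paul", "age": "21"},
       {"id": "2", "name": "Peter", "age": "22"},
       {"id": "3", "name": "Paul", "age": "23"}]

new = dedupe_by_name(old)
```

Because the result is built with a list and the set is used only for membership checks, the original order of the records is kept.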

Upvotes: 3

Abhijit

Reputation: 63737

You might have overcomplicated your approach. I would rewrite the list as a dictionary with "name" as the key and then fetch the values.

ds = [{"id":"1","name":"Paul","age":"21"},
  {"id":"2","name":"Peter","age":"22"},
  {"id":"3","name":"Paul","age":"23"}]

{elem["name"]:elem for elem in ds}.values()
Out[2]: 
[{'age': '23', 'id': '3', 'name': 'Paul'},
 {'age': '22', 'id': '2', 'name': 'Peter'}]

Of course, the items within the dictionary and the list may not be ordered, but I do not see that as much of a concern. If it is, let us know and we can think it over.
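
If the order and keeping the first record per name do matter, one possible variation on the dictionary idea is sketched below. It assumes Python 3.7 or later, where dictionaries preserve insertion order; the names first_by_name and unique are made up for this example:

```python
ds = [{"id": "1", "name": "Paul", "age": "21"},
      {"id": "2", "name": "Peter", "age": "22"},
      {"id": "3", "name": "Paul", "age": "23"}]

first_by_name = {}
for elem in ds:
    # setdefault stores elem only if this name is not already a key,
    # so the first record for each name is the one that is kept.
    first_by_name.setdefault(elem["name"], elem)

# On Python 3.7+ the values come back in insertion order.
unique = list(first_by_name.values())
```

Unlike the comprehension, which lets later records overwrite earlier ones, setdefault leaves an existing entry untouched.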

Upvotes: 4
