JVM

Reputation: 99

Count unique values in a JSON

I have a JSON file called thefile.json which looks like this:

{
  "domain": "Something",
  "domain": "Thingie",
  "name": "Another",
  "description": "Thing"
}

I am trying to write a Python script which would make a set of the values in domain. In this example it would return

{'Something', 'Thingie'}

Here is what I tried:

import json
with open("thefile.json") as my_file: 
  data = json.load(my_file)
  ids = set(item["domain"] for item in data.values())
print(ids)

I get the error message

    unique_ids.add(item["domain"])
TypeError: string indices must be integers

Having looked up answers on Stack Exchange, I'm stumped. Why can't I use a string as an index, seeing as I am using a JSON file whose data type is a dictionary (I think!)? How can I get the values for "domain"?

Upvotes: 1

Views: 232

Answers (2)

cabesuon

Reputation: 5270

Your JSON has a problem: duplicate keys. I am not sure whether that is strictly forbidden, but I am sure it is badly formatted, and it is going to cause you a lot of problems.

A dictionary cannot have duplicate keys: what would a lookup on a duplicated key return?
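To make that concrete, here is a small sketch of what Python does with the file from the question. As far as I know, the standard json module does not raise on a repeated key; it keeps only the last value, so every remaining value is a plain string, and item["domain"] ends up indexing a string. The inline string stands in for thefile.json, and using object_pairs_hook to recover both values is just one possible workaround if the file cannot be changed, not the only one.

```python
import json

# Inline stand-in for thefile.json from the question.
raw = """
{
  "domain": "Something",
  "domain": "Thingie",
  "name": "Another",
  "description": "Thing"
}
"""

# Default behaviour: no error, but only the last "domain" survives.
data = json.loads(raw)
print(data["domain"])  # Thingie

# Every remaining value is a str, so item["domain"] indexes a string
# with a string, which raises the TypeError from the question.
try:
    {item["domain"] for item in data.values()}
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# object_pairs_hook receives all (key, value) pairs, duplicates included,
# so the repeated values can still be collected without editing the file.
pairs = json.loads(raw, object_pairs_hook=list)
domains = {value for key, value in pairs if key == "domain"}
print(domains == {"Something", "Thingie"})  # True
```

Passing list as the hook only makes sense for a flat object like this one; nested objects would also come back as lists of pairs.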

So, fix your JSON, for example like this:

{
  "domain": ["Something", "Thingie"],
  "name": "Another",
  "description": "Thing"
}

Guess what: a good format almost solves your problem (the list itself can still contain duplicates) :)
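With the list format, getting the unique values could look roughly like this. The inline string stands in for the corrected thefile.json, and the repeated "Something" entry is my own addition, only there to show that set() removes duplicates.

```python
import json

# Inline stand-in for the corrected file; the extra "Something" is an
# assumption added here to demonstrate de-duplication.
raw = """
{
  "domain": ["Something", "Thingie", "Something"],
  "name": "Another",
  "description": "Thing"
}
"""

data = json.loads(raw)

# data["domain"] is now a list, so it can be passed straight to set().
ids = set(data["domain"])
print(ids == {"Something", "Thingie"})  # True
print(len(ids))                         # 2
```

When reading from disk, json.load(my_file) inside the with block from the question would take the place of json.loads(raw).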

Upvotes: 1

oamandawi

Reputation: 369

So, to start, you can read more about JSON formats here: https://www.w3schools.com/python/python_json.asp

Second, dictionaries must have unique keys. Therefore, having two keys named domain in the same object is incorrect. You can read more about Python dictionaries here: https://www.w3schools.com/python/python_dictionaries.asp

Now, I recommend the following two designs that should do what you need:

  1. Multiple Names, Multiple Domains: In this design, you iterate over the websites list and collect the domain of each entry, like ids = set(item["domain"] for item in data["websites"])
{
  "websites": [
    {
      "domain": "Something.com",
      "name": "Something",
      "description": "A thing!"
    },
    {
      "domain": "Thingie.com",
      "name": "Thingie",
      "description": "A thingie!"
    }
  ]
}
  2. One Name, Multiple Domains: In this design, a single entry holds a list of domains, which can be turned into a set using JVM_Domains = set(data["domains"])
{
   "domains": ["Something.com","Thingie.com","Stuff.com"],
   "name": "Me Domains",
   "description": "A list of domains belonging to Me"
}
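Here is a minimal sketch of both designs in use, with inline strings standing in for the files. Note that Python's json parser is strict: it rejects a trailing comma after the last array element and needs a comma between object members, so the strings here are written accordingly. The variable names design_1, design_2 and jvm_domains are just for illustration.

```python
import json

# Design 1: a list of website objects, each with its own "domain".
design_1 = json.loads("""
{
  "websites": [
    {"domain": "Something.com", "name": "Something", "description": "A thing!"},
    {"domain": "Thingie.com", "name": "Thingie", "description": "A thingie!"}
  ]
}
""")
# Each item is a dict here, so item["domain"] is a valid lookup.
ids = {item["domain"] for item in design_1["websites"]}
print(ids == {"Something.com", "Thingie.com"})  # True

# Design 2: one object holding a list of domains.
design_2 = json.loads("""
{
  "domains": ["Something.com", "Thingie.com", "Stuff.com"],
  "name": "Me Domains",
  "description": "A list of domains belonging to Me"
}
""")
jvm_domains = set(design_2["domains"])
print(len(jvm_domains))  # 3
```

Design 1 is probably the better fit if each domain needs its own name and description; Design 2 is simpler when only the domain strings matter.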

I hope this helps. Let me know if I missed any details.

Upvotes: 2
