Ameya Kulkarni

Reputation: 53

rename duplicate key in json file python

I have a JSON file that contains duplicate keys.

Example

{
  "data":"abc",
  "data":"xyz"
}

I want to turn this into { "data1":"abc", "data2":"xyz" }

I tried using object_pairs_hook with json.loads, but it is not working. Could anyone help me with a Python solution for the above problem?

Upvotes: 2

Views: 1017

Answers (2)

Oleksii Filonenko

Reputation: 1653

A quick and dirty solution using re.

import re

s = '{ "data":"abc", "data":"xyz", "test":"one", "test":"two", "no":"numbering" }'

def find_dupes(s):
    # Collect every key, then keep the ones that occur more than once.
    keys = re.findall(r'"(\w+)":', s)
    return list(set(filter(lambda w: keys.count(w) > 1, keys)))

for key in find_dupes(s):
    # Each pass renames the first remaining un-numbered occurrence of the key.
    for i in range(1, len(re.findall(r'"{}":'.format(key), s)) + 1):
        s = re.sub(r'"{}":'.format(key), r'"{}{}":'.format(key, i), s, count=1)

print(s)

Prints this string (reformatted here for readability; the actual output is a single line):

{
    "data1":"abc",
    "data2":"xyz",
    "test1":"one",
    "test2":"two",
    "no":"numbering"
}
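
A quick sanity check, as a sketch: the renamed string should parse with json.loads without any key being repeated. The literal below is the single-line output of the snippet above, copied by hand, and passing list as object_pairs_hook is just a convenient way to see every pair in order.

```python
import json

# Output of the regex renaming above, copied as a literal.
renamed = '{ "data1":"abc", "data2":"xyz", "test1":"one", "test2":"two", "no":"numbering" }'

# object_pairs_hook=list returns the (key, value) pairs as a list,
# so a key that was still repeated would remain visible here.
pairs = json.loads(renamed, object_pairs_hook=list)
keys = [k for k, _ in pairs]

print(keys)
print(len(keys) == len(set(keys)))  # True when no key is repeated
```

Keep in mind that the regex works on raw text, so it cannot tell a key from a string value that happens to look like one.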

Upvotes: 0

Or Duan

Reputation: 13810

You can pass json.loads the object_pairs_hook keyword argument. The hook receives every key/value pair of an object in order, duplicates included, so you can handle duplicates there like this:

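import json
from collections import Counter

raw_text_data = """{
  "data":"abc",
  "data":"xyz",
  "data":"xyz22"
}"""

def manage_duplicates(pairs):
    # pairs holds every (key, value) tuple of one JSON object, duplicates included.
    d = {}
    k_counter = Counter()
    for k, v in pairs:
        k_counter[k] += 1
        # Append the running count so each occurrence gets its own key.
        d[k + str(k_counter[k])] = v

    return d

# Prints {'data1': 'abc', 'data2': 'xyz', 'data3': 'xyz22'}
print(json.loads(raw_text_data, object_pairs_hook=manage_duplicates))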

I used Counter to count the occurrences of each key and save every key as k + str(k_counter[k]), so each key, including one that occurs only once, gets a trailing number.
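
Because the hook above numbers every key, unique keys are renamed too. If only the repeated keys should change, one possible variant is sketched below. The name rename_only_duplicates is hypothetical, and the sketch assumes the input does not already contain a key such as "data1" that a renamed key could collide with.

```python
import json
from collections import Counter


def rename_only_duplicates(pairs):
    # Hypothetical helper; pairs is the list of (key, value) tuples that
    # json.loads hands to object_pairs_hook for one JSON object.
    totals = Counter(k for k, _ in pairs)  # how often each key occurs
    seen = Counter()                       # occurrences handled so far
    result = {}
    for k, v in pairs:
        if totals[k] > 1:
            seen[k] += 1
            result[f"{k}{seen[k]}"] = v    # data -> data1, data2, ...
        else:
            result[k] = v                  # unique keys keep their name
    return result


text = '{"data": "abc", "data": "xyz", "no": "numbering"}'
renamed = json.loads(text, object_pairs_hook=rename_only_duplicates)
print(renamed)  # {'data1': 'abc', 'data2': 'xyz', 'no': 'numbering'}
```

The hook is called once per JSON object, so nested objects are numbered independently of their parents.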

P.S

If you have control over the input, I would highly recommend changing your JSON structure to:

{"data": ["abc", "xyz"]}

RFC 4627, which defines the application/json media type, recommends unique keys but does not explicitly forbid duplicates:

The names within an object SHOULD be unique.
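
Since duplicates are legal, json.loads accepts them and by default keeps only the last value for a repeated name. If you would rather treat them as an error, a hook can reject them. The sketch below assumes a hypothetical hook named reject_duplicates:

```python
import json


def reject_duplicates(pairs):
    # Hypothetical helper: fail loudly instead of silently dropping values.
    result = {}
    for k, v in pairs:
        if k in result:
            raise ValueError(f"duplicate key: {k!r}")
        result[k] = v
    return result


try:
    json.loads('{"data": "abc", "data": "xyz"}', object_pairs_hook=reject_duplicates)
    outcome = "accepted"
except ValueError as exc:
    outcome = str(exc)

print(outcome)  # duplicate key: 'data'
```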

Upvotes: 2
