Reputation: 85
I would like to merge multiple JSON files into one file. All of those files have the same structure. For example I've created three files which would look like this:
ExampleFile_1
{
"items": [
{
"answers": [
{
"creation_date": 1538172165
},
{
"creation_date": 1538172205
},
{
"creation_date": 1538172245
}
],
"creation_date": 1538172012,
"question_id": 52563137
}
]
}
ExampleFile_2
{
"items": [
{
"answers": [
{
"creation_date": 1538326991
}
],
"creation_date": 1538172095,
"question_id": 52563147
},
{
"answers": [
{
"creation_date": 1538180453
}
],
"creation_date": 1538172112,
"question_id": 52563150
}
]
}
ExampleFile_3
{
"items": [
{
"answers": [
{
"creation_date": 1538326991
}
],
"creation_date": 1538172095,
"question_id": 52563147
}
]
}
Now I would like to merge the "items" lists of all three files into one file, which would then look like this:
merged_json.json
{
"items": [
{
"answers": [
{
"creation_date": 1538172165
},
{
"creation_date": 1538172205
},
{
"creation_date": 1538172245
}
],
"creation_date": 1538172012,
"question_id": 52563137
},
{
"answers": [
{
"creation_date": 1538326991
}
],
"creation_date": 1538172095,
"question_id": 52563147
},
{
"answers": [
{
"creation_date": 1538180453
}
],
"creation_date": 1538172112,
"question_id": 52563150
},
{
"answers": [
{
"creation_date": 1538326991
}
],
"creation_date": 1538172095,
"question_id": 52563147
}
]
}
As shown above, the "items" lists should be concatenated.
I already tried to come up with a solution but could not figure it out. This is what I got so far:
import glob
import json

read_files = glob.glob("ExampleFile*.json")
output_list = []
for f in read_files:
    with open(f, "rb") as infile:
        output_list.append(json.load(infile))
all_items = []
for json_file in output_list:
    all_items += json_file['items']
textfile_merged = open('merged_json.json', 'w')
textfile_merged.write(str(all_items))
textfile_merged.close()
This, unfortunately, leaves me with a messed-up JSON file that consists only of the dicts inside "items".
How do I create a file like merged_json.json?
Thanks in advance.
Upvotes: 7
Views: 71746
Reputation: 73
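If you just want to concatenate all the JSON files sequentially, go to the folder that contains them, select them all, and rename the first one to "yourchoice". Doing this puts them all in sequential order, i.e. yourchoice1, yourchoice2, ...
Next, open cmd in that folder and type: copy *.json "outputfilename".json
All of your JSON files are then concatenated in order into the "outputfilename".json file.
Note that this only joins the raw file contents: the output holds one top-level object per input file, so it is not a single valid JSON document with one combined "items" list.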
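Because a plain copy leaves several top-level objects back to back, json.load will reject the result. A minimal sketch of how such text could still be split and merged with the standard library's json.JSONDecoder.raw_decode; the helper name split_concatenated and the sample data are made up for illustration, and the text is assumed to contain nothing but JSON values and whitespace:

```python
import json

def split_concatenated(text):
    """Yield each top-level JSON value from text holding several values back to back."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it by hand
        if text[pos].isspace():
            pos += 1
            continue
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

# Two documents glued together, as a plain file copy would produce (made-up data)
concatenated = '{"items": [{"question_id": 1}]}\n{"items": [{"question_id": 2}]}'

try:
    json.loads(concatenated)
    parsed_directly = True
except json.JSONDecodeError:
    # json.loads refuses the trailing second document
    parsed_directly = False

# Merge the "items" lists of every recovered document
merged = {"items": []}
for doc in split_concatenated(concatenated):
    merged["items"].extend(doc["items"])
```

Reading each file separately with json.load avoids this detour entirely; the sketch only matters if the concatenated file already exists.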
Upvotes: 1
Reputation: 1347
I suggest using Python's json module, which is made for working with JSON data. You can do something like this:
import json
with open('example1.json') as f:
    data1 = json.load(f)
with open('example2.json') as f:
    data2 = json.load(f)
with open('example3.json') as f:
    data3 = json.load(f)
items1 = data1["items"]
#print(json.dumps(items1, indent=2))
items2 = data2["items"]
items3 = data3["items"]
finaljson = {"items": []}
# extend() adds the elements themselves; append() would nest each list inside "items"
finaljson["items"].extend(items1)
finaljson["items"].extend(items2)
finaljson["items"].extend(items3)
print(json.dumps(finaljson, indent=2))
with open('merged_json.json', "w") as f:
    f.write(json.dumps(finaljson, indent=2))
Here json.load() parses JSON from an open file into Python objects (dicts, lists, numbers, strings), while json.dumps() serializes a Python object into a JSON string. The indent parameter pretty-prints the output with line breaks and nested indentation.
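To make the roles of these functions concrete, here is a small sketch using an in-memory string instead of real files (the sample value is taken from the question; io.StringIO merely stands in for an open file):

```python
import io
import json

text = '{"items": [{"question_id": 52563137}]}'

# json.loads parses a JSON string into Python objects (dict, list, int, ...)
from_string = json.loads(text)

# json.load does the same, but reads from a file-like object
from_file = json.load(io.StringIO(text))

# json.dumps serializes Python objects back into a JSON string
compact = json.dumps(from_string)
pretty = json.dumps(from_string, indent=2)  # indent adds line breaks and nesting

print(pretty)
```

Both strings describe the same data; indent only changes the whitespace.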
Upvotes: 0
Reputation:
You're using the json module to convert the JSON file into Python objects, but you're not using the module to convert those Python objects back into JSON. Instead of this at the end
textfile_merged.write(str(all_items))
try this:
json.dump({ "items": all_items }, textfile_merged)
(Note that this also wraps the all_items list in a dictionary so that you get the output you expect; otherwise the output will be a JSON array, not an object with an "items" key.)
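Put together, the whole merge might look like the following sketch. The function name merge_item_files, the sorted() call, and the UTF-8 encoding are assumptions added for illustration, not part of the original code:

```python
import glob
import json

def merge_item_files(pattern, out_path):
    """Concatenate the "items" lists of all files matching pattern into out_path."""
    all_items = []
    # sorted() makes the order deterministic; glob itself guarantees no ordering
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as infile:  # assumes UTF-8 input
            all_items.extend(json.load(infile)["items"])
    # Wrap the list in a dict so the output is an object with an "items" key
    with open(out_path, "w", encoding="utf-8") as outfile:
        json.dump({"items": all_items}, outfile, indent=2)
    return all_items
```

Called as merge_item_files("ExampleFile*.json", "merged_json.json"), this should write one object whose "items" list holds the entries of every matching file.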
Upvotes: 4
Reputation: 45
import glob
import json

read_files = glob.glob("ExampleFile*.json")
output_list = []
for f in read_files:
    with open(f, "rb") as infile:
        output_list.append(json.load(infile))
final_json = {}
all_items = []
for json_file in output_list:
    all_items.extend(json_file['items'])
final_json['items'] = all_items
# json.dump writes valid JSON; str() would write Python's repr (single quotes)
with open('merged_json.json', 'w') as textfile_merged:
    json.dump(final_json, textfile_merged)
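The difference between str() and json.dumps() on the same dictionary can be seen in a short sketch (the sample dictionary is made up, loosely following the question's structure):

```python
import json

final_json = {"items": [{"question_id": 52563137, "answers": []}]}

as_repr = str(final_json)         # Python repr: keys in single quotes
as_json = json.dumps(final_json)  # JSON text: keys in double quotes

try:
    json.loads(as_repr)
    repr_is_valid_json = True
except json.JSONDecodeError:
    # JSON requires double-quoted strings, so the repr is rejected
    repr_is_valid_json = False

print(as_repr)
print(as_json)
```

Only the json.dumps() output can be read back by a JSON parser, which is why the file should be written with json.dump() or json.dumps().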
Upvotes: 0
Reputation: 366
One way to do it, which results in cleaner code, is to define a function that takes two JSON objects and returns their combination.
def merge(json_obj_1, json_obj_2):
    items = json_obj_1['items'] + json_obj_2['items']
    return { 'items': items }
Then, after you have output_list (in Python 3, reduce has to be imported from functools):
from functools import reduce
result = reduce(merge, output_list)
result will be the object you are looking for.
If you're not familiar with the reduce function, check out this web page:
http://book.pythontips.com/en/latest/map_filter.html
It briefly explains the usage of reduce, as well as map and filter. They are very useful.
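As a self-contained sketch of this approach, assuming Python 3: the sample output_list below is made up, and the initial value passed to reduce and the itertools.chain variant are additions for illustration rather than part of the answer above.

```python
from functools import reduce
from itertools import chain

def merge(json_obj_1, json_obj_2):
    """Return a new object whose "items" is the concatenation of both inputs."""
    return {"items": json_obj_1["items"] + json_obj_2["items"]}

# Stand-in for output_list (the parsed files); the values are made up
output_list = [
    {"items": [{"question_id": 1}]},
    {"items": [{"question_id": 2}, {"question_id": 3}]},
    {"items": [{"question_id": 2}]},
]

# The initial value {"items": []} keeps reduce from failing on an empty list
result = reduce(merge, output_list, {"items": []})

# Same result without reduce: chain flattens the "items" lists in one pass
result_chain = {"items": list(chain.from_iterable(d["items"] for d in output_list))}
```

With many large files the chain variant avoids building a new intermediate list at every step, though for a handful of files either form should be fine.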
Upvotes: 0