madu

Reputation: 5450

Inserting document to MongoDB from mongodump JSON with pymongo

I have a JSON file (converted from mongodump BSON) that I would like to insert into MongoDB using pymongo. My current approach is something like:

import json

# db is assumed here to be a pymongo collection object
with open('duplicate_docs.json') as f:
    lines = f.readlines()

    for line in lines:
        record = json.loads(line)
        db.insert_one(record)

However, the JSON is in the form:

{ "_id" : ObjectId( "54ccc3f469702d45ca450200"), \"id\":\"54713efd69702d78d1420500\",\"name\":\"response"}

As you can see, there are escape characters (\) in front of the quotes around the JSON keys, so I am not able to load this as JSON. What is the best way to fix a JSON string like this so it can be inserted into MongoDB?

Thank you.

Upvotes: 1

Views: 509

Answers (2)

Cere

Reputation: 93

Why not use mongoexport to dump to JSON instead of BSON?

mongoexport --port 27017 --db <database> --collection <collection> --out output.json

and then use

mongoimport --port 27017 --db <database> --collection <collection> --file output.json
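
If the export and import need to be driven from the same Python script, the two commands above could be wrapped with subprocess. This is only a sketch: build_commands and run_all are hypothetical helpers, the database and collection names are placeholders, and it assumes the mongoexport and mongoimport binaries are on the PATH.

```python
import subprocess


def build_commands(database, collection, path="output.json", port=27017):
    """Return the mongoexport and mongoimport argument lists (hypothetical helper)."""
    common = ["--port", str(port), "--db", database, "--collection", collection]
    export_cmd = ["mongoexport", *common, "--out", path]
    import_cmd = ["mongoimport", *common, "--file", path]
    return export_cmd, import_cmd


def run_all(commands):
    """Run each command in order and raise CalledProcessError if one fails."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Calling run_all(build_commands("mydatabase", "docs")) would export and then re-import the same collection; in practice the import would usually target a different database or server.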

Upvotes: 1

Belly Buster

Reputation: 8814

As an alternative approach, if you take the actual output of mongodump, you can insert it straight in with the loads() function from bson.json_util.

from pymongo import MongoClient
from bson.json_util import loads

db = MongoClient()['mydatabase']

# Write a one-line sample file in MongoDB Extended JSON ({"$oid": ...} form)
with open('c:/temp/duplicate_docs.json', mode='w') as f:
    f.write('{"_id":{"$oid":"54ccc3f469702d45ca450200"},"id":"54713efd69702d78d1420500","name":"response"}')

# Read it back one document per line; loads() converts {"$oid": ...} to an ObjectId
with open('c:/temp/duplicate_docs.json') as f:
    for line in f:
        record = loads(line)
        db.docs.insert_one(record)
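
If the file really does look like the line in the question, with shell-style ObjectId(...) and backslash-escaped quotes, one option is to normalise each line to Extended JSON first. The following is only a minimal sketch: shell_line_to_extended_json is a name made up for illustration, and it assumes that every backslash-quote pair is an artifact rather than a genuine escaped quote inside a string value, and that ObjectId is the only shell-style type in the file.

```python
import json
import re

# Matches shell-style ObjectId( "<24 hex chars>" ), tolerating inner whitespace.
_OBJECT_ID = re.compile(r'ObjectId\(\s*"([0-9a-fA-F]{24})"\s*\)')


def shell_line_to_extended_json(line):
    """Turn one shell-style line into an Extended JSON string (hypothetical helper)."""
    # Assumption: \" is debris, not a real escaped quote inside a value.
    cleaned = line.replace('\\"', '"')
    # Rewrite ObjectId("...") as the Extended JSON form {"$oid": "..."}.
    return _OBJECT_ID.sub(r'{"$oid": "\1"}', cleaned)


if __name__ == "__main__":
    sample = r'{ "_id" : ObjectId( "54ccc3f469702d45ca450200"), \"id\":\"54713efd69702d78d1420500\",\"name\":\"response"}'
    # Plain json.loads is enough to check that the result is valid JSON.
    print(json.loads(shell_line_to_extended_json(sample)))
```

The returned string can then be passed to loads() from bson.json_util, as in the code above, so that the {"$oid": ...} value becomes an ObjectId before insert_one() is called.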

Upvotes: 1
