Matias Ordoñez

Reputation: 105

Parse Date before insert pymongo

I have the following type of JSON document, which I need to insert into a MongoDB collection with pymongo:

json={
   "resource": "/items/6791111",
   "user_id": 123456789,
   "topic": "items",
   "application_id":1,
   "attempts": 1,
   "sent": "2020-07-22T15:53:06.000-04:00",
   "received":"2020-07-22T15:53:06.000-04:00"
 }

The fields sent and received are strings, so if I run:

collection.insert_one(json)

they will be saved as strings in the database. How can I store them directly as dates?

I tried something like this:

from dateutil.parser import parse

json['sent']=parse(json['sent'])
collection.insert_one(json)

but this doesn't seem like a very good solution to me, because some of my documents have several date fields, and sometimes a date field is null (for example, in an order, the delivered field may be null until the order is delivered).

Something like this:

json2={
   "resource": "/items/6791111",
   "user_id": 123456789,
   "topic": "items",
   "application_id":1,
   "attempts": 1,
   "sent": "2020-07-22T15:53:06.000-04:00",
   "received":None
 }

Right now I'm parsing the dates by hand with a function, but it's really not practical.

I need the date fields stored as dates so I can filter by time.

Upvotes: 0

Views: 315

Answers (1)

Belly Buster

Reputation: 8814

You can attempt isoparse on each field; this converts any valid date strings to datetime objects, which are then stored in MongoDB as the BSON date type. Nulls are unaffected.

from dateutil.parser import isoparse
for k, v in json.items():
    try:
        json[k] = isoparse(v)
    except Exception:
        pass
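
If you already know which fields hold dates, the same idea can be limited to those fields, so that other strings which happen to look like dates are never touched. The following is only a sketch: convert_date_fields is a hypothetical helper name, and it uses the standard library's datetime.fromisoformat (Python 3.7+) as a stand-in for isoparse so that it runs without extra packages. Passing parser=isoparse should restore the dateutil behaviour.

```python
from datetime import datetime, timezone


def convert_date_fields(doc, date_fields, parser=datetime.fromisoformat):
    """Return a copy of doc with the named string fields parsed to datetime.

    Hypothetical helper: fields that are missing or None are left as they are.
    """
    converted = dict(doc)
    for field in date_fields:
        value = converted.get(field)
        if isinstance(value, str):
            converted[field] = parser(value)
    return converted


doc = {
    "topic": "items",
    "sent": "2020-07-22T15:53:06.000-04:00",
    "received": None,  # not delivered yet, stays None
}

result = convert_date_fields(doc, ["sent", "received"])
```

The helper returns a new dict rather than changing the original, so the raw payload is still available if the insert fails and you want to log it.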

Full worked example:

from pymongo import MongoClient
from dateutil.parser import isoparse
import pprint

collection = MongoClient()['mydatabase'].collection

json={
   "resource": "/items/6791111",
   "user_id": 123456789,
   "topic": "items",
   "application_id":1,
   "attempts": 1,
   "sent": "2020-07-22T15:53:06.000-04:00",
   "received":"2020-07-22T15:53:06.000-04:00",
   "nulldate": None,
}

for k, v in json.items():
    try:
        json[k] = isoparse(v)
    except Exception:
        pass

collection.insert_one(json)

pprint.pprint(collection.find_one(), indent=4)

gives:

{   '_id': ObjectId('5fde015e794ced49eeaa7a65'),
    'application_id': 1,
    'attempts': 1,
    'nulldate': None,
    'received': datetime.datetime(2020, 7, 22, 19, 53, 6),
    'resource': '/items/6791111',
    'sent': datetime.datetime(2020, 7, 22, 19, 53, 6),
    'topic': 'items',
    'user_id': 123456789}
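
Note that the output shows 19:53:06 rather than 15:53:06. MongoDB stores dates in UTC, and by default PyMongo returns them as naive datetime objects (creating the client with tz_aware=True returns timezone-aware values instead). The sketch below reproduces that conversion with the standard library only, then builds a range filter of the kind the question asks for. The one-day window and the name date_filter are illustrative assumptions, not part of the original example.

```python
from datetime import datetime, timedelta, timezone

# The value from the document, parsed with its -04:00 offset intact.
sent = datetime.fromisoformat("2020-07-22T15:53:06.000-04:00")

# The same instant in UTC with tzinfo dropped, matching what find_one() printed.
as_stored = sent.astimezone(timezone.utc).replace(tzinfo=None)

# Illustrative filter: documents sent on 2020-07-22 (UTC).
start = datetime(2020, 7, 22, tzinfo=timezone.utc)
date_filter = {"sent": {"$gte": start, "$lt": start + timedelta(days=1)}}

# With a live connection this would be used as: collection.find(date_filter)
```

Range operators such as $gte and $lt only compare chronologically when the field is stored as a BSON date, which is why the conversion has to happen before insert_one.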

Upvotes: 1
