Reputation: 52153
I have a string representation of a JSON object.
dumped_dict = '{"debug": false, "created_at": "2020-08-09T11:24:20"}'
When I call json.loads with this string:
json.loads(dumped_dict)
I get:
{'created_at': '2020-08-09T11:24:20', 'debug': False}
There is nothing wrong in here. However, I want to know if there is a way to convert the above object with json.loads to something like this:
{'created_at': datetime.datetime(2020, 8, 9, 11, 24, 20), 'debug': False}
In short, is there a way to convert datetime strings into actual datetime.datetime objects while calling json.loads?
Upvotes: 53
Views: 61607
Reputation: 1
While some time has passed since the initial question, I'd like to provide an alternative solution for the benefit of future visitors.
I've recently introduced a package on PyPI called pyjschema, which offers the capability to work with JSON schemas and convert JSON data into Pythonic types based on those schemas.
In your specific scenario, you can define a schema that specifies the data type for the created_at property, then parse the JSON data against that schema:
from pyjschema import loads
schema = {
    'type': 'object',
    'properties': {
        'created_at': {'type': 'string', 'format': 'date-time'}
    }
}
print(loads('{"debug": false, "created_at": "2020-08-09T11:24:20"}', schema))
# prints: {'created_at': datetime.datetime(2020, 8, 9, 11, 24, 20), 'debug': False}
Upvotes: 0
Reputation: 3111
If you are looking for a Django JSON serializer:
import json

from django.utils.timezone import now
from django.core.serializers.json import DjangoJSONEncoder
from django.utils.dateparse import parse_datetime

dt = now()
sdt = json.dumps(dt.strftime('%Y-%m-%dT%H:%M:%S'))
ndt = parse_datetime(json.loads(sdt))
print(sdt)
# "2022-04-27T12:20:23"
print(ndt)
# 2022-04-27 12:20:23
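Note that DjangoJSONEncoder is imported above but not used; if you want Django to format the datetime for you on the dumps side, a minimal sketch (same assumptions as above, i.e. a configured Django project) would be:
import json

from django.core.serializers.json import DjangoJSONEncoder
from django.utils.timezone import now

# DjangoJSONEncoder emits datetime objects as ISO 8601 strings,
# so no manual strftime call is needed when dumping.
sdt = json.dumps({"created_at": now()}, cls=DjangoJSONEncoder)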
Upvotes: 1
Reputation: 331
In most cases this is a two-way problem: if you use a custom encoder you'll probably also want a custom decoder (and vice versa). In that case the decoder should be able to parse the encoded data and return the original object.
Below is a full exercise converting non-serializable Python objects to JSON using two different strategies:
In the example below, I serialize an Enum class via a custom __json__ method as an {enum.name: enum.value} dict. Here the enum.value objects are non-serializable types in Python (date and tuple); by using the functions listed in CONVERTERS we can convert the encoded values back to those types.
Once encoded, the custom_json_decoder function can be invoked to convert that JSON back to Python types. The example script below is complete; it should run as-is:
from enum import Enum
from dateutil.parser import parse as dtparse
from datetime import datetime
from datetime import date
from json import JSONEncoder
from json import loads as json_loads
from json import dumps as json_dumps


def wrapped_default(self, obj):
    # Fall back to an object's __json__ method (or its __dict__) when the
    # standard encoder does not know how to serialize it.
    json_parser = getattr(obj.__class__, "__json__", lambda x: x.__dict__)
    try:
        return json_parser(obj)
    except Exception:
        return wrapped_default.default(obj)


wrapped_default.default = JSONEncoder().default
JSONEncoder.default = wrapped_default

# Functions used by the decoder to turn encoded values back into Python types.
CONVERTERS = {
    "datetime": dtparse,
    "date": lambda x: datetime.strptime(x, "%Y%m%d").date(),
    "tuple": lambda x: tuple(x),
}


class RskJSONEncoder(JSONEncoder):
    def default(self, obj):
        # datetime must be checked before date, since datetime is a subclass of date.
        if isinstance(obj, datetime):
            return {"val": obj.isoformat(), "pythontype": "datetime"}
        elif isinstance(obj, date):
            return {"val": obj.strftime("%Y%m%d"), "pythontype": "date"}
        elif isinstance(obj, tuple):
            return {"val": list(obj), "pythontype": "tuple"}
        return super().default(obj)


def custom_json_decoder(obj):
    def json_hook(json_obj):
        try:
            return CONVERTERS[json_obj.pop("pythontype")](json_obj["val"])
        except Exception:
            res = json_obj
        return res

    return json_loads(obj, object_hook=json_hook)


def custom_json_encoder(obj):
    return json_dumps(obj, cls=RskJSONEncoder)


if __name__ == "__main__":

    class Test(Enum):
        A = date(2021, 1, 1)
        B = ("this", " is", " a", " tuple")

        def __json__(self):
            return {self.name: self.value}

    d = {"enum_date": Test.A, "enum_tuple": Test.B}
    this_is_json = custom_json_encoder(d)
    this_is_python_obj = custom_json_decoder(this_is_json)
    print(f"this is json, type={type(this_is_json)}\n", this_is_json)
    print(
        f"this is python, type={type(this_is_python_obj)}\n",
        this_is_python_obj,
    )
Upvotes: 0
Reputation: 11612
The solutions that suggest creating a JSON encoder and decoder are all perfectly valid. The only downside I can see is a slight performance impact, which can happen if you're scanning each JSON value to check whether it matches a date/time format.
Here's the approach I would take, using the dataclass-wizard library (note: it is actually designed to work with API responses).
Use the included CLI utility to convert the JSON response to a dataclass schema. Note that the value of debug is encoded as a string here, so I'm passing -f so that it force-resolves to a Python bool type. Otherwise, it would appear as Union[bool, str], which is the default inferred type.
$ echo '{"debug": "false", "created_at": "2020-08-09T11:24:20"}' | wiz gs -f
Output, including the imports at the top (not shown):
@dataclass
class Data(JSONWizard):
    """
    Data dataclass
    """
    debug: bool
    created_at: datetime
Now we can de-serialize the sample JSON string above into a Data object. Note that created_at should come back as a datetime type; similarly, the value for debug should be decoded as a bool.
string = """{"debug": "false", "created_at": "2020-08-09T11:24:20"}"""
c = Data.from_json(string)
print(repr(c))
Serialize it back to JSON. The datetime object should be converted back to a string:
print(c.to_json())
# {"debug": false, "createdAt": "2020-08-09T11:24:20"}
Upvotes: 0
Reputation: 20155
Although passing an object hook function technically works, I recommend using a proper subclass of JSONDecoder, as intended by the framework developers:
import datetime
import json


class _JSONDecoder(json.JSONDecoder):
    def __init__(self, *args, **kwargs):
        json.JSONDecoder.__init__(
            self, object_hook=self.object_hook, *args, **kwargs)

    def object_hook(self, obj):
        ret = {}
        for key, value in obj.items():
            if key in {'timestamp', 'whatever'}:
                ret[key] = datetime.datetime.fromisoformat(value)
            else:
                ret[key] = value
        return ret
For the sake of completeness, here is the counterpart to the decoder, the custom JSONEncoder:
class _JSONEncoder(json.JSONEncoder):
    def default(self, obj):
        # pd.Timestamp assumes pandas is imported as pd; drop it if not needed.
        if isinstance(obj, (datetime.date, datetime.datetime, pd.Timestamp)):
            return obj.isoformat()
        return json.JSONEncoder.default(self, obj)
Both in action look like this:
json_str = json.dumps({'timestamp': datetime.datetime.now()}, cls=_JSONEncoder)
d = json.loads(json_str, cls=_JSONDecoder)
Upvotes: 12
Reputation: 21
This method recursively searches for strings in date-time format and converts them:
import json
from dateutil.parser import parse
def datetime_parser(value):
    if isinstance(value, dict):
        for k, v in value.items():
            value[k] = datetime_parser(v)
    elif isinstance(value, list):
        for index, row in enumerate(value):
            value[index] = datetime_parser(row)
    elif isinstance(value, str) and value:
        try:
            value = parse(value)
        except (ValueError, AttributeError):
            pass
    return value
json_to_dict = json.loads(YOUR_JSON_STRING, object_hook=datetime_parser)
Upvotes: 2
Reputation: 2784
Inspired by Nicola's answer and adapted to python3 (str instead of basestring):
import re
from datetime import datetime
datetime_format = "%Y-%m-%dT%H:%M:%S"
datetime_format_regex = re.compile(r'^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}$')
def datetime_parser(dct):
    for k, v in dct.items():
        if isinstance(v, str) and datetime_format_regex.match(v):
            dct[k] = datetime.strptime(v, datetime_format)
    return dct
This avoids using a try/except mechanism. On OP's test code:
>>> import json
>>> json_string = '{"debug": false, "created_at": "2020-08-09T11:24:20"}'
>>> json.loads(json_string, object_hook=datetime_parser)
{'created_at': datetime.datetime(2020, 8, 9, 11, 24, 20), 'debug': False}
The regex and datetime_format variables can be easily adapted to fit other patterns, e.g. without the T in the middle.
To convert a string saved in isoformat (therefore stored with microseconds) back to a datetime object, refer to this question.
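For reference, on Python 3.7+ datetime.fromisoformat handles that microsecond variant directly, without a format string:
from datetime import datetime

datetime.fromisoformat("2020-08-09T11:24:20.123456")
# datetime.datetime(2020, 8, 9, 11, 24, 20, 123456)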
Upvotes: 1
Reputation: 2308
I would do the same as Nicola suggested, with two changes:
- use dateutil.parser instead of datetime.datetime.strptime
- catch specific exceptions instead of a bare except:
Or in code:
import json

import dateutil.parser


def datetime_parser(json_dict):
    for (key, value) in json_dict.items():
        try:
            json_dict[key] = dateutil.parser.parse(value)
        except (ValueError, AttributeError):
            pass
    return json_dict


json_str = "{...}"  # Some JSON with date
obj = json.loads(json_str, object_hook=datetime_parser)
print(obj)
Upvotes: 7
Reputation: 2185
You could use regex to determine whether or not you want to convert a certain field to datetime like so:
import datetime
import json
import re


def date_hook(json_dict):
    for (key, value) in json_dict.items():
        if type(value) is str and re.match(r'^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d*$', value):
            json_dict[key] = datetime.datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f")
        elif type(value) is str and re.match(r'^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}$', value):
            json_dict[key] = datetime.datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")
    return json_dict
Then you can reference the date_hook function using the object_hook parameter in your call to json.loads():
json_data = '{"token": "faUIO/389KLDLA", "created_at": "2016-09-15T09:54:20.564"}'
data_dictionary = json.loads(json_data, object_hook=date_hook)
Upvotes: 3
Reputation: 2019
You need to pass an object_hook. From the documentation:
object_hook is an optional function that will be called with the result of any object literal decoded (a dict). The return value of object_hook will be used instead of the dict.
Like this:
import datetime
import json
def date_hook(json_dict):
    for (key, value) in json_dict.items():
        try:
            json_dict[key] = datetime.datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")
        except:
            pass
    return json_dict
dumped_dict = '{"debug": false, "created_at": "2020-08-09T11:24:20"}'
loaded_dict = json.loads(dumped_dict, object_hook=date_hook)
If you also want to handle timezones you'll have to use dateutil instead of strptime.
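A minimal sketch of that, assuming python-dateutil is installed:
from dateutil import parser

# dateutil infers the format and keeps the UTC offset as tzinfo.
dt = parser.parse("2020-08-09T11:24:20+02:00")
# datetime.datetime(2020, 8, 9, 11, 24, 20, tzinfo=tzoffset(None, 7200))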
Upvotes: 28
Reputation: 6576
My solution so far:
>>> json_string = '{"last_updated": {"$gte": "Thu, 1 Mar 2012 10:00:49 UTC"}}'
>>> dct = json.loads(json_string, object_hook=datetime_parser)
>>> dct
{u'last_updated': {u'$gte': datetime.datetime(2012, 3, 1, 10, 0, 49)}}
def datetime_parser(dct):
    for k, v in dct.items():
        if isinstance(v, basestring) and re.search("\ UTC", v):
            try:
                dct[k] = datetime.datetime.strptime(v, DATE_FORMAT)
            except:
                pass
    return dct
For further reference on the use of object_hook: JSON encoder and decoder
In my case the json string is coming from a GET request to my REST API. This solution allows me to 'get the date right' transparently, without forcing clients and users into hardcoding prefixes like __date__ into the JSON, as long as the input string conforms to DATE_FORMAT, which is:
DATE_FORMAT = '%a, %d %b %Y %H:%M:%S UTC'
The regex pattern should probably be further refined
PS: in case you are wondering, the json_string is a MongoDB/PyMongo query.
Upvotes: 33
Reputation: 4983
The way your question is put, there is no indication to json that the string is a date value. This differs from the json documentation, which has the example string:
'{"__complex__": true, "real": 1, "imag": 2}'
This string has an indicator "__complex__": true that can be used to infer the type of the data, but unless there is such an indicator, a string is just a string, and all you can do is to regexp your way through all strings and decide whether they look like dates.
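A minimal sketch of that indicator approach applied to datetimes (the "__datetime__" marker is a made-up convention here, not something the json module knows about):
import json
from datetime import datetime


def as_datetime(dct):
    # "__datetime__" is a hypothetical marker the producer of the JSON would
    # have to emit; json.loads does not recognize it on its own.
    if dct.get("__datetime__"):
        return datetime.strptime(dct["value"], "%Y-%m-%dT%H:%M:%S")
    return dct


s = '{"created_at": {"__datetime__": true, "value": "2020-08-09T11:24:20"}}'
print(json.loads(s, object_hook=as_datetime))
# {'created_at': datetime.datetime(2020, 8, 9, 11, 24, 20)}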
In your case you should definitely use a schema if one is available for your format.
Upvotes: 3
Reputation: 4199
As far as I know, there is no out-of-the-box solution for this.
First of all, the solution should take the JSON schema into account to correctly distinguish between strings and datetimes. To some extent you can guess the schema with a JSON schema inferencer (google for "json schema inferencer github") and then fix the places which really are datetimes.
If the schema is known, it should be pretty easy to write a function which parses the JSON and substitutes string representations with datetime objects; a sketch of that idea follows. Some inspiration for the code could perhaps be found in the validictory package (and JSON schema validation could also be a good idea).
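A minimal sketch of that idea, assuming a hand-rolled schema dict that only marks which properties are date-times (not a full JSON Schema implementation):
import json
from datetime import datetime

# Hypothetical, hand-written "schema": property name -> {"format": "date-time"}.
SCHEMA = {"created_at": {"format": "date-time"}}


def loads_with_schema(s, schema):
    def hook(dct):
        for key, value in dct.items():
            if schema.get(key, {}).get("format") == "date-time":
                dct[key] = datetime.fromisoformat(value)
        return dct

    return json.loads(s, object_hook=hook)


print(loads_with_schema('{"debug": false, "created_at": "2020-08-09T11:24:20"}', SCHEMA))
# {'debug': False, 'created_at': datetime.datetime(2020, 8, 9, 11, 24, 20)}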
Upvotes: 1