Reputation: 9213
I have a dataframe of that contains some date objects. I need to convert to a json for use in JavaScript, which requires YYYY-MM-DD, but to_json()
keeps adding a time component. I've seen a number of answers that convert to a string first, but this is part of a loop of about 15 queries each with many columns (simplified it for the SO question) and I don't want to hardcode each column conversion as there are a lot.
import pandas as pd
from datetime import date
df = pd.DataFrame(data=[[date(year=2018, month=1, day=1)]])
print df.to_json(orient='records', date_format='iso', date_unit='s')
Output:
[{"0":"2018-01-01T00:00:00Z"}]
Desired Output:
[{"0":"2018-01-01"}]
Upvotes: 5
Views: 6255
Reputation: 13880
Generic solution would be as follows:
df.assign( **df.select_dtypes(['datetime']).astype(str).to_dict('list') ).to_json(orient="records")
Based on the dtype it selects the datetime columns and set these as str objects so the date format is kept during serialization.
Upvotes: 0
Reputation: 1
I had that problem as well, but since I was looking only for the date, discarding the timezone, I was able to go around that using the following expression:
df = pd.read_json('test.json')
df['date_hour'] = [datetime.strptime(date[0:10],'%Y-%m-%d').date() for date in df['date_hour']]
So, if you have 'iso' date_format for df[date_hour] in the json file = "2018-01-01T00:00:00Z" you may use this solution.
This way you can extract the bit that really matters. Important to say that you must do it using this list comprehension, because the conversion can only be done string by string (or row by row), otherwise, the datetime.strptime alone, would throw an error saying that cannot be used with series.
Upvotes: 0
Reputation: 210832
Demo:
Source DF:
In [249]: df = pd.DataFrame({
...: 'val':np.random.rand(5),
...: 'date1':pd.date_range('2018-01-01',periods=5),
...: 'date2':pd.date_range('2017-12-15',periods=5)
...: })
In [250]: df
Out[250]:
date1 date2 val
0 2018-01-01 2017-12-15 0.539349
1 2018-01-02 2017-12-16 0.308532
2 2018-01-03 2017-12-17 0.788588
3 2018-01-04 2017-12-18 0.526541
4 2018-01-05 2017-12-19 0.887299
In [251]: df.dtypes
Out[251]:
date1 datetime64[ns]
date2 datetime64[ns]
val float64
dtype: object
You can cast datetime
columns to strings in one command:
In [252]: df.update(df.loc[:, df.dtypes.astype(str).str.contains('date')].astype(str))
In [253]: df.dtypes
Out[253]:
date1 object
date2 object
val float64
dtype: object
In [254]: df.to_json(orient='records')
Out[254]: '[{"date1":"2018-01-01","date2":"2017-12-15","val":0.5393488718},{"date1":"2018-01-02","date2":"2017-12-16","val":0.3085324043},{"
date1":"2018-01-03","date2":"2017-12-17","val":0.7885879674},{"date1":"2018-01-04","date2":"2017-12-18","val":0.5265407505},{"date1":"2018-0
1-05","date2":"2017-12-19","val":0.887298853}]'
Alternatively you can cast date columns to strings on the SQL side
Upvotes: 1
Reputation: 7136
You could use strftime('%Y-%m-%d')
format like so:
df = pd.DataFrame(data=[[date(year=2018, month=1, day=1).strftime('%Y-%m-
%d')]]
print(df.to_json(orient='records', date_format='iso', date_unit='s'))
# [{"0":"2018-01-01"}]
I think this is the best approach for now until pandas adds a way to write only the date out of datetime.
Upvotes: 3
Reputation: 362557
Pandas does not currently have the feature. There is an open issue about this, you should subscribe to the issue in case more options for the date_format
argument are added in a future release (which seems like a reasonable feature request):
No way with to_json to write only date out of datetime #16492
Manually converting the relevant columns to string before dumping out json is likely the best option.
Upvotes: 7