Reputation: 1591
I'm trying to save a Pandas data frame as a JSON file. One of the columns has the type datetime64[ns]. When I print the column the correct date is displayed. However, when I save it as a JSON file and read it back later the date value changes even after I change the column type back to datetime64[ns]. Here is a sample app which shows the issue I am running into:
#!/usr/bin/python3
import json
import pandas as pd
s = pd.Series(['1/1/2000'])
in_df = pd.DataFrame(s)
in_df[0] = pd.to_datetime(in_df[0], format='%m/%d/%Y')
print("in_df\n")
print(in_df)
print("\nin_df dtypes\n")
print(in_df.dtypes)
in_df.to_json("test.json")
out_df = pd.read_json("test.json")
out_df[0] = out_df[0].astype('datetime64[ns]')
print("\nout_df\n")
print(out_df)
print("\nout_df dtypes\n")
print(out_df.dtypes)
Here is the output:
in_df
0
0 2000-01-01
in_df dtypes
0 datetime64[ns]
dtype: object
out_df
0
0 1970-01-01 00:15:46.684800 <--- Why don't I get 2000-1-1 here?
out_df dtypes
0 datetime64[ns]
dtype: object
I'm expecting the get the original date displayed (2000-1-1) when I read back the JSON file. What am I doing wrong with my conversion? Thanks!
Upvotes: 1
Views: 248
Reputation: 2696
df = pd.read_json("test.json")
df[0] = pd.to_datetime(df[0], unit='ms')
print("\ndf\n")
print(df)
print("\ndf dtypes\n")
print(df.dtypes)
will give you
df
0
0 2000-01-01
df dtypes
0 datetime64[ns]
dtype: object
This should work for all millisecond json columns you need
Upvotes: 1