Reputation: 5188
I noticed this behavior and am not sure whether it's a bug. I create a DataFrame with two integer columns and one float column:
import pandas as pd
df = pd.DataFrame([[1,2,0.2],[3,2,0.1]])
df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 2 entries, 0 to 1
Data columns (total 3 columns):
0 2 non-null int64
1 2 non-null int64
2 2 non-null float64
dtypes: float64(1), int64(2)
If I output that to JSON, the dtype information is lost:
df.to_json(orient='records')
'[{"0":1.0,"1":2.0,"2":0.2},{"0":3.0,"1":2.0,"2":0.1}]'
All values are converted to float. This is a problem if, for example, one column contains nanosecond timestamps, because they are written in exponential notation and the sub-second information is lost.
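To illustrate why the float conversion is harmful for nanosecond timestamps, here is a minimal sketch in plain Python. The timestamp value is a made-up example, not taken from my data. It only shows that a 64-bit float cannot hold such an integer exactly; how the JSON serializer then formats the float may lose further digits.

```python
# Hypothetical nanosecond epoch timestamp (illustrative value only)
ts_ns = 1403776800123456789

# Converting to a 64-bit float rounds to the nearest representable value
as_float = float(ts_ns)

# Converting back shows that the original integer is not recovered
round_tripped = int(as_float)
lost = ts_ns - round_tripped

print(repr(as_float))
print("difference in ns after round trip:", lost)
```

The difference is nonzero because integers of this magnitude are beyond the range in which float64 represents every integer exactly (2**53).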
I also filed the issue here: https://github.com/pydata/pandas/issues/7583
The result I was expecting is:
'[{"0":1,"1":2,"2":0.2},{"0":3,"1":2,"2":0.1}]'
Upvotes: 7
Views: 7383
Reputation: 1
Yes, I noticed the same behaviour: if all values in a column look like a particular type, pandas auto-detects it and converts the column. In my case, I had to convert the JSON back to a DataFrame later in another module and it was a nightmare, so I had to store the datatype information in order to maintain data integrity. (I found this works much better than switching to another JSON orientation, such as "records" as opposed to "split" or "index".) Like this:
# Before converting to JSON, store the dtypes:
data_types = data.dtypes
# (I stored them in my Redis server.)
# Then, after converting back to a DataFrame:
def ensure_data_type_integrity(data, data_types):
    # Series.items() replaces iteritems(), which was removed in pandas 2.0
    for column, dtype in data_types.items():
        try:
            # Cast each column back to the dtype it had before serialization
            data[column] = data[column].astype(dtype)
        except (ValueError, TypeError):
            print(f"Conversion error: Unable to convert column '{column}' to '{dtype}'")
    return data
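For a self-contained picture of the idea, here is a rough sketch of a full round trip. It assumes the dtypes are saved as plain strings next to the data in a single JSON payload; the column names and the payload layout are made up for illustration, and in my setup the dtypes lived in Redis instead.

```python
import json

import pandas as pd

# Small example frame (hypothetical column names)
df = pd.DataFrame({"a": [1, 3], "b": [2, 2], "c": [0.2, 0.1]})

# Store dtypes as strings so they survive JSON serialization
dtype_map = {col: str(dtype) for col, dtype in df.dtypes.items()}

# Bundle the dtype map with the data (orient="split" keeps columns and index)
payload = json.dumps({
    "dtypes": dtype_map,
    "data": json.loads(df.to_json(orient="split")),
})

# Later, possibly in another module: rebuild the frame and restore the dtypes
loaded = json.loads(payload)
restored = pd.DataFrame(
    loaded["data"]["data"],
    columns=loaded["data"]["columns"],
    index=loaded["data"]["index"],
)
restored = restored.astype(loaded["dtypes"])

print(restored.dtypes)
```

Storing the dtypes as strings keeps the payload pure JSON, at the cost of only covering dtypes that `astype` can rebuild from their string names.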
Upvotes: 0
Reputation: 375475
One way is to cast the DataFrame columns to object dtype before serializing:
In [11]: df1 = df.astype(object)
In [12]: df1.to_json()
Out[12]: '{"0":{"0":1,"1":3},"1":{"0":2,"1":2},"2":{"0":0.2,"1":0.1}}'
In [13]: df1.to_json(orient='records')
Out[13]: '[{"0":1,"1":2,"2":0.2},{"0":3,"1":2,"2":0.1}]'
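As a quick check of this workaround, the sketch below parses the JSON back with the standard `json` module and tests whether the values from the two integer columns come back as Python `int`. It reuses the frame from the question; the variable names are just for illustration.

```python
import json

import pandas as pd

df = pd.DataFrame([[1, 2, 0.2], [3, 2, 0.1]])

# Cast to object so each value is serialized according to its own Python type
records = json.loads(df.astype(object).to_json(orient="records"))

# json.loads yields int for "1" and float for "1.0", so this detects float coercion
int_cols_ok = all(isinstance(rec[key], int) for rec in records for key in ("0", "1"))

print(records)
print("integer columns preserved:", int_cols_ok)
```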
Upvotes: 2