dark horse

Reputation: 3709

Python - TypeError: Object of type 'int64' is not JSON serializable

I have a data frame that stores a store name and daily sales count. I am trying to insert it into Salesforce using the Python script below.

However, I get the following error:

TypeError: Object of type 'int64' is not JSON serializable

Here is a view of the data frame:

Storename,Count
Store A,10
Store B,12
Store C,5

I use the following code to insert it into Salesforce:

update_list = []
for i in range(len(store)):
    update_data = {
        'name': store['entity_name'].iloc[i],
        'count__c': store['count'].iloc[i] 
    }
    update_list.append(update_data)

sf_data_cursor = sf_datapull.salesforce_login()
sf_data_cursor.bulk.Account.update(update_list)

I get the error when the last line above gets executed.

How do I fix this?

Upvotes: 246

Views: 354735

Answers (15)

cthemudo

Reputation: 391

If you use plotly:

import json
import plotly

json.dumps(data, cls=plotly.utils.PlotlyJSONEncoder)

Upvotes: 1

Navjot Kashi

Reputation: 141

I got an idea from the answers above, and the code below works for me:

import numpy as np

def convert_to_serializable(data):
    if isinstance(data, dict):
        return {key: convert_to_serializable(value) for key, value in data.items()}
    elif isinstance(data, list):
        return [convert_to_serializable(item) for item in data]
    elif isinstance(data, np.integer):
        return int(data)
    elif isinstance(data, np.floating):
        return float(data)
    else:
        return data

# usage: json.dumps(convert_to_serializable(data))

Upvotes: 0

kho

Reputation: 1291

There are excellent answers in this post, suitable for most cases. However, I needed a solution that works for all numpy types (e.g., complex numbers) and returns JSON-conformant output (i.e., a comma as the list separator, unsupported types converted to strings).

Test Data

import numpy as np
import json

data = np.array([0, 1+0j, 3.123, -1, 2, -5, 10], dtype=np.complex128)
data_dict = {'value': data.real[-1], 
             'array': data.real,
             'complex_value': data[-1], 
             'complex_array': data,
             'datetime_value': data.real.astype('datetime64[D]')[0],
             'datetime_array': data.real.astype('datetime64[D]'),
           }

Solution 1: Updated NpEncoder with Decoding to numpy

JSON natively supports only basic types (strings, numbers, booleans, null, arrays, objects) but no special (d)types such as complex or datetime. One solution is to convert those special (d)types to an array of strings, with the advantage that numpy can read them back easily, as outlined in the decoder section below.

class NpEncoder(json.JSONEncoder):
    def default(self, obj):
        dtypes = (np.datetime64, np.complexfloating)
        if isinstance(obj, dtypes):
            return str(obj)
        elif isinstance(obj, np.integer):
            return int(obj)
        elif isinstance(obj, np.floating):
            return float(obj)
        elif isinstance(obj, np.ndarray):
            if any([np.issubdtype(obj.dtype, i) for i in dtypes]):
                return obj.astype(str).tolist()
            return obj.tolist()
        return super(NpEncoder, self).default(obj)

# example usage
json_str = json.dumps(data_dict, cls=NpEncoder)
# {"value": 10.0, "array": [0.0, 1.0, 3.123, -1.0, 2.0, -5.0, 10.0], "complex_value": "(10+0j)", "complex_array": ["0j", "(1+0j)", "(3.123+0j)", "(-1+0j)", "(2+0j)", "(-5+0j)", "(10+0j)"], "datetime_value": "1970-01-01", "datetime_array": ["1970-01-01", "1970-01-02", "1970-01-04", "1969-12-31", "1970-01-03", "1969-12-27", "1970-01-11"]}

Decoding to numpy

Special (d)types must be converted manually after loading the JSON.

json_data = json.loads(json_str)

# Converting the types manually
json_data['complex_value'] = complex(json_data['complex_value'])
json_data['datetime_value'] = np.datetime64(json_data['datetime_value'])

json_data['array'] = np.array(json_data['array'])
json_data['complex_array'] = np.array(json_data['complex_array']).astype(np.complex128)
json_data['datetime_array'] = np.array(json_data['datetime_array']).astype(np.datetime64)

Solution 2: Numpy.array2string

Another option is to convert numpy arrays or values to strings using numpy itself, i.e. np.array2string. This option should be pretty robust, and you can adapt the output as needed.

import sys
import numpy as np

def np_encoder(obj):
    if isinstance(obj, (np.generic, np.ndarray)):
        out = np.array2string(obj,
                              separator=',',
                              threshold=sys.maxsize,
                              precision=50,
                              floatmode='maxprec')
        # remove whitespaces and '\n'
        return out.replace(' ','').replace('\n','')

# example usage
json.dumps(data_dict, default=np_encoder)
# {"value": 10.0, "array": "[0.,1.,3.123,-1.,2.,-5.,10.]", "complex_value": "10.+0.j", "complex_array": "[0.+0.j,1.+0.j,3.123+0.j,-1.+0.j,2.+0.j,-5.+0.j,10.+0.j]", "datetime_value": "'1970-01-01'", "datetime_array": "['1970-01-01','1970-01-02','1970-01-04','1969-12-31','1970-01-03','1969-12-27','1970-01-11']"}

Comments:

  • all numpy arrays are strings ("[1,2]" vs. [1,2]) and must be read with a special decoder
  • threshold=sys.maxsize returns as many entries as possible without triggering summarization (...,).
  • With the other parameters (precision, floatmode, formatter, ...) you can adapt your output as needed.
  • For a compact JSON, I removed all whitespaces and linebreaks (.replace(' ','').replace('\n','')).

Upvotes: 3

Max Bileschi

Reputation: 2212

Here's a version that also handles numpy bools, and serializes NaN values (which are not part of the JSON spec) as null.

import json
import numpy as np

class NpJsonEncoder(json.JSONEncoder):
  """Serializes numpy objects as json."""

  def default(self, obj):
    if isinstance(obj, np.integer):
      return int(obj)
    elif isinstance(obj, np.bool_):
      return bool(obj)
    elif isinstance(obj, np.floating):
      if np.isnan(obj):
        return None  # Serialized as JSON null.
      return float(obj)
    elif isinstance(obj, np.ndarray):
      return obj.tolist()
    else:
      return super().default(obj)

# Your code ... 
json.dumps(data, cls=NpJsonEncoder)

Upvotes: 5

mapazarr

Reputation: 631

Actually, there is no need to write an encoder. Just setting default to str when calling json.dumps takes care of most types by itself, so it fits in one line of code:

json.dumps(data, default=str)

From the docs of json.dumps and json.dump: https://docs.python.org/3/library/json.html#json.dump

If specified, default should be a function that gets called for objects that can’t otherwise be serialized. It should return a JSON encodable version of the object or raise a TypeError. If not specified, TypeError is raised.

So calling str converts the numpy types (such as numpy ints or numpy floats) to strings that can be parsed by json. If you have numpy arrays or ranges, they have to be converted to lists first though. In this case, writing an encoder as suggested by Jie Yang might be a more suitable solution.
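A quick sketch of what this looks like on the kind of data in the question (note that the counts come out as JSON strings, not numbers):

```python
import json
import numpy as np

data = {'name': 'Store A', 'count__c': np.int64(10)}

# np.int64 is not a Python int, so json falls back to default=str
print(json.dumps(data, default=str))  # {"name": "Store A", "count__c": "10"}
```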

Upvotes: 44

Kao-Yuan Lin

Reputation: 65

update_data = {
    'name': str(store['entity_name'].iloc[i]),
    'count__c': str(store['count'].iloc[i]) 
}

Upvotes: -1

Jie Yang

Reputation: 2557

You can define your own encoder to solve this problem.

import json
import numpy as np

class NpEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        if isinstance(obj, np.floating):
            return float(obj)
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        return super(NpEncoder, self).default(obj)

# Your codes .... 
json.dumps(data, cls=NpEncoder)

Upvotes: 242

conmak

Reputation: 1470

A very simple numpy encoder can achieve similar results more generically.

Note this uses the np.generic class (which most numpy classes inherit from) and the .item() method.

If the object to encode is not a numpy instance, then the json serializer will continue as normal. This is ideal for dictionaries with some numpy objects and some other class objects.

import json
import numpy as np

def np_encoder(obj):
    if isinstance(obj, np.generic):
        return obj.item()

json.dumps(data, default=np_encoder)
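A minimal sketch of how this behaves on a mixed dict: np.generic covers all numpy scalars, while plain Python values pass through the serializer untouched.

```python
import json
import numpy as np

def np_encoder(obj):
    # numpy scalars (np.int64, np.float32, np.bool_, ...) all inherit
    # from np.generic; .item() converts them to the closest Python type
    if isinstance(obj, np.generic):
        return obj.item()

mixed = {'a': np.int64(3), 'b': np.float32(1.5), 'c': 'plain str'}
print(json.dumps(mixed, default=np_encoder))  # {"a": 3, "b": 1.5, "c": "plain str"}
```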

Upvotes: 35

Ruth

Reputation: 1

I was able to make it work by loading the dump.

Code:

import json

json.loads(json.dumps(your_df.to_dict()))

Upvotes: -2

Expurple

Reputation: 921

If you have control over the creation of the DataFrame, you can force it to use standard Python types for the values (e.g. int instead of numpy.int64) by setting dtype to object:

df = pd.DataFrame(data=some_your_data, dtype=object)

The obvious downside is that you get less performance than with primitive datatypes. But I like this solution tbh, it's really simple and eliminates all possible type problems. No need to give any hints to the ORM or json.
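A small sketch of the effect (with made-up data): with dtype=object the values stay plain Python ints, so json.dumps works on them directly.

```python
import json
import pandas as pd

df = pd.DataFrame({'Storename': ['Store A', 'Store B'], 'Count': [10, 12]},
                  dtype=object)

# the values are stored as plain Python ints, not np.int64
print(type(df['Count'].iloc[0]))  # <class 'int'>

# so the standard json encoder accepts them without any hints
print(json.dumps(df.to_dict('records')))
```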

Upvotes: -2

Tharindu Sathischandra

Reputation: 1994

If you are going to serialize a numpy array, you can simply use ndarray.tolist() method.

From numpy docs,

a.tolist() is almost the same as list(a), except that tolist changes numpy scalars to Python scalars

In [1]: a = np.uint32([1, 2])

In [2]: type(list(a)[0])
Out[2]: numpy.uint32

In [3]: type(a.tolist()[0])
Out[3]: int
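So for the serialization case, a short sketch:

```python
import json
import numpy as np

a = np.uint32([1, 2])

# tolist() yields plain Python ints, which json handles natively
print(json.dumps(a.tolist()))  # [1, 2]
```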

Upvotes: 17

Kostiantyn Chichko

Reputation: 19

If you have this error

TypeError: Object of type 'int64' is not JSON serializable

You can cast the specific columns with an int dtype to float64, for example:

df = df.astype({'col1_int':'float64', 'col2_int':'float64', etc..})

float64 values are written fine to Google Spreadsheets.
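This works because, unlike np.int64, np.float64 is a subclass of Python's built-in float, so the standard json encoder accepts it directly (a sketch):

```python
import json
import numpy as np

print(isinstance(np.float64(10.0), float))      # True
print(isinstance(np.int64(10), int))            # False
print(json.dumps({'count': np.float64(10.0)}))  # {"count": 10.0}
```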

Upvotes: 1

Jason R Stevens CFA

Reputation: 3091

I'll throw my answer into the ring as a bit more stable version of @Jie Yang's excellent solution.

My solution

Use the numpyencoder package (see its repository).

from numpyencoder import NumpyEncoder

numpy_data = np.array([0, 1, 2, 3])

with open(json_file, 'w') as file:
    json.dump(numpy_data, file, indent=4, sort_keys=True,
              separators=(', ', ': '), ensure_ascii=False,
              cls=NumpyEncoder)

The breakdown

If you dig into hmallen's code in the numpyencoder/numpyencoder.py file, you'll see that it's very similar to @Jie Yang's answer:


class NumpyEncoder(json.JSONEncoder):
    """ Custom encoder for numpy data types """
    def default(self, obj):
        if isinstance(obj, (np.int_, np.intc, np.intp, np.int8,
                            np.int16, np.int32, np.int64, np.uint8,
                            np.uint16, np.uint32, np.uint64)):

            return int(obj)

        elif isinstance(obj, (np.float_, np.float16, np.float32, np.float64)):
            return float(obj)

        elif isinstance(obj, (np.complex_, np.complex64, np.complex128)):
            return {'real': obj.real, 'imag': obj.imag}

        elif isinstance(obj, (np.ndarray,)):
            return obj.tolist()

        elif isinstance(obj, (np.bool_)):
            return bool(obj)

        elif isinstance(obj, (np.void)): 
            return None

        return json.JSONEncoder.default(self, obj)

Upvotes: 49

shiva

Reputation: 5491

This might be a late response, but I recently got the same error. After a lot of searching, this solution helped me.

import datetime
import numpy as np

def myconverter(obj):
    if isinstance(obj, np.integer):
        return int(obj)
    elif isinstance(obj, np.floating):
        return float(obj)
    elif isinstance(obj, np.ndarray):
        return obj.tolist()
    elif isinstance(obj, datetime.datetime):
        return obj.__str__()

Call myconverter in json.dumps() like below: json.dumps(data, default=myconverter)

Upvotes: 6

DYZ

Reputation: 57033

json does not recognize NumPy data types. Convert the number to a Python int before serializing the object:

'count__c': int(store['count'].iloc[i])
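Applied to the loop from the question, that looks like the sketch below (with made-up data standing in for the asker's store frame; the Salesforce call itself is unchanged):

```python
import json
import pandas as pd

# stand-in for the asker's data frame
store = pd.DataFrame({'entity_name': ['Store A', 'Store B'], 'count': [10, 12]})

update_list = []
for i in range(len(store)):
    update_data = {
        'name': store['entity_name'].iloc[i],
        'count__c': int(store['count'].iloc[i]),  # cast np.int64 to a plain int
    }
    update_list.append(update_data)

# the payload is now JSON serializable
print(json.dumps(update_list))
```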

Upvotes: 220
