Dylaloo

Reputation: 13

Is it possible to fix 'ValueError: cannot convert float NaN to integer' error without removing the NaN values?

My problem lies in preparing a DataFrame for a heatmap using pandas and seaborn. My question is whether there is a way to keep the NaN values as NaN while converting everything from object to integer, so that I can plot it with something like sns.heatmap(df, mask=df.isnull()).

So far I am filling in a new DataFrame that looks like this (https://i.sstatic.net/hS4xX.jpg) upon creation.

From there I insert the values into the new DataFrame using code that looks like:

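start = 16
end = start + 10
for d in range(start, end):
    date = f'Apr/{d}/2019'
    for supplier in jfk10day.index:
        # Combine the three filters into one boolean mask instead of chaining [] lookups
        match = (jfk['Pick-up Date'] == date) & (jfk['Supplier'] == supplier) & (jfk['Car Type'] == 'Compact')
        # Write through .loc so the value lands in jfk10day itself, not in a row copy
        jfk10day.loc[supplier, date] = jfk.loc[match, 'Total Price'].min()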

This enters the data into the DataFrame as dtype object. The completed DataFrame looks like https://i.sstatic.net/oQXen.jpg.
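As an alternative to filling the table cell by cell, the same supplier-by-date minimum could probably be built in one step with groupby and unstack. This is only a sketch: the small `jfk` frame below is made-up sample data (the supplier names and prices are hypothetical), and only the column names are taken from the question.

```python
import pandas as pd

# Hypothetical stand-in for the real `jfk` table; column names follow the question.
jfk = pd.DataFrame({
    "Pick-up Date": ["Apr/16/2019", "Apr/16/2019", "Apr/17/2019", "Apr/16/2019", "Apr/17/2019"],
    "Supplier": ["Avis", "Avis", "Avis", "Hertz", "Hertz"],
    "Car Type": ["Compact", "Compact", "Compact", "Compact", "SUV"],
    "Total Price": [120, 95, 110, 130, 200],
})

# Keep only compact cars, then take the cheapest price per supplier and date.
compact = jfk[jfk["Car Type"] == "Compact"]
table = (
    compact.groupby(["Supplier", "Pick-up Date"])["Total Price"]
    .min()
    .unstack("Pick-up Date")  # dates become columns; missing pairs become NaN
)

print(table)
print(table.dtypes)
```

Because unstack has to represent the missing supplier/date pairs, the resulting columns come out as float64 with NaN rather than as object, which is already a numeric dtype.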

Now from here I know that I need to change the datatype to int/float in order to plot it using sns.heatmap(), but when I try something like:

jfk10day = jfk10day.astype(int)

I get the error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-76-45dab2567d52> in <module>
----> 1 jfk10day.astype(int)

/anaconda3/lib/python3.7/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    176                 else:
    177                     kwargs[new_arg_name] = new_arg_value
--> 178             return func(*args, **kwargs)
    179         return wrapper
    180     return _deprecate_kwarg

/anaconda3/lib/python3.7/site-packages/pandas/core/generic.py in astype(self, dtype, copy, errors, **kwargs)
   4999             # else, only a single dtype is given
   5000             new_data = self._data.astype(dtype=dtype, copy=copy, errors=errors,
-> 5001                                          **kwargs)
   5002             return self._constructor(new_data).__finalize__(self)
   5003 

/anaconda3/lib/python3.7/site-packages/pandas/core/internals.py in astype(self, dtype, **kwargs)
   3712 
   3713     def astype(self, dtype, **kwargs):
-> 3714         return self.apply('astype', dtype=dtype, **kwargs)
   3715 
   3716     def convert(self, **kwargs):

/anaconda3/lib/python3.7/site-packages/pandas/core/internals.py in apply(self, f, axes, filter, do_integrity_check, consolidate, **kwargs)
   3579 
   3580             kwargs['mgr'] = self
-> 3581             applied = getattr(b, f)(**kwargs)
   3582             result_blocks = _extend_blocks(applied, result_blocks)
   3583 

/anaconda3/lib/python3.7/site-packages/pandas/core/internals.py in astype(self, dtype, copy, errors, values, **kwargs)
    573     def astype(self, dtype, copy=False, errors='raise', values=None, **kwargs):
    574         return self._astype(dtype, copy=copy, errors=errors, values=values,
--> 575                             **kwargs)
    576 
    577     def _astype(self, dtype, copy=False, errors='raise', values=None,

/anaconda3/lib/python3.7/site-packages/pandas/core/internals.py in _astype(self, dtype, copy, errors, values, klass, mgr, **kwargs)
    662 
    663                 # _astype_nansafe works fine with 1-d only
--> 664                 values = astype_nansafe(values.ravel(), dtype, copy=True)
    665                 values = values.reshape(self.shape)
    666 

/anaconda3/lib/python3.7/site-packages/pandas/core/dtypes/cast.py in astype_nansafe(arr, dtype, copy)
    707         # work around NumPy brokenness, #1987
    708         if np.issubdtype(dtype.type, np.integer):
--> 709             return lib.astype_intsafe(arr.ravel(), dtype).reshape(arr.shape)
    710 
    711         # if we have a datetime/timedelta array of objects

pandas/_libs/lib.pyx in pandas._libs.lib.astype_intsafe()

pandas/_libs/src/util.pxd in util.set_value_at_unsafe()

ValueError: cannot convert float NaN to integer

So I am wondering if there is a way to edit my for loop so that every entry is entered as an int (the original dataframe 'Total Price' is already int), or if there is a way to convert the new dataframe to type int while skipping over the NaN values. I need the NaN values in the heatmap to show that the supplier is not offering anything on that specific date.
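For context on the int-versus-float part of the question: the plain int dtype has no way to store NaN, but float does, so converting the object columns to float keeps the missing entries intact. A minimal sketch, assuming a small made-up `jfk10day` frame with object dtype (the supplier names and prices are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical object-dtype frame shaped like the one in the question.
jfk10day = pd.DataFrame(
    {"Apr/16/2019": [95, 130], "Apr/17/2019": [110, np.nan]},
    index=["Avis", "Hertz"],
    dtype=object,
)

# float can represent NaN, so this conversion does not raise.
numeric = jfk10day.astype(float)

# Boolean frame marking the cells with no offer; usable as a heatmap mask.
mask = numeric.isnull()

print(numeric.dtypes)
print(mask)
```

The `numeric` and `mask` frames are the kind of input that the sns.heatmap(df, mask=df.isnull()) call mentioned above expects.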

Thanks in advance for the help guys, and if there is any more information needed from me please let me know!

Upvotes: 0

Views: 1064

Answers (1)

Erfan

Reputation: 42886

Since pandas version 0.24.0, there is a nullable integer data type (note the capital "I" in 'Int64'):

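import numpy as np
import pandas as pd

# np.nan is used because the np.NaN alias was removed in NumPy 2.0
df = pd.DataFrame({'Col': [1.0, 2.0, 3.0, np.nan]})
print(df)

   Col
0  1.0
1  2.0
2  3.0
3  NaN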

print(df.Col.astype('Int64'))

0      1
1      2
2      3
3    NaN
Name: Col, dtype: Int64
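If a later step needs ordinary floats, the nullable column can be cast back and the missing entry becomes NaN again. The sketch below only illustrates that round trip; whether a given plotting function accepts 'Int64' directly is an assumption worth checking, so casting to float first may be the safer route.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Col": [1.0, 2.0, 3.0, np.nan]})

# Nullable integer column: whole numbers plus a missing-value marker.
nullable = df["Col"].astype("Int64")

# Cast back to float64; the missing entry turns into NaN.
as_float = nullable.astype("float64")

print(nullable.isna())
print(as_float)
```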

Upvotes: 2
