Jaffer Wilson

Reputation: 7273

Convert object-type columns to float32 using pandas (Python 3.5)

Here is what I tried, along with the output and the resulting error:

df.info()
df[['volume', 'open', 'high', 'low', 'close']] =pd.Series( df[['volume', 'open', 'high', 'low', 'close']], dtype='float32')

Output with error:

<class 'pandas.core.frame.DataFrame'>
Index: 4999 entries, 2018-06-01T00:01:00.000000000Z to 2018-06-06T14:20:00.000000000Z
Data columns (total 6 columns):
volume      4999 non-null object
close       4999 non-null object
high        4999 non-null object
low         4999 non-null object
open        4999 non-null object
complete    4999 non-null object
dtypes: object(6)
memory usage: 273.4+ KB
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
c:\python35\lib\site-packages\pandas\core\common.py in _asarray_tuplesafe(values, dtype)
    398                 result = np.empty(len(values), dtype=object)
--> 399                 result[:] = values
    400             except ValueError:

ValueError: could not broadcast input array from shape (4999,5) into shape (4999)

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-23-b205607bf5cf> in <module>()
      1 df[['volume', 'open', 'high', 'low', 'close']].iloc[50:60] #, 'complete'
      2 df.info()
----> 3 df[['volume', 'open', 'high', 'low', 'close']] =pd.Series( df[['volume', 'open', 'high', 'low', 'close']], dtype='float32')
      4 # df = pd.to_numeric(df, errors='ignore')

c:\python35\lib\site-packages\pandas\core\series.py in __init__(self, data, index, dtype, name, copy, fastpath)
    262             else:
    263                 data = _sanitize_array(data, index, dtype, copy,
--> 264                                        raise_cast_failure=True)
    265 
    266                 data = SingleBlockManager(data, index, fastpath=True)

c:\python35\lib\site-packages\pandas\core\series.py in _sanitize_array(data, index, dtype, copy, raise_cast_failure)
   3275             raise Exception('Data must be 1-dimensional')
   3276         else:
-> 3277             subarr = _asarray_tuplesafe(data, dtype=dtype)
   3278 
   3279     # This is to prevent mixed-type Series getting all casted to

c:\python35\lib\site-packages\pandas\core\common.py in _asarray_tuplesafe(values, dtype)
    400             except ValueError:
    401                 # we have a list-of-list
--> 402                 result[:] = [tuple(x) for x in values]
    403 
    404     return result

ValueError: cannot copy sequence with size 5 to array axis with dimension 4999

Kindly let me know how I can convert these columns.

Upvotes: 1

Views: 87

Answers (1)

jezrael

Reputation: 862481

You can use astype on a subset of columns:

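import numpy as np
import pandas as pd

# Sample frame; astype(str) makes every column object dtype, as in the question
df = pd.DataFrame({'A':list('abcdef'),
                   'low':[4,5,4,5,5,4],
                   'high':[7,8,9,4,2,3],
                   'open':[1,3,5,7,1,0],
                   'volume':[5,3,6,9,2,4],
                   'close':[5,3,6,9,2,4],
                   'F':list('aaabbb')}).astype(str)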

print(df.dtypes)
A         object
low       object
high      object
open      object
volume    object
close     object
F         object
dtype: object

cols = ['volume', 'open', 'high', 'low', 'close']
df[cols] = df[cols].astype(np.float32)

print(df.dtypes)
A          object
low       float32
high      float32
open      float32
volume    float32
close     float32
F          object
dtype: object
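
The original attempt fails because pd.Series expects 1-dimensional data, while a five-column selection is 2-dimensional (shape (4999, 5)), which is what the broadcast error is reporting. If some of the strings might not parse as numbers, astype will raise. A possible alternative, sketched below, is to apply pd.to_numeric column by column and then downcast. The sample values here (including the 'n/a' entry) and the name `converted` are made up for illustration; they are not taken from the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: numeric strings plus one value that cannot be parsed.
df = pd.DataFrame({
    'volume': ['5', '3', 'n/a'],
    'open': ['1.5', '3.0', '5.25'],
    'complete': ['True', 'True', 'False'],
})

cols = ['volume', 'open']

# pd.to_numeric handles one Series at a time, so apply it per column.
# errors='coerce' replaces unparseable strings with NaN instead of raising.
converted = df[cols].apply(pd.to_numeric, errors='coerce').astype(np.float32)

# Write the converted columns back; 'complete' is left untouched.
df[cols] = converted

print(df.dtypes)
```

With errors='coerce', bad values become NaN silently, so it may be worth checking converted.isna().sum() afterwards to see how many entries failed to parse.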

Upvotes: 1
