Rick

Reputation: 45261

Integer format specification 'd' produces ValueError when applied row by row to numpy.int column in pandas DataFrame

Say I create a pandas dataframe containing both int and float types:

>>> import pandas as pd
>>> df = pd.DataFrame([[1, 1.3], [2, 2.4]], columns=['a', 'b'])
>>> df
   a    b
0  1  1.3
1  2  2.4

It is clear that column 'a' is composed of numpy.int64 values:

>>> df.a.dtype
dtype('int64')
>>> df.a[0]
1
>>> type(df.a[0])
<class 'numpy.int64'>

...and I can use the d formatting specifier to format these column 'a' values:

>>> "{a:d}".format(a=df.a[0])
'1'

However, if I try to apply the same formatting row by row, I get this error that says the values in column 'a' are floats and not ints:

>>> df.apply(lambda s: "{a:d}{b:f}".format(**s), axis=1)
Traceback (most recent call last):
...
ValueError: ("Unknown format code 'd' for object of type 'float'", 'occurred at index 0')

What is happening here?

Upvotes: 2

Views: 352

Answers (2)

ansev

Reputation: 30920

With axis=1, apply passes each row to the function as a Series. A Series has a single dtype, so when a row mixes int and float values, they are all upcast to float:

df.apply(lambda x: (type(x['a']), type(x['b'])), axis=1)
0    (<class 'numpy.float64'>, <class 'numpy.float6...
1    (<class 'numpy.float64'>, <class 'numpy.float6...
dtype: object
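The same upcast can be seen without apply by pulling a single row out of the frame. This is only a minimal sketch that rebuilds the df from the question; the variable names row, row_dtype, object_row and object_row_dtype are introduced here for illustration.

```python
import pandas as pd

# Same frame as in the question: an int column and a float column.
df = pd.DataFrame([[1, 1.3], [2, 2.4]], columns=['a', 'b'])

# A single row is one Series, so it needs one common dtype for both values.
row = df.iloc[0]
row_dtype = str(row.dtype)

# With object dtype, the row can hold an int next to a float unchanged.
object_row = df.astype(object).iloc[0]
object_row_dtype = str(object_row.dtype)

print(row_dtype, type(row['a']))
print(object_row_dtype, type(object_row['a']))
```

Here row_dtype comes out as float64, which is why the 'd' format code is rejected for row['a'].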

To avoid this, you can convert the DataFrame to object dtype with DataFrame.astype, so each row keeps its original int and float values:

df.astype(object).apply(lambda s: "{a:d}{b:f}".format(**s), axis=1)
0    11.300000
1    22.400000
dtype: object

df.astype(object).apply(lambda x: (type(x['a']), type(x['b'])), axis=1)
0    (<class 'int'>, <class 'float'>)
1    (<class 'int'>, <class 'float'>)
dtype: object

Upvotes: 1

BENY

Reputation: 323306

Since each row reaches the lambda as floats, one fix is to format column 'a' with .0f instead of d:

df.apply(lambda s: "{a:.0f}{b:f}".format(**s), axis=1)
0    11.300000
1    22.400000
dtype: object
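Another option, if you would rather keep the d format code, is to avoid building a row Series at all and iterate over the columns in parallel. This is a sketch under the assumption that only columns 'a' and 'b' are needed; the name formatted is introduced here for illustration.

```python
import pandas as pd

# Same frame as in the question.
df = pd.DataFrame([[1, 1.3], [2, 2.4]], columns=['a', 'b'])

# zip walks the two columns side by side, so each value keeps the type of
# its own column and 'a' is never upcast to float.
formatted = pd.Series(
    ["{a:d}{b:f}".format(a=a, b=b) for a, b in zip(df['a'], df['b'])],
    index=df.index,
)

print(formatted)
```

Because no mixed-type row is created, the integers in 'a' are formatted directly rather than being converted to float and back.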

Upvotes: 3
