Reputation: 45261
Say I create a pandas DataFrame containing both int and float types:
>>> import pandas as pd
>>> df = pd.DataFrame([[1, 1.3], [2, 2.4]], columns=['a', 'b'])
>>> df
   a    b
0  1  1.3
1  2  2.4
It is clear that column 'a' is composed of numpy.int64 values:
>>> df.a.dtype
dtype('int64')
>>> df.a[0]
1
>>> type(df.a[0])
<class 'numpy.int64'>
...and I can use the d format specifier to format the values in column 'a':
>>> "{a:d}".format(a=df.a[0])
'1'
However, if I try to apply the same formatting row by row, I get an error saying that the values in column 'a' are floats, not ints:
>>> df.apply(lambda s: "{a:d}{b:f}".format(**s), axis=1)
Traceback (most recent call last):
...
ValueError: ("Unknown format code 'd' for object of type 'float'", 'occurred at index 0')
What is happening here?
Upvotes: 2
Views: 352
Reputation: 30920
With axis=1, apply passes each row to the function as a Series. A Series has a single dtype, so when a row mixes int and float values, the ints are upcast to float:
df.apply(lambda x: ( type(x['a']),type(x['b']) ),axis=1)
0 (<class 'numpy.float64'>, <class 'numpy.float6...
1 (<class 'numpy.float64'>, <class 'numpy.float6...
dtype: object
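The upcast can be seen without apply at all, since pulling out a single row builds the same kind of Series. A minimal sketch, assuming the df from the question (the variable names row, row_dtype and row_a_type are only illustrative):

```python
import pandas as pd

df = pd.DataFrame([[1, 1.3], [2, 2.4]], columns=['a', 'b'])

# On the frame itself, each column keeps its own dtype.
print(df.dtypes)

# A single row is a Series with one dtype, so the integer from
# column 'a' is upcast to float64 alongside the value from 'b'.
row = df.iloc[0]
row_dtype = str(row.dtype)
row_a_type = type(row['a']).__name__
print(row_dtype, row_a_type)
```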
To avoid this, you can convert the DataFrame to object dtype with DataFrame.astype, so each value keeps its own Python type when the row is built:
df.astype(object).apply(lambda s: "{a:d}{b:f}".format(**s), axis=1)
0 11.300000
1 22.400000
dtype: object
df.astype(object).apply(lambda x: ( type(x['a']),type(x['b']) ),axis=1)
0 (<class 'int'>, <class 'float'>)
1 (<class 'int'>, <class 'float'>)
dtype: object
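Another way to sidestep the row-wise upcast is to skip apply and walk the columns side by side, so the values never pass through a mixed-dtype row. A minimal sketch, assuming the same df as in the question (the name formatted is only illustrative):

```python
import pandas as pd

df = pd.DataFrame([[1, 1.3], [2, 2.4]], columns=['a', 'b'])

# Zip the columns so each value keeps its column's type;
# no per-row Series is ever built, so nothing is upcast.
formatted = pd.Series(
    ["{a:d}{b:f}".format(a=a, b=b) for a, b in zip(df['a'], df['b'])],
    index=df.index,
)
print(formatted.tolist())
```

This keeps the original d specifier and leaves the DataFrame's dtypes unchanged.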
Upvotes: 1
Reputation: 323306
Since the values in column 'a' arrive as floats, one fix is to format them with .0f instead of d:
df.apply(lambda s: "{a:.0f}{b:f}".format(**s), axis=1)
0 11.300000
1 22.400000
dtype: object
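If you would rather keep the d specifier, a possible alternative is to cast the value back with int() inside the lambda. This is only a sketch, assuming the df from the question. Note that int() truncates, which is safe here only because column 'a' held whole numbers to begin with, and integers beyond 2**53 may already have lost precision in the float upcast.

```python
import pandas as pd

df = pd.DataFrame([[1, 1.3], [2, 2.4]], columns=['a', 'b'])

# s['a'] arrives as numpy.float64 because of the row-wise upcast;
# int() converts it back so the 'd' format code is valid again.
result = df.apply(
    lambda s: "{a:d}{b:f}".format(a=int(s['a']), b=s['b']),
    axis=1,
)
print(result.tolist())
```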
Upvotes: 3