Reputation: 192
I want to round the numbers in one column to the number of decimal places given in another column (the number of places differs from row to row), using pandas.
My data:
numbers decimal
1.2345 2
2.3456 3
3.4567 2
Expected output:
numbers decimal newcolA
0 1.2345 2 1.23
1 2.3456 3 2.346
2 3.4567 2 3.46
My code #1
import pandas as pd
df = pd.DataFrame(
    data={
        'numbers': [1.2345, 2.3456, 3.4567],
        'decimal': [2, 3, 2]
    }
)
df['newcolA'] = round(df['numbers'] , df['decimal'])
I get the following error: TypeError: cannot convert the series to <class 'int'>
However, the following similar code works
My code #2
import pandas as pd
df = pd.DataFrame(
    data={
        'numbers': [1.2345, 2.3456, 3.4567],
        'decimal': [2, 3, 2]
    }
)
df['newcolB'] = df['numbers']*df['decimal'] #The only difference
df
numbers decimal newcolB
0 1.2345 2 2.4690
1 2.3456 3 7.0368
2 3.4567 2 6.9134
What am I not understanding? Why does code #2 work, but not code #1?
Upvotes: 1
Views: 231
Reputation: 192
I have adapted @Corralien's solution so that it does not introduce any superfluous zeros. (The numbers are all non-integer real numbers greater than 1, so I believe my noZero function is correct.)
import pandas as pd
from math import log10, floor
def noZero(x, rd):  # rd digits at right
    ld = floor(log10(x)) + 1  # ld digits at left
    x = round(x, rd)
    fmt = "{:<0" + str(ld + 1 + rd) + "." + str(rd) + "f}"
    x_str = fmt.format(x)
    return x_str
df = pd.DataFrame(
    data={
        'numbers': [1.2345, 2.3456, 3.4567, 3.399],
        'decimal': [2, 3, 2, 2]
    }
)
df['newcolA'] = df.apply(lambda x: round(x['numbers'], int(x['decimal'])), axis=1)
df['happy :)'] = df.apply(lambda x: noZero(x['newcolA'], int(x['decimal'])), axis=1)
df
numbers decimal newcolA happy :)
0 1.2345 2 1.230 1.23
1 2.3456 3 2.346 2.346
2 3.4567 2 3.460 3.46
3 3.3990 2 3.400 3.40
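If the goal is only a string with exactly the requested number of decimals, a shorter route may be a nested precision in the format spec. This is a minimal sketch, not a drop-in replacement for noZero: the helper name fixed_decimals and the column name as_text are made up for illustration, and it formats the original numbers directly rather than the already-rounded column.

```python
import pandas as pd

df = pd.DataFrame(
    data={
        'numbers': [1.2345, 2.3456, 3.4567, 3.399],
        'decimal': [2, 3, 2, 2]
    }
)


def fixed_decimals(x, rd):
    # Hypothetical helper: format x with exactly rd digits after the point.
    # The inner {rd} is substituted first, giving e.g. "{x:.2f}".
    return f"{x:.{rd}f}"


# tolist() yields plain Python floats and ints, so no int() cast is needed.
df['as_text'] = [
    fixed_decimals(n, d)
    for n, d in zip(df['numbers'].tolist(), df['decimal'].tolist())
]
```

Because the width is not computed from log10, this version should not depend on the numbers being greater than 1.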
Upvotes: 0
Reputation: 38415
Since your question is mostly about understanding the two different behaviors, I will focus on that.
Code #1
df['newcolA'] = round(df['numbers'] , df['decimal'])
df['numbers'] and df['decimal'] are both of type Series. Effectively, you are passing a Series as the number of digits to round, but it expects a single integer there. Hence the error: TypeError: cannot convert the series to <class 'int'>
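To make the difference concrete, here is a small sketch using the data from the question. It suggests that the Series being rounded is not the problem; the Series passed as the digit count is. The variable names same_digits and error_message are only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    data={
        'numbers': [1.2345, 2.3456, 3.4567],
        'decimal': [2, 3, 2]
    }
)

# A single integer digit count works: it is applied to every row.
same_digits = round(df['numbers'], 2)

# A Series as the digit count fails, because it cannot be turned into one int.
try:
    round(df['numbers'], df['decimal'])
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

After running this, same_digits holds the values rounded to two places, and error_message holds the text of the TypeError.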
Code #2
df['numbers']*df['decimal']
pandas supports vectorized, element-wise operations between two Series of the same length, so the multiplication is applied row by row.
Solution
There are multiple possible solutions; the most idiomatic would be to use apply (already posted by @Corralien).
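As one of the other possible solutions, the same element-wise arithmetic that makes code #2 work can be used for the rounding itself: scale each number by a power of ten, round to an integer, and scale back. This is only a sketch under the assumption that 'decimal' holds small non-negative integers; the name scale is made up here, and because it goes through floating-point multiplication it may differ from round() for values that sit very close to a tie.

```python
import pandas as pd

df = pd.DataFrame(
    data={
        'numbers': [1.2345, 2.3456, 3.4567],
        'decimal': [2, 3, 2]
    }
)

# Per-row scaling factor: 100, 1000, 100.
scale = 10 ** df['decimal']

# Scale up, round to the nearest whole number, then scale back down.
df['newcolA'] = (df['numbers'] * scale).round() / scale
```

This avoids calling a Python function once per row, which tends to matter only for large frames.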
Upvotes: 1
Reputation: 120479
Try:
>>> df.apply(lambda x: round(x['numbers'], int(x['decimal'])), axis=1)
0 1.230
1 2.346
2 3.460
dtype: float64
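A variation on the same idea, sketched here as an assumption rather than a tested recommendation, is to pair the two columns with zip instead of apply. Converting with tolist() first gives plain Python floats and ints, so the int() cast is not needed.

```python
import pandas as pd

df = pd.DataFrame(
    data={
        'numbers': [1.2345, 2.3456, 3.4567],
        'decimal': [2, 3, 2]
    }
)

# Round each number to the digit count found in the same row.
df['newcolA'] = [
    round(n, d)
    for n, d in zip(df['numbers'].tolist(), df['decimal'].tolist())
]
```

The column still displays as 1.230, 2.346, 3.460, because a float64 column is printed with one shared number of decimals.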
Upvotes: 2