hernanavella

Reputation: 5552

How to round pandas dataframe to fractions?

Given

import numpy as np
import pandas as pd

np.random.seed(1234)
df = pd.DataFrame({'A' : range(10), 'B' : np.random.randn(10), 'C' : np.random.randn(10)})

How can I round columns B and C to the nearest 0.25? This is what I tried:

def roundPartial (value, resolution):
    return round (value / resolution) * resolution
df[['B', 'C']].apply(roundPartial, 0.25)

But I get:

ValueError: No axis named 0.25 for object type <class 'pandas.core.frame.DataFrame'>

Upvotes: 2

Views: 928

Answers (1)

jezrael

Reputation: 863226

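The error occurs because the second positional parameter of DataFrame.apply is axis, so 0.25 is treated as an axis name. If you need to apply the function roundPartial with extra arguments, you can wrap it in a lambda: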

def roundPartial (value, resolution):
    return round (value / resolution) * resolution
print (df[['B', 'C']].apply(lambda x: roundPartial(x, 0.25)))
      B     C
0  0.50  1.25
1 -1.25  1.00
2  1.50  1.00
3 -0.25 -2.00
4 -0.75 -0.25
5  1.00  0.00
6  0.75  0.50
7 -0.75  0.25
8  0.00  1.25
9 -2.25 -1.50
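
As an alternative to the lambda, DataFrame.apply can also forward extra arguments itself, either through its args tuple or as keyword arguments. A minimal sketch, assuming a small hand-made frame called demo and a vectorized helper round_partial (both names are illustrative, not from the question):

```python
import pandas as pd

def round_partial(values, resolution):
    # Vectorized: operates on a whole column (Series) at once.
    return (values / resolution).round() * resolution

# Hypothetical sample values, not the question's random data.
demo = pd.DataFrame({'B': [0.47, -1.19, 1.43], 'C': [1.15, 0.95, -2.02]})

# Extra positional arguments are passed as a tuple via args=.
by_args = demo[['B', 'C']].apply(round_partial, args=(0.25,))

# Extra keyword arguments are forwarded to the function as well.
by_kwarg = demo[['B', 'C']].apply(round_partial, resolution=0.25)

print(by_args)
```

Both calls should give the same result here; which one to use is mostly a matter of taste.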

Another solution uses the Series.round method directly:

print (df[['B', 'C']].apply(lambda x: (x / 0.25).round()* 0.25))

      B     C
0  0.50  1.25
1 -1.25  1.00
2  1.50  1.00
3 -0.25 -2.00
4 -0.75 -0.25
5  1.00  0.00
6  0.75  0.50
7 -0.75  0.25
8  0.00  1.25
9 -2.25 -1.50

But for a larger DataFrame the fastest approach is to avoid apply altogether: divide the whole DataFrame by resolution with div, round, and multiply back with mul:

resolution = 0.25
print ((df[['B', 'C']].div(resolution)).round().mul(resolution))
# Equivalent, using operators instead of div/mul:
# print ((df[['B', 'C']] / resolution).round() * resolution)

      B     C
0  0.50  1.25
1 -1.25  1.00
2  1.50  1.00
3 -0.25 -2.00
4 -0.75 -0.25
5  1.00  0.00
6  0.75  0.50
7 -0.75  0.25
8  0.00  1.25
9 -2.25 -1.50
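
The snippets above only print the rounded values. If you want to keep them, one option is to assign the result back to the same columns. A minimal sketch, assuming you want a rounded copy (the names rounded and cols are illustrative) and want to leave column A untouched:

```python
import numpy as np
import pandas as pd

np.random.seed(1234)
df = pd.DataFrame({'A': range(10), 'B': np.random.randn(10), 'C': np.random.randn(10)})

resolution = 0.25
cols = ['B', 'C']

# Work on a copy so the original values stay available.
rounded = df.copy()

# Vectorized rounding to the nearest multiple of resolution, no apply needed.
rounded[cols] = (df[cols] / resolution).round() * resolution

print(rounded)
```

Note that round follows NumPy and rounds exact halves to the nearest even value, so a value lying exactly halfway between two multiples of 0.25 will not always round up.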

Timings (len(df) = 100k):

df = pd.concat([df]*10000).reset_index(drop=True)

In [125]: %timeit (df[['B', 'C']].apply(lambda x: (x / resolution).round()* resolution))
10 loops, best of 3: 29 ms per loop

In [126]: %timeit ((df[['B', 'C']] / resolution).round() * resolution)
10 loops, best of 3: 22.5 ms per loop

In [127]: %timeit ((df[['B', 'C']].div(resolution)).round().mul(resolution))
10 loops, best of 3: 22.6 ms per loop
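
If you want to repeat this comparison outside IPython, something like the following should work with the standard timeit module. This is only a sketch: the names small, big, candidates and timings are my own, and the numbers you get will depend on your machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

np.random.seed(1234)
small = pd.DataFrame({'A': range(10), 'B': np.random.randn(10), 'C': np.random.randn(10)})
big = pd.concat([small] * 10000).reset_index(drop=True)  # 100k rows
resolution = 0.25

# The three variants timed above, wrapped as zero-argument callables.
candidates = {
    'apply + lambda': lambda: big[['B', 'C']].apply(lambda x: (x / resolution).round() * resolution),
    'operators': lambda: (big[['B', 'C']] / resolution).round() * resolution,
    'div/mul': lambda: big[['B', 'C']].div(resolution).round().mul(resolution),
}

timings = {}
for name, func in candidates.items():
    # Best of 3 repeats of 10 calls each, converted to milliseconds per call.
    best = min(timeit.repeat(func, number=10, repeat=3)) / 10
    timings[name] = best * 1000
    print(f'{name}: {timings[name]:.1f} ms per loop')
```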

Upvotes: 2
