Reputation: 14415
I am trying to find a clever, straightforward way to round a column of integer data to a given set of values. Here is my example data:
import pandas as pd
df = pd.Series([11,16,21, 125])
Say I want to round each value up to the nearest entry in a custom list of values, using an as-yet-undefined function called custom_round. When the rounding is applied, each value in the original series is rounded up to the nearest value in the list or, if it is already larger than the maximum, brought down to that maximum.
rounding_logic = pd.Series([15, 20, 100])
df = pd.Series([11, 16, 21, 125]).apply(lambda x: custom_round(x, round_list=rounding_logic))
Afterwards, df is pd.Series([15, 20, 100, 100]).
I can think of some horribly inefficient and ugly if/else statements, but I suspect there must be a far easier approach. I have seen custom rounding answers where the objective is to round to the nearest n given some base number, e.g. Pandas round to the nearest "n", but nothing that solves my particular case.
Upvotes: 1
Views: 1481
Reputation: 17824
You can use the function cut:
import numpy as np
import pandas as pd

df = pd.Series([11, 16, 21, 125])
rounding_logic = pd.Series([15, 20, 100])
labels = rounding_logic.tolist()
# Series.append was removed in pandas 2.0, so build the bin edges as a plain list
bins = [-np.inf] + labels  # add -infinity as the leftmost edge
# get bins, assign labels and fill all values that are greater than 100 with 100
pd.cut(df, bins, labels=labels).fillna(labels[-1])
Output:
0 15
1 20
2 100
3 100
dtype: category
Categories (3, int64): [15 < 20 < 100]
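If plain integers are needed rather than a categorical result, the same cut call can be wrapped in a small helper. This is only a sketch: the name round_up_with_cut is hypothetical, and it assumes the rounding values are unique and sorted in ascending order.

```python
import numpy as np
import pandas as pd


def round_up_with_cut(values, round_list):
    """Hypothetical helper: round each value up to the nearest entry of round_list."""
    labels = list(round_list)
    bins = [-np.inf] + labels  # assumes round_list is sorted ascending and unique
    binned = pd.cut(values, bins, labels=labels)
    # Values above the largest bin edge come back as NaN; cap them at the maximum.
    return binned.fillna(labels[-1]).astype(int)


result = round_up_with_cut(pd.Series([11, 16, 21, 125]), [15, 20, 100])
print(result.tolist())
```

Because the bins are closed on the right, a value that already equals one of the rounding values is left unchanged.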
Upvotes: 3
Reputation: 1381
It still has to be a bit messy because of the "if it's larger than the maximum" case:
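import numpy as np
import pandas as pd
def custom_round(col, rounding_vals=np.array([15, 20, 100])):
    if isinstance(col, pd.Series):
        col = col.to_numpy()
    # One column per threshold (<= keeps exact matches in place); the all-True last column caps at the maximum
    fits = np.concatenate([col[:, None] <= rounding_vals[None, :-1], np.ones((len(col), 1), dtype=bool)], axis=1)
    return rounding_vals[np.argmax(fits, axis=1)]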
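A more compact variant of the same idea, offered as an alternative sketch rather than as part of the answer above, uses np.searchsorted to find the first rounding value that is greater than or equal to each input, then clips the index so that oversized inputs map to the maximum. The name custom_round_searchsorted is hypothetical, and the rounding values are assumed to be sorted in ascending order.

```python
import numpy as np
import pandas as pd


def custom_round_searchsorted(col, rounding_vals=(15, 20, 100)):
    """Hypothetical alternative: round up to the nearest rounding value, capped at the maximum."""
    rounding_vals = np.asarray(rounding_vals)  # assumed sorted ascending
    # side="left" returns the index of the first rounding value >= each input.
    idx = np.searchsorted(rounding_vals, np.asarray(col), side="left")
    # Inputs above the maximum get an out-of-range index; clip it to the last entry.
    idx = np.minimum(idx, len(rounding_vals) - 1)
    return rounding_vals[idx]


rounded = custom_round_searchsorted(pd.Series([11, 16, 21, 125]))
print(rounded.tolist())
```

This avoids building the intermediate boolean matrix, which may matter when the series or the list of rounding values is large.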
Upvotes: 1