Reputation: 954
I've searched the pandas documentation and cookbook recipes and it's clear you can round to the nearest decimal place easily using dataframe.columnName.round(decimalplace)
.
How do you do this with larger numbers?
Example, I have a column of housing prices and I want them rounded to the nearest 10000 or 1000 or whatever.
df.SalesPrice.WhatDoIDo(1000)?
Upvotes: 13
Views: 13666
Reputation: 87
My favorite, dynamic way to do this:
ds: pd.Series to "round"
x: int/float of the power to round
# Define rounding lambda function:
my_rounder = lambda ds, x: ((ds + 0.5*10**x) // 10**x) * 10**x
# Apply lambda function to "prices" values:
housing_df["rounded_prices"] = my_rounder(housing_df["prices"], 3)
# If you need to force/ensure no decimal:
housing_df["rounded_prices"] = housing_df["rounded_prices"].apply(int)
Alternative floor rounder:
my_floor_rounder = lambda ds, x: (ds // 10**x) * 10**x
Breakdown:
print(housing_df["prices"].head())
year
2010 372560.0
2011 374507.0
2012 376454.0
2013 378401.0
2014 380348.0
Name: prices, dtype: float64
# This step can be omitted if you're finding the floor:
step_up = housing_df["prices"] + 0.5*10**3
print(step_up.head())
year
2010 373060.0
2011 375007.0
2012 376954.0
2013 378901.0
2014 380848.0
Name: prices, dtype: float64
thsnd = step_up // 10**3
print(thsnd.head())
year
2010 373.0
2011 375.0
2012 376.0
2013 378.0
2014 380.0
Name: prices, dtype: float64
rounded = thsnd * 10**3
print(rounded.head())
year
2010 373000.0
2011 375000.0
2012 376000.0
2013 378000.0
2014 380000.0
Name: prices, dtype: float64
int_rounded = rounded.apply(int)
print(int_rounded.head())
year
2010 373000
2011 375000
2012 376000
2013 378000
2014 380000
Name: prices, dtype: int64
Upvotes: 0
Reputation: 5109
Another interesting "hack" is this: Let's say you want to round off to the nearest 100s. You can add 50, then divide by 100, convert to integer, multiply back by 100.
df = pd.DataFrame({'val':[1005,1299,1301,4109]})
df.val.round(-2) # Proper way
((df.val+50)/100).astype(int)*100 # Hack
Gives you this, as desired:
[1000, 1300, 1300, 4100]
Upvotes: 1
Reputation: 3077
Function round does accept negative values for cases in which you want to specify precision to the left of the decimal point:
dataframe.columnName.round(-3)
Example:
>>> pd.Series([1, 500, 500.1, 999, 1500, 1501, 946546]).round(-3)
0 0.0
1 0.0
2 1000.0
3 1000.0
4 2000.0
5 2000.0
6 947000.0
dtype: float64
Upvotes: 3
Reputation: 4623
By using the notation df.ColumnName.round()
, you are actually calling pandas.Series.round
, the documentation of which specifies:
decimals : int
Number of decimal places to round to (default: 0). If decimals is negative, it specifies the number of positions to the left of the decimal point.
So you can do:
df = pd.DataFrame({'val':[1,11,130,670]})
df.val.round(decimals=-2)
This produces the output:
0 0
1 0
2 100
3 700
Name: val, dtype: int64
decimals=-3
rounds to the 1000s, and so on. Notably, it also works using pandas.DataFrame.round()
, though the documentation doesn't tell you:
df = pd.DataFrame({'val':[1,11,130,670], 'x':[1,11,150,900]})
df.round({'val':-2})
This will round the column val
to the nearest 100, but leave x
alone.
Upvotes: 18
Reputation: 323386
You can try this
df = pd.DataFrame({'val':[1,11,130,670]})
10**df.val.astype(str).str.len()
Out[27]:
0 10
1 100
2 1000
3 1000
Name: val, dtype: int64
Upvotes: 1