Angelo

Reputation: 954

Round to nearest 1000 in pandas

I've searched the pandas documentation and cookbook recipes, and it's clear you can easily round to a given number of decimal places using dataframe.columnName.round(decimalplace).

How do you do this with larger numbers?

For example, I have a column of housing prices and I want them rounded to the nearest 10000, 1000, or whatever.

df.SalesPrice.WhatDoIDo(1000)? 

Upvotes: 13

Views: 13666

Answers (5)

JGarcia

Reputation: 87

My favorite, dynamic way to do this:

ds: the pd.Series to round
x: int/float, the power of ten to round to (e.g. 3 for the nearest 1000)

# Define rounding lambda function:
my_rounder = lambda ds, x: ((ds + 0.5*10**x) // 10**x) * 10**x

# Apply lambda function to "prices" values:
housing_df["rounded_prices"] = my_rounder(housing_df["prices"], 3)

# If you need to force/ensure no decimal:
housing_df["rounded_prices"] = housing_df["rounded_prices"].apply(int)

Alternative floor rounder:

my_floor_rounder = lambda ds, x: (ds // 10**x) * 10**x

Breakdown:

print(housing_df["prices"].head())

year
2010    372560.0
2011    374507.0
2012    376454.0
2013    378401.0
2014    380348.0
Name: prices, dtype: float64
    
# This step can be omitted if you're finding the floor:
step_up = housing_df["prices"] + 0.5*10**3
print(step_up.head())

year
2010    373060.0
2011    375007.0
2012    376954.0
2013    378901.0
2014    380848.0
Name: prices, dtype: float64

thsnd = step_up // 10**3
print(thsnd.head())

year
2010    373.0
2011    375.0
2012    376.0
2013    378.0
2014    380.0
Name: prices, dtype: float64

rounded = thsnd * 10**3
print(rounded.head())

year
2010    373000.0
2011    375000.0
2012    376000.0
2013    378000.0
2014    380000.0
Name: prices, dtype: float64

int_rounded = rounded.apply(int)
print(int_rounded.head())

year
2010    373000
2011    375000
2012    376000
2013    378000
2014    380000
Name: prices, dtype: int64
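
The lambda above only handles powers of ten. If the step is something like 2500, the same add-half-then-floor idea can be written against an arbitrary step. This is a minimal sketch, not part of the original answer: the helper name round_to_step and the 2500 step are made up for illustration, and the prices are the five values printed above.

```python
import numpy as np
import pandas as pd


def round_to_step(ds: pd.Series, step: float) -> pd.Series:
    """Round each value to the nearest multiple of `step` (halves go up)."""
    # Hypothetical helper: shift by half a step, floor, then scale back.
    return np.floor(ds / step + 0.5) * step


prices = pd.Series([372560.0, 374507.0, 376454.0, 378401.0, 380348.0])

# Same result as my_rounder(prices, 3), cast to integers.
rounded_1000 = round_to_step(prices, 1000).astype("int64")

# A step that is not a power of ten.
rounded_2500 = round_to_step(prices, 2500).astype("int64")

print(rounded_1000.tolist())
print(rounded_2500.tolist())
```

With step=1000 this should reproduce the int_rounded values shown in the breakdown.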

Upvotes: 0

FatihAkici

Reputation: 5109

Another interesting "hack" is this: say you want to round to the nearest 100. You can add 50, divide by 100, convert to integer, and multiply back by 100.

import pandas as pd

df = pd.DataFrame({'val': [1005, 1299, 1301, 4109]})
df.val.round(-2)  # Proper way
((df.val + 50) / 100).astype(int) * 100  # Hack

Gives you this, as desired:

[1000, 1300, 1300, 4100]
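
One caveat worth checking before relying on the hack: astype(int) truncates toward zero, so the hack and round(-2) can disagree on negative values. A small sketch with made-up values (the series s is hypothetical, not from the question):

```python
import pandas as pd

# Hypothetical values: two negatives near -150 and two positives.
s = pd.Series([-151.0, -149.0, 1005.0, 1299.0])

proper = s.round(-2)
hack = ((s + 50) / 100).astype(int) * 100

# Side-by-side comparison of the two approaches.
print(pd.DataFrame({"value": s, "round(-2)": proper, "hack": hack}))
```

The positive rows agree, while the negative rows come out one step closer to zero with the hack.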

Upvotes: 1

grovina

Reputation: 3077

The round function does accept negative values, for cases in which you want to specify precision to the left of the decimal point:

dataframe.columnName.round(-3)

Example:

>>> pd.Series([1, 500, 500.1, 999, 1500, 1501, 946546]).round(-3)
0         0.0
1         0.0
2      1000.0
3      1000.0
4      2000.0
5      2000.0
6    947000.0
dtype: float64
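
Note that 500 went to 0 in the output above while 1500 went to 2000: exact halves are rounded to the nearest even multiple. A short sketch that isolates this on made-up tie values, and casts the float result to integers (the names ties and as_int are only for illustration):

```python
import pandas as pd

# Hypothetical values that sit exactly halfway between two multiples of 1000.
ties = pd.Series([500.0, 1500.0, 2500.0, 3500.0])

rounded = ties.round(-3)
print(rounded.tolist())  # [0.0, 2000.0, 2000.0, 4000.0]

# The result keeps the float dtype; cast if integers are wanted.
as_int = rounded.astype("int64")
print(as_int.tolist())  # [0, 2000, 2000, 4000]
```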

Upvotes: 3

PaSTE

Reputation: 4623

By using the notation df.ColumnName.round(), you are actually calling pandas.Series.round, the documentation of which specifies:

decimals : int

Number of decimal places to round to (default: 0). If decimals is negative, it specifies the number of positions to the left of the decimal point.

So you can do:

df = pd.DataFrame({'val':[1,11,130,670]})
df.val.round(decimals=-2)

This produces the output:

0      0
1      0
2    100
3    700
Name: val, dtype: int64

decimals=-3 rounds to the 1000s, and so on. Notably, it also works using pandas.DataFrame.round(), though the documentation doesn't tell you:

df = pd.DataFrame({'val':[1,11,130,670], 'x':[1,11,150,900]})
df.round({'val':-2})

This will round the column val to the nearest 100, but leave x alone.
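
To see the per-column behaviour concretely, here is a small sketch using the same frame. The second call, which rounds x to the nearest 10, is an extra illustration rather than part of the answer, and the names only_val and both are made up.

```python
import pandas as pd

df = pd.DataFrame({"val": [1, 11, 130, 670], "x": [1, 11, 150, 900]})

# Only the columns named in the dict are rounded.
only_val = df.round({"val": -2})
print(only_val)

# Each column can get its own precision.
both = df.round({"val": -2, "x": -1})
print(both)
```

Columns missing from the dict are returned unchanged, which is why x is untouched in the first result.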

Upvotes: 18

BENY

Reputation: 323386

You can try this, which raises 10 to the number of digits in each value:

df = pd.DataFrame({'val':[1,11,130,670]})
10**df.val.astype(str).str.len()
Out[27]: 
0      10
1     100
2    1000
3    1000
Name: val, dtype: int64

Upvotes: 1
