Sergey Ronin

Reputation: 776

Modify and round numbers in a pandas dataframe in Python

Long story short, I have a CSV file that I read into a pandas DataFrame. The file contains a weather report, but all of the temperature measurements are in Fahrenheit. I've figured out how to convert them to Celsius:

import pandas as pd

df = pd.read_csv('report.csv')
df['average temperature'] = (df['average temperature'] - 32) * 5/9

But then the values in this column have up to six decimal places. I've found code that rounds all of the data in the DataFrame, but I only need to round this column.

df.round(2)

I don't like that it has to be a separate statement on its own line, or that it rounds every column in my data. Is there a more elegant way to do this? And is there a way to apply it to other columns in my DataFrame, such as maximum temperature and minimum temperature, without copying the code above for each one?

Upvotes: 2

Views: 741

Answers (2)

jezrael

Reputation: 863166

To round only some columns, select them as a subset:

cols = ['maximum temperature','minimum temperature','average temperature']
df[cols] = df[cols].round(2)

To convert and round only the columns in a list:

cols = ['maximum temperature','minimum temperature','average temperature']
df[cols] = ((df[cols] - 32) * 5/9).round(2)

To round each column separately:

df['average temperature'] = df['average temperature'].round(2)
df['maximum temperature'] = df['maximum temperature'].round(2)
df['minimum temperature'] = df['minimum temperature'].round(2)
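
As a side note, `DataFrame.round` also accepts a dict that maps column names to a number of decimals, which may be the most direct way to round a single column without touching the rest. A minimal sketch, assuming made-up values and a hypothetical `humidity` column that is not in the question:

```python
import pandas as pd

# Made-up data for illustration; 'humidity' is a hypothetical extra column.
df = pd.DataFrame({
    'average temperature': [36.666667, 17.777778],
    'humidity': [0.123456, 0.654321],
})

# Only the columns named in the dict are rounded; the others are left as they are.
df = df.round({'average temperature': 2})
print(df)
```

Note that `round` returns a new DataFrame, so the result has to be assigned back.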

Sample:

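import numpy as np
import pandas as pd

# Random Fahrenheit readings plus two unrelated columns, a and b
df = (pd.DataFrame(np.random.randint(30, 100, (10, 3)),
                   columns=['maximum temperature','minimum temperature','average temperature'])
        .assign(a='m', b=range(10)))
print(df)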
   maximum temperature  minimum temperature  average temperature  a  b
0                   97                   60                   98  m  0
1                   64                   86                   64  m  1
2                   32                   64                   95  m  2
3                   60                   56                   93  m  3
4                   43                   89                   64  m  4
5                   40                   62                   86  m  5
6                   37                   40                   70  m  6
7                   61                   33                   46  m  7
8                   36                   44                   46  m  8
9                   63                   30                   33  m  9

cols = ['maximum temperature','minimum temperature','average temperature']
df[cols] = ((df[cols] - 32) * 5/9).round(2)
print(df)
   maximum temperature  minimum temperature  average temperature  a  b
0                36.11                15.56                36.67  m  0
1                17.78                30.00                17.78  m  1
2                 0.00                17.78                35.00  m  2
3                15.56                13.33                33.89  m  3
4                 6.11                31.67                17.78  m  4
5                 4.44                16.67                30.00  m  5
6                 2.78                 4.44                21.11  m  6
7                16.11                 0.56                 7.78  m  7
8                 2.22                 6.67                 7.78  m  8
9                17.22                -1.11                 0.56  m  9
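
To avoid repeating the conversion for every file or column list, the same expression could be wrapped in a small helper. This is only a sketch: the function name `fahrenheit_to_celsius` and its `decimals` parameter are assumptions made for illustration, and the data below is a fixed two-row stand-in for the random sample.

```python
import pandas as pd

def fahrenheit_to_celsius(df, cols, decimals=2):
    """Return a copy of df with the given columns converted from F to C and rounded."""
    out = df.copy()
    out[cols] = ((out[cols] - 32) * 5 / 9).round(decimals)
    return out

# Fixed stand-in data so the result is reproducible
df = pd.DataFrame({
    'maximum temperature': [97, 32],
    'minimum temperature': [60, 30],
    'average temperature': [98, 33],
    'a': ['m', 'm'],
})
cols = ['maximum temperature', 'minimum temperature', 'average temperature']

converted = fahrenheit_to_celsius(df, cols)
print(converted)
```

Working on a copy keeps the original Fahrenheit values available; drop the `copy()` call if modifying the DataFrame in place is preferred.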

Upvotes: 2

wij

Reputation: 1302

Here's a single-line solution using apply and a conversion function.

def convert_to_celsius(f):
    return 5.0 / 9.0 * (f - 32)

df[['Column A','Column B']] = df[['Column A','Column B']].apply(convert_to_celsius).round(2)
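
For anyone who wants to try this, here is a runnable sketch with placeholder data. `'Column A'` and `'Column B'` stand in for the real column names, and the `other` column is an assumption added to show that untouched columns keep their values. When `apply` is called on a DataFrame, the function receives each column as a whole Series, so the arithmetic stays vectorized.

```python
import pandas as pd

def convert_to_celsius(f):
    # f is a whole column (a Series) here, not a single number
    return 5.0 / 9.0 * (f - 32)

# Placeholder data; replace the column names with the real ones
df = pd.DataFrame({
    'Column A': [212, 32],
    'Column B': [50, 98],
    'other': [1.23456, 2.34567],
})

df[['Column A', 'Column B']] = df[['Column A', 'Column B']].apply(convert_to_celsius).round(2)
print(df)
```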

Upvotes: 0
