Mujeebur Rahman
Mujeebur Rahman

Reputation: 196

How to change DataFrame column values so that mean is modified accordingly?

I have a Pandas DataFrame extracted from Estespark Weather for the dates between Sep-2009 and Oct-2018, and the mean of the Average windspeed column is 4.65. I am taking a challenge where there is a sanity check where the mean of this column needed to be 4.64. How can I modify the values of this column so that the mean of this column becomes 4.64? Is there any code solution for this, or do we have to do it manually?

Upvotes: 0

Views: 33

Answers (1)

motrix
motrix

Reputation: 368

I can see two solutions:

    1. Substract 0.01 (4.65 - 4.64) to every value of that column like:
df['AvgWS'] -= 0.01
  • 2 If you dont want to alter all rows: find wich rows you can remove to give you the desired mean (if there are any):
current_mean = 4.65
desired_mean = 4.64
n_rows = len(df['AvgWS'])

df['can_remove'] = df['AvgWS'].map(lambda x: (current_mean*n_rows - x)/(n_rows-1) ==  4.64)

This will create a new boolean column in your dataframe with True in the rows that, if removed, make the rest of the column's mean = 4.64. If there are more than one you can analyse them to choose which one seems less important and then remove that one.

Upvotes: 1

Related Questions