Kovac
Kovac

Reputation: 21

Convert a Pandas DataFrame to a dictionary avg rain per month

I have to create a dictionary where the keys are months and the values are avg rain that month for a year (12 months) from this dataframe: https://i.sstatic.net/Qh9T0.png.

The end result (from a function called rain_months()) should look like this:

Input:

Rain_months()

Output:

{
    1: 102.5,
    2: 69.2,
    3: 80.0,
    4: 30.7,
    5: 70.3,
    6: 20.4,
    7: 10.5,
    8: 50.0,
    9: 78.7,
    10: 102.2,
    11: 90.5,
    12: 80.8,
}

How would I go about doing this the easiest way?

Upvotes: 1

Views: 112

Answers (2)

fischmalte
fischmalte

Reputation: 81

To get the average per month I first created a column which has a uniqe year-month identifier:

import pandas as pd
from datetime import datetime

def datetime_to_year_month(date: datetime) -> str:
    year = str(datetime.fromisoformat(date).year) 
    month = str(datetime.fromisoformat(date).strftime("%b"))
    return '-'.join([year, month])

df['Year-Month'] = df['Dygn'].apply(datetime_to_year_month)

Next, you can simply use pandas groupby method:

avg_df = df.groupby('Year-Month').mean()
result_df = avg_df.rename({"Nederbörd(mm)": "Average Neerbörd(mm)"}, axis=1)

Now convert it to a dictionary:

idx = range(0, len(result_df))
d = {number+1: result_df["Average Neerbörd (mm)"][number] for number in idx}

Upvotes: 0

EBDS
EBDS

Reputation: 1724

You didn't provide the data. I create my own to look like the one in your image.

daterange = pd.date_range('2021-01-01','2021-12-31',freq='D')
daterange
rainfall = np.random.rand(365)* 100
rainfall
df = pd.DataFrame({'StartUTC': daterange,'SlutUTC': daterange, 'Date': daterange,'Rain': rainfall})
df

Output

StartUTC    SlutUTC         Date        Rain
0   2021-01-01  2021-01-01  2021-01-01  69.711979
1   2021-01-02  2021-01-02  2021-01-02  22.747788
2   2021-01-03  2021-01-03  2021-01-03  54.689692
3   2021-01-04  2021-01-04  2021-01-04  67.699379
4   2021-01-05  2021-01-05  2021-01-05  17.526345
... ... ... ... ...
360 2021-12-27  2021-12-27  2021-12-27  30.267470
361 2021-12-28  2021-12-28  2021-12-28  76.786577
362 2021-12-29  2021-12-29  2021-12-29  29.693561
363 2021-12-30  2021-12-30  2021-12-30  61.648194
364 2021-12-31  2021-12-31  2021-12-31  83.260911
365 rows × 4 columns

In order not to change your original DataFrame, I create a new one with the 2 columns:

def rain_month(df):
    df1 = pd.concat([df['Date'],df['Rain']],axis=1)
    df1['Month'] = pd.to_datetime(df1.Date).dt.month
    df1 = df1.groupby('Month').sum()
    df1.reset_index()
    return df1.to_dict()['Rain']
    
rain_month(df)

Output

{1: 1727.5077622983708,
 2: 1148.8897677989883,
 3: 1232.8023689499066,
 4: 1766.9731141830662,
 5: 1699.0852472560787,
 6: 1427.3383187366533,
 7: 1813.864397566138,
 8: 1374.2262197672233,
 9: 1629.9294176340234,
 10: 1810.487273457173,
 11: 1708.2878840145643,
 12: 1594.6048147191548}

Upvotes: 1

Related Questions