KaSan
KaSan

Reputation: 47

Mean Date pandas

I have a too big dataframe with x, y, year, month, how I can create a new column with the mean of days by month as INPUT

X/Y/YEAR/MONTH
1/2/1990/2
1/2/1990/5
1/2/1990/6
4/8/2020/1
4/8/2020/2
4/8/2020/3
4/8/2020/4
3/7/2020/8

I tried using conditional if per month but so doing too big the code, is there other way to do that? EXPECTED OUTPUT

X/Y/YEAR/MONTH/DAYS
1/2/1990/2/14
1/2/1990/5/15.5
1/2/1990/6/
4/8/2020/1/15.5
4/8/2020/2/14
4/8/2020/3/15.5
4/8/2020/4/15
3/7/2020/8/15.5

Upvotes: 1

Views: 39

Answers (2)

David Erickson
David Erickson

Reputation: 16683

You can create a date series d and use MonthEnd with dt.day and divide by 2:

from pandas.tseries.offsets import MonthEnd
d = pd.to_datetime(df['YEAR'].astype(str) + '-' + df['MONTH'].astype(str) + '-01')
df['Days'] = (d + MonthEnd(1)).dt.day / 2
df
Out[1]: 
   X  Y  YEAR  MONTH  Days
0  1  2  1990      2  14.0
1  1  2  1990      5  15.5
2  1  2  1990      6  15.0
3  4  8  2020      1  15.5
4  4  8  2020      2  14.5
5  4  8  2020      3  15.5
6  4  8  2020      4  15.0
7  3  7  2020      8  15.5

Upvotes: 1

roddar92
roddar92

Reputation: 366

df.groupby(['MONTH']).mean().reset_index()

Upvotes: 0

Related Questions