sbiner

Reputation: 71

pandas resample dealing with missing data

I am using pandas to work with monthly data that has some missing values. I would like to use the resample method to compute annual statistics, but only for years with no missing data.

Here is some code and output to demonstrate:

import pandas as pd
import numpy as np
dates = pd.date_range(start = '1980-01', periods = 24,freq='M')
df = pd.DataFrame([np.nan] * 10 + list(range(14)), index=dates)

Here is what I obtain if I resample:

In [18]: df.resample('A')
Out[18]: 
          0
1980-12-31  0.5
1981-12-31  7.5

I would like to get np.nan for the 1980-12-31 index, since that year does not have a value for every month. I tried playing with the 'how' argument, but with no luck.

How can I accomplish this?

Upvotes: 5

Views: 2161

Answers (1)

acushner

Reputation: 9946

I'm sure there's a better way, but in this case you can use:

df.resample('A', how=[np.mean, pd.Series.count, len])

and then drop all rows where count != len, i.e. the years in which at least one monthly value is missing.
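The `how` argument was deprecated and later removed from `resample`, so the call above only runs on old pandas versions. Below is a minimal sketch of the same idea (mean, count, length, then mask) for more recent releases. It is an illustration, not the answerer's code: the names `s`, `yearly`, `stats` and `complete` are made up here, and it assumes the start-anchored aliases "MS" and "YS", so the result is labelled 1980-01-01 and 1981-01-01 rather than with year-end dates.

```python
import numpy as np
import pandas as pd

# Same data as in the question: 10 missing months, then the values 0..13.
# Assumption: month-start ("MS") dates instead of the month-end dates used above.
dates = pd.date_range(start="1980-01-01", periods=24, freq="MS")
s = pd.Series([np.nan] * 10 + list(range(14)), index=dates, dtype=float)

yearly = s.resample("YS")      # one bin per calendar year, labelled by its first day
stats = pd.DataFrame({
    "mean": yearly.mean(),     # mean of the non-missing values
    "count": yearly.count(),   # number of non-missing values
    "size": yearly.size(),     # number of rows, missing or not
})

# Keep the mean only where every row in the year has a value; NaN otherwise.
complete = stats["mean"].where(stats["count"] == stats["size"])
print(complete)
```

Note that comparing count with size only catches months stored as NaN. If a month may be absent from the index altogether, comparing count with 12 is probably the safer check.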

Upvotes: 2
