Reputation: 345
I want to get the mean value for every 7 rows of a column in a file excluding from the mean specific values (i.e.-9999 and 0)
a sample of the file is
1 567
2 -9999
3 98
4 77
5 0
6 89
7 77
8 567
9 -9999
10 87
11 0
12 10
13 79
14 75
I get the mean value for every 7 rows as following:
df=pd.read_csv(e,sep='\t',usecols=[1],skiprows=(0),header=None)
data_mean=df.groupby(np.arange(len(df))//7).mean()
data_mean.index=data_week.index+1
data_mean.to_csv(outfile,sep="\t",header=False)
In case I would like only to get the mean value of the whole column and use a mask I would do
data_mean = df.mask(df.isin([-9999, 0])).mean(1) #to exclude 0 and -9999
How I could use the mask in the first code?
Upvotes: 0
Views: 164
Reputation: 8778
Try this:
df.mask(df.isin([-9999, 0])).groupby(np.arange(len(df.index))//7).mean()
Upvotes: 1
Reputation: 120499
You can replace -9999 and 0 by NaN:
df[df.isin([-9999, 0])] = np.NaN
then apply your first code:
>>> df.groupby(np.arange(len(df))//7).mean()
1
0 181.6
1 163.6
Upvotes: 1