Nat
Nat

Reputation: 345

How to mask values when grouping rows with pandas?

I want to get the mean value for every 7 rows of a column in a file excluding from the mean specific values (i.e.-9999 and 0)

a sample of the file is

1 567
2 -9999
3 98
4 77
5 0 
6 89
7 77
8 567
9 -9999
10 87
11 0
12 10 
13 79
14 75

I get the mean value for every 7 rows as following:

df=pd.read_csv(e,sep='\t',usecols=[1],skiprows=(0),header=None)
data_mean=df.groupby(np.arange(len(df))//7).mean() 
data_mean.index=data_week.index+1
data_mean.to_csv(outfile,sep="\t",header=False)

In case I would like only to get the mean value of the whole column and use a mask I would do

data_mean = df.mask(df.isin([-9999, 0])).mean(1) #to exclude 0 and -9999 

How I could use the mask in the first code?

Upvotes: 0

Views: 164

Answers (2)

rhug123
rhug123

Reputation: 8778

Try this:

df.mask(df.isin([-9999, 0])).groupby(np.arange(len(df.index))//7).mean()

Upvotes: 1

Corralien
Corralien

Reputation: 120499

You can replace -9999 and 0 by NaN:

df[df.isin([-9999, 0])] = np.NaN

then apply your first code:

>>> df.groupby(np.arange(len(df))//7).mean()
       1
0  181.6
1  163.6

Upvotes: 1

Related Questions