LeroxXx

Reputation: 15

Mean of columns with ddply without considering 0 values

I have a data frame with Date and Values columns. I found some code that calculates the mean of all Values with the same Date:

MeanValues = ddply(df, .(Date), summarize, Values = mean(Values))

My problem is that it includes the 0 values, which are really missing values (NA). Is there an easy way to modify this code so that it excludes 0 or NA values?

I appreciate you taking the time to help me, thank you.

Upvotes: 1

Views: 1241

Answers (1)

Spacedman

Reputation: 94267

Let's create some sample data:

df = structure(list(Date = structure(c(17115, 17116, 17115, 17115, 
17115, 17115, 17115, 17116, 17115, 17116), class = "Date"), Values = c(12, 
NA, 13, 15, 18, 14, 17, 11, 20, 19)), .Names = c("Date", "Values"
), row.names = c(NA, -10L), class = "data.frame")

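Just filter out the zeroes in some way. A comparison with NA yields NA, so also pass na.rm=TRUE to drop any NA values that slip through the filter, such as:

> MeanValues = ddply(df, .(Date), summarize, Values = mean(Values[Values>0], na.rm=TRUE))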

But it is probably better to replace the 0 values with NA at an earlier stage, then use na.rm=TRUE in the mean call:

> df$Values[df$Values==0]=NA

> MeanValues = ddply(df, .(Date), summarize, Values = mean(Values,na.rm=TRUE))
> MeanValues
        Date   Values
1 2016-11-10 15.57143
2 2016-11-11 15.00000
> 
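
For anyone who wants to try this without plyr installed, here is a minimal base R sketch of the same idea. The data frame df2 and the result name MeanValues2 are made up for illustration (the sample df above has no zeroes in it), and aggregate() is swapped in for ddply(), so treat it as one possible equivalent rather than the only way to do it.

```r
# Hypothetical sample data: one 0 and one NA, both meant to be ignored.
df2 <- data.frame(
  Date = as.Date(c("2016-11-10", "2016-11-10", "2016-11-10",
                   "2016-11-11", "2016-11-11", "2016-11-11")),
  Values = c(12, 0, 18, NA, 11, 20)
)

# Treat zeroes as missing. which() keeps NA comparisons out of the index.
df2$Values[which(df2$Values == 0)] <- NA

# Mean of Values per Date, ignoring NA.
# na.action = na.pass keeps the NA rows so that na.rm = TRUE deals with them.
MeanValues2 <- aggregate(Values ~ Date, data = df2, FUN = mean,
                         na.rm = TRUE, na.action = na.pass)
print(MeanValues2)  # group means: 15 for 2016-11-10, 15.5 for 2016-11-11
```

The zero in the first group and the NA in the second are both left out of the averages, which is the same result the ddply call with na.rm=TRUE gives after the zeroes have been replaced.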

Upvotes: 1
