datakritter

Reputation: 600

Finding "median" without averaging middle values

I want to extract an actual date from my dataset that is more or less in the middle. median() works fine if I have an odd number of measurements. However, if there is an even number of measurements, it averages the middle two, which produces a date that isn't in my dataset.

For the following example:

mydates <- as.Date(c("2016-02-18", "2016-03-30", "2016-05-31", "2016-08-19"))
median(mydates)

...what can I do to get R to return either "2016-03-30" or "2016-05-31", instead of "2016-04-30", which isn't in my original data set?

I don't care whether it is the earlier or the later date, as long as it is consistent.

Upvotes: 0

Views: 1673

Answers (2)

Gregor Thomas

Reputation: 145775

Use the quantile function and set type to 1, 3 or 4. With this data, each of them returns the earlier of the two middle dates:

> quantile(mydates, p = 0.5, type = 1)
         50% 
"2016-03-30" 
> quantile(mydates, p = 0.5, type = 3)
         50% 
"2016-03-30" 
> quantile(mydates, p = 0.5, type = 4)
         50% 
"2016-03-30" 

See ?quantile for details.
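One caveat, going by the definitions in ?quantile: type 4 is an interpolating definition, so it is not guaranteed to return an observed value for every sample size, whereas type 1 (the inverse of the empirical CDF) always picks an element of the data. The sketch below checks type = 1 on both an even-length and an odd-length vector. The fifth date is made up for illustration, and the variable names are assumptions, not part of the question.

```r
# The question's four dates, plus one made-up date to get an odd-length vector
even_dates <- as.Date(c("2016-02-18", "2016-03-30", "2016-05-31", "2016-08-19"))
odd_dates  <- c(even_dates, as.Date("2016-10-01"))

# type = 1 selects an observed value; names = FALSE drops the "50%" label
q_even <- quantile(even_dates, probs = 0.5, type = 1, names = FALSE)
q_odd  <- quantile(odd_dates,  probs = 0.5, type = 1, names = FALSE)

# Even length: the result should be one of the original dates
q_even %in% even_dates

# Odd length: the result should agree with the ordinary median
q_odd == median(odd_dates)
```

If you only ever need the middle value, type = 1 looks like the safest of the three choices, since it should behave sensibly whether the count is odd or even.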

Upvotes: 6

James

Reputation: 66834

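If you have an even number of values, trim one value from either end of the sorted data and take the median of what remains (the remainder has an odd length, so nothing is averaged):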

#later date
median(sort(mydates)[-1])
[1] "2016-05-31"
#earlier date
median(sort(mydates)[-length(mydates)])
[1] "2016-03-30"
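The trimming trick only helps when the length is even. With an odd length, trimming leaves an even number of values and median() averages again. A minimal sketch of a wrapper that handles both cases by indexing into the sorted vector directly is shown below. middle_value and its side argument are hypothetical names introduced here, not base R functions.

```r
# Hypothetical helper: return the lower or upper middle element of x.
# For odd lengths both choices give the ordinary median.
middle_value <- function(x, side = c("lower", "upper")) {
  side <- match.arg(side)
  x <- sort(x)                       # sort() drops NA values by default
  n <- length(x)
  if (n == 0L) stop("x has no non-missing values")
  idx <- if (side == "lower") ceiling(n / 2) else floor(n / 2) + 1
  x[idx]
}

mydates <- as.Date(c("2016-02-18", "2016-03-30", "2016-05-31", "2016-08-19"))
middle_value(mydates)             # earlier of the two middle dates
middle_value(mydates, "upper")    # later of the two middle dates
```

Because the result comes from subsetting rather than arithmetic, it keeps the Date class and is always a value from the original data.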

Upvotes: 1
