carl whyte

Reputation: 103

apply function in vectorization of a vector inside another vector

year  1999 1999 1999 2003 2003 2005 2005 2005 2005 2007 2009 2009 2009
A1      15    7   24    6   65    5   89   56   21   15   19    7   23

The table above shows a data frame. I want a vector, let's say "median1", that holds the median of the A1 values for each year. I know this is easy with a for loop, but I am looking for a vectorized solution.

Upvotes: 1

Views: 64

Answers (4)

Rich Scriven

Reputation: 99331

It's not clear to me, but it seems like you want the median of each year? If so...

## set up the data
> year <- c(1999,1999,1999,2003,2003,2005,2005,2005,2005,2007,2009,2009,2009)
> A1 <- c(15, 7, 24, 6, 65, 5, 89, 56, 21, 15, 19, 7, 23)
> dd <- data.frame(year, A1)

## solution
## split() labels each group by year, so sapply() returns a named vector
> xx <- sapply(split(dd$A1, dd$year), median)
> xx
1999 2003 2005 2007 2009 
15.0 35.5 38.5 15.0 19.0 

Upvotes: 0

Stephan Kolassa

Reputation: 8267

In base R, you can do this:

foo <- data.frame(
  year=c(1999,1999,1999,2003,2003,2005,2005,2005,2005,2007,2009,2009,2009),
  A1=c(15,7,24,6,65,5,89,56,21,15,19,7,23))
by(foo$A1,foo$year,median)

Strictly speaking, the result will not be a vector, but you can fix that:

as.vector(by(foo$A1,foo$year,median))

by() is always helpful when you want to do an operation by groups.
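
As a related sketch (not part of the original answer): as.vector() drops the year labels, so if you want to keep them, base R's tapply() is one option. It returns a one-dimensional array named by year. The name median1 is taken from the question; foo is the same data frame as above.

```r
foo <- data.frame(
  year = c(1999, 1999, 1999, 2003, 2003, 2005, 2005, 2005, 2005, 2007, 2009, 2009, 2009),
  A1   = c(15, 7, 24, 6, 65, 5, 89, 56, 21, 15, 19, 7, 23))

# group A1 by year and take the median of each group
median1 <- tapply(foo$A1, foo$year, median)

# look up a single year by its label
median1[["2005"]]
```

Indexing by the year as a character string, as in the last line, avoids relying on the position of each group.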

Upvotes: 1

Jilber Urbina

Reputation: 61154

Use ave, which is a base R function. Combining ave with transform gives a pretty nice output. Assume dat is your data.frame:

> transform(dat, Median= ave(a1, year, FUN=median))
  year a1 Median
1 1999 20   15.0
2 1999 15   15.0
3 1999 11   15.0
4 2003 11    7.0
5 2003  3    7.0
6 2007 89   40.5
7 2007 25   40.5
8 2007 56   40.5
9 2007 12   40.5

If you only want a vector of the medians, with one entry per row (each row carries the median of its year), you can do:

> with(dat, ave(a1, year, FUN=median))
[1] 15.0 15.0 15.0  7.0  7.0 40.5 40.5 40.5 40.5
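
Note that the ave() result has one entry per row, not one per year. As a sketch of collapsing it to a single median per year (not part of the original answer), base R's aggregate() can be used. Here dat is reconstructed from the printed output above, and per_year is a name chosen for illustration.

```r
# dat reconstructed from the printed transform() output above
dat <- data.frame(
  year = c(1999, 1999, 1999, 2003, 2003, 2007, 2007, 2007, 2007),
  a1   = c(20, 15, 11, 11, 3, 89, 25, 56, 12))

# one row per year, holding the median of a1 within that year
per_year <- aggregate(a1 ~ year, data = dat, FUN = median)

# the medians as a plain numeric vector, ordered by year
per_year$a1
```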

Upvotes: 1

statquant

Reputation: 14370

With the data.table package, if your data.frame is called DF:

library(data.table)
DT = data.table(DF)
DT[,median(a1),by='year']
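
A self-contained variant, as a sketch only: it assumes the data.table package is installed, and DF here is a stand-in built from the question's data with the column named a1 to match the code above. Without a name, the result column defaults to V1, so this version labels it median1 (the name used in the question).

```r
library(data.table)

# DF is an assumed stand-in for the asker's data frame
DF <- data.frame(
  year = c(1999, 1999, 1999, 2003, 2003, 2005, 2005, 2005, 2005, 2007, 2009, 2009, 2009),
  a1   = c(15, 7, 24, 6, 65, 5, 89, 56, 21, 15, 19, 7, 23))

DT <- as.data.table(DF)

# one row per year, with the result column named median1 instead of V1
res <- DT[, .(median1 = median(a1)), by = year]

# the medians as a plain numeric vector
res$median1
```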

Upvotes: 1
