Operation on multiple columns in a data frame based on group by

Question

I have the data frame below and would like to get the median prices for each year from 2003 to 2011, by state.

state freq   2003   2004   2005   2006   2007   2008   2009   2010   2011
MS    2     83000  88300  87000  94400  94400  94400  94400  94400  94400
MS    2     97000  98000 110200 115700 115700 115700 115700 115700 115700
LA    2     154300 164600 181300 149200 149200 149200 149200 149200 149200
LA    2     126800 139200 157100 144500 144500 144500 144500 144500 144500

I am still learning, so any help would be appreciated. I was thinking I could use sqldf on the data frame.

nothing · Accepted Answer

If I'm understanding your goal correctly, you're looking for the aggregate() function, which applies a function to all columns of a data.frame by a grouping variable.

aggregate(yourDf[ ,-(1:2)], by = list(yourDf$state), FUN = median)
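
To make that concrete, here is a minimal sketch that rebuilds the four sample rows from the question and runs the same call. The name yourDf is just a placeholder, and I'm assuming the year columns keep their literal names ("2003", "2004", ...), which is why check.names = FALSE is passed to read.table(). Naming the list element (state = ...) is optional; it only gives the grouping column a readable header instead of Group.1.

```r
# Rebuild the sample data from the question.
# check.names = FALSE keeps the year columns named "2003", "2004", ...
# rather than letting R rewrite them as X2003, X2004, ...
yourDf <- read.table(header = TRUE, check.names = FALSE, text = "
state freq   2003   2004   2005   2006   2007   2008   2009   2010   2011
MS    2     83000  88300  87000  94400  94400  94400  94400  94400  94400
MS    2     97000  98000 110200 115700 115700 115700 115700 115700 115700
LA    2     154300 164600 181300 149200 149200 149200 149200 149200 149200
LA    2     126800 139200 157100 144500 144500 144500 144500 144500 144500
")

# Drop the first two columns (state, freq) and take the median of every
# remaining column within each state.
medians <- aggregate(yourDf[, -(1:2)],
                     by  = list(state = yourDf$state),
                     FUN = median)

print(medians)
```

With only two rows per state, each median is simply the midpoint of the two values, e.g. 90000 for MS in 2003 and 140550 for LA in 2003.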
