d12n

Reputation: 861

Get average value for only rows that fit specific criteria

I have a dataframe that looks like this:

id                              weekdays              halflife
241732222300860000  Friday, Aug 31, 2012, 22    0.4166666667
241689170123309000  Friday, Aug 31, 2012, 19    0.3833333333
241686878137512000  Friday, Aug 31, 2012, 19    0.4
241651117396738000  Friday, Aug 31, 2012, 16    1.5666666667
241635163505820000  Friday, Aug 31, 2012, 15    0.95
241633401382265000  Friday, Aug 31, 2012, 15    2.3666666667

I would like to get the average half-life of items that were created on Monday, then on Tuesday, and so on (my date range spans over 6 months). Please let me know how I can provide reproducible code, because I couldn't figure out a way to attach files.

To get the date values I used strptime and difftime. I also found the maximum halflife with max(df$halflife). How can I find which id it corresponds to?

Reproducible code:

structure(list(id = c(241732222300860416, 241689170123309056, 
241686878137511936, 241651117396738048, 241635163505819648, 241633401382264832
), weekdays = c("Friday, Aug 31, 2012, 22", "Friday, Aug 31, 2012, 19", 
"Friday, Aug 31, 2012, 19", "Friday, Aug 31, 2012, 16", "Friday, Aug 31, 2012, 15", 
"Friday, Aug 31, 2012, 15"), halflife = structure(c(0.416666666666667, 
0.383333333333333, 0.4, 1.56666666666667, 0.95, 2.36666666666667
), class = "difftime", units = "mins")), .Names = c("id", 
"weekdays", "halflife"), row.names = c(NA, 6L), class = "data.frame")

Upvotes: 1

Views: 820

Answers (1)

juba

Reputation: 49033

There may be a better way to extract the weekday names, but you can use tapply like this (here df is the name of your data frame):

days <- sub(",.*$", "", df$weekdays)
tapply(df$halflife, days, mean)
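For reference, here is a self-contained sketch of the same idea run on the six sample rows from the question. Two things in it are my own assumptions rather than part of the original data: the ids are stored as strings (18-digit values do not fit exactly in a double), and the difftime column is converted to plain minutes before averaging so the result is an ordinary numeric vector.

```r
# Sample rows from the question. Ids are kept as strings here (an assumption)
# so that the 18-digit values are not rounded by double precision.
df <- data.frame(
  id = c("241732222300860416", "241689170123309056", "241686878137511936",
         "241651117396738048", "241635163505819648", "241633401382264832"),
  weekdays = c("Friday, Aug 31, 2012, 22", "Friday, Aug 31, 2012, 19",
               "Friday, Aug 31, 2012, 19", "Friday, Aug 31, 2012, 16",
               "Friday, Aug 31, 2012, 15", "Friday, Aug 31, 2012, 15"),
  halflife = as.difftime(c(0.416666666666667, 0.383333333333333, 0.4,
                           1.56666666666667, 0.95, 2.36666666666667),
                         units = "mins"),
  stringsAsFactors = FALSE
)

# Keep only the text before the first comma, i.e. the day name.
days <- sub(",.*$", "", df$weekdays)

# Convert the difftime column to plain minutes before averaging.
halflife_mins <- as.numeric(df$halflife, units = "mins")

# One mean per day name; with the full data set there would be one entry
# for each weekday that appears in the weekdays column.
day_means <- tapply(halflife_mins, days, mean)
print(day_means)
```

With only the sample rows, the result has a single entry named Friday. The entries come back in alphabetical order of the day names, so reorder them afterwards if you need Monday through Sunday.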

And to get the ids of your maximum values, use which:

df$id[which(df$halflife==max(df$halflife))]
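As a small aside, base R also has which.max, which returns the position of the first maximum only, whereas the which(... == max(...)) form returns every tied position. The sketch below uses the sample values from the question as plain minutes, with ids kept as strings (an assumption on my part, to avoid rounding the 18-digit values).

```r
# Sample values from the question: ids as strings, half-lives in minutes.
ids <- c("241732222300860416", "241689170123309056", "241686878137511936",
         "241651117396738048", "241635163505819648", "241633401382264832")
halflife <- c(0.416666666666667, 0.383333333333333, 0.4,
              1.56666666666667, 0.95, 2.36666666666667)

# Position of the first maximum only.
first_max_id <- ids[which.max(halflife)]

# Positions of all values equal to the maximum (more than one if there are ties).
all_max_ids <- ids[which(halflife == max(halflife))]

print(first_max_id)
print(all_max_ids)
```

On this sample there are no ties, so both forms give the same single id.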

Upvotes: 4
