Reputation: 633
Out of the data frame Question, with columns Question$Temperature and Question$Salary, I want to select only the Salary values with Temperature higher than 10. I always do the following:
Question[Question$Temperature > 10, ]$Salary
Is there a cleaner way?
Upvotes: 0
Views: 135
Reputation: 3711
Here are three common ways, with a benchmark:
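library(microbenchmark)  # provides microbenchmark()

# Test data: 1000 rows, x is an integer in 1..10, y is uniform on [0, 1]
l <- data.frame(x = sample(1:10, 1000, replace = TRUE), y = runif(1000))
f1 <- function(df) { df[df$x > 8, "y"] }  # filter rows and pick the column in one call
f2 <- function(df) { df[df$x > 8, ]$y }   # filter the whole data frame, then extract y
f3 <- function(df) { df$y[df$x > 8] }     # extract y, then filter the vector
print(microbenchmark(f1(l), f2(l), f3(l), times = 1000))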
Result:
Unit: microseconds
  expr     min      lq  median      uq      max neval
 f1(l)  97.428 101.378 102.696 107.962 3757.555  1000
 f2(l) 247.081 253.226 257.614 270.780  734.659  1000
 f3(l)  59.686  62.319  63.197  64.514 3793.980  1000
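A quick sanity check that the three forms return the same values, so the benchmark compares like with like. This is only a sketch: the seed and the names d, a, b, c3 and all_equal are assumptions made for illustration and are not part of the benchmark above.

```r
# Hypothetical data, seeded so the check is reproducible
set.seed(1)
d <- data.frame(x = sample(1:10, 1000, replace = TRUE), y = runif(1000))

a  <- d[d$x > 8, "y"]  # filter rows and pick the column in one call
b  <- d[d$x > 8, ]$y   # filter the whole data frame, then extract y
c3 <- d$y[d$x > 8]     # extract y, then filter the vector

# All three should be the same numeric vector
all_equal <- identical(a, b) && identical(b, c3)
print(all_equal)
```

Since the results match, the timing differences should come only from how much data each form has to touch.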
Upvotes: 1
Reputation: 81693
It's more efficient to use
Question$Salary[Question$Temperature > 10]
since you then subset only the values of a single vector rather than a whole data frame.
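As a minimal sketch of this on a small data frame (the Temperature and Salary values below are made up for illustration, and high is a hypothetical variable name):

```r
# Hypothetical values, for illustration only
Question <- data.frame(
  Temperature = c(5, 12, 10, 25),
  Salary      = c(1000, 2000, 3000, 4000)
)

# Pull out the Salary vector first, then keep the entries
# whose Temperature is strictly greater than 10
high <- Question$Salary[Question$Temperature > 10]
print(high)  # 2000 4000
```

The row with Temperature equal to 10 is dropped because the comparison is strict.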
Upvotes: 1