Reputation: 1
I am analyzing many different data sets in R, and I want to pass dummy variables to a function that subsets the main data set and prints the mean of a variable in each subset.
For instance, my data set is named "two" and my dummy variable is "over50" and my function is:
getMean <- function(varName) {
sub1 <- two[two$varName == 1, ]
sub2 <- two[two$varName == 0, ]
print(mean(sub1$return)
print(mean(sub2$return)
}
However, when I call getMean(over50), I do not get the expected answer.
Is there a way to translate the function input into var names so I can do this dynamically? Or do I have to manually do these calculations?
Upvotes: 0
Views: 93
Reputation: 38510
It is easier in this instance to pass a string to your function. Here is a generalized function that takes a data.frame and a variable name (string).
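getMean <- function(df, varName) {
  # Mean of `return` for rows where the dummy equals 1
  mean1 <- mean(df[df[[varName]] == 1, ]$return)
  # Mean of `return` for rows where the dummy equals 0
  mean2 <- mean(df[df[[varName]] == 0, ]$return)
  return(c("mean1"=mean1, "mean2"=mean2))
}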
This returns a named vector with the two means. The df argument must be a data.frame object (passed without quotes), whereas varName should be a character string.
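As a quick check, here is a minimal sketch of calling such a function on a small made-up data frame. The values in two are invented for illustration; only the column names over50 and return come from the question, and the function body is assumed to split on 1 versus 0.

```r
# Toy data, invented for illustration; only the column names come from the question.
two <- data.frame(
  over50 = c(1, 0, 1, 0),
  return = c(0.10, 0.02, 0.30, 0.04)
)

# Look up the dummy column by its name (a string) with [[ ]].
getMean <- function(df, varName) {
  mean1 <- mean(df[df[[varName]] == 1, ]$return)  # dummy == 1
  mean2 <- mean(df[df[[varName]] == 0, ]$return)  # dummy == 0
  c(mean1 = mean1, mean2 = mean2)
}

# The data frame is passed unquoted, the column name as a string.
result <- getMean(two, "over50")
print(result)
```

Because the column name is just a string, the same call works for any other dummy column in any other data frame, which is what the question asks for.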
Upvotes: 0
Reputation: 111
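I think the syntax you want is two[two[, varName] == 0, ], where varName holds the column name as a string. Writing two$varName does not work because $ looks for a column literally named "varName" instead of using the value stored in varName.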
More generally, you can access rows and columns of a data frame by passing in strings, as in data[c("row1", "row2"), c("col1", "col2")].
Side note: I think you're also missing a couple of closing parentheses in your print() statements.
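To illustrate the difference between $ and bracket indexing, here is a small sketch on made-up data (the values are invented; only the names two, over50 and return come from the question). It assumes the column name is held as a string in varName.

```r
# Toy data, invented for illustration.
two <- data.frame(
  over50 = c(1, 0, 1, 0),
  return = c(0.10, 0.02, 0.30, 0.04)
)
varName <- "over50"

# `$` takes its argument literally, so this looks for a column named "varName".
literal <- two$varName   # NULL, because no such column exists

# `[[ ]]` and `[, ]` evaluate varName first, so both find the over50 column.
by_double_bracket <- two[[varName]]
by_matrix_index   <- two[, varName]

# Subset on the dummy and take the mean, as in the question.
sub0 <- two[two[, varName] == 0, ]
print(mean(sub0$return))
```

With the original two$varName, the comparison runs against NULL and selects no rows, which is likely why the function in the question did not return the expected means.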
Upvotes: 1