Reputation: 798
I'm trying to create a function that lets a user input a dataset and a column name. I have two issues:
1. Passing the column name fails with: Error in `$.default`(dat, "var") : $ operator is invalid for atomic vectors
2. I'd like to call the function without passing the data argument (I tried attach(dat), but it didn't work).
Here's the function I'm trying to create:
fre <- function(dat, var) {
  abc <- questionr::na.rm(dat$var)
  abc <- questionr::freq(abc)
  abc <- cbind(Label = rownames(abc), abc)
  abc <- questionr::rename.variable(abc, "n", "Frequency")
  abc <- questionr::rename.variable(abc, "%", "Percent")
  abc <- tidyr::separate(abc, Label, into = c("Value", "Label"), sep = "] ")
  row.names(abc) <- NULL
  abc <- abc %>%
    dplyr::mutate(Value = gsub("[[:punct:]]", '', Value)) %>%
    dplyr::select(Label, Value, Frequency, Percent)
  abc
}
Reproducible example
library(haven)
#install.packages("questionr")
library(questionr)
library(dplyr)
library(tidyr)
# Load data
dat <- read_sav(url("http://staff.bath.ac.uk/pssiw/stats2/SAQ.sav"))
abc <- questionr::na.rm(dat$Q01)
abc <- questionr::freq(abc)
abc <- cbind(Label = rownames(abc), abc)
abc <- questionr::rename.variable(abc, "n", "Frequency")
abc <- questionr::rename.variable(abc, "%", "Percent")
abc <- tidyr::separate(abc, Label, into = c("Value", "Label"), sep = "] ")
row.names(abc) <- NULL
abc <- abc %>% dplyr::mutate(Value = gsub("[[:punct:]]", '', Value)) %>% dplyr::select(Label, Value, Frequency, Percent)
abc
In the end, my output from the above code looks like this:
I'm trying to get this by using this function:
fre(dat, Q01)
but I'm getting this error:
Error in `$.default`(dat, "var") : $ operator is invalid for atomic vectors
How should I pass the column name in this function for it to work?
I tried var <- enquo(var), but it didn't work.
For the second issue, I've tried using attach(dat) before calling the function, but it didn't work. Ideally, I would like to make the fre function work and then eventually use it without passing the data argument.
Reputation: 10365
You actually don't need too much black magic here. I've made two versions of the function:
- fre_pipe needs the data as an input argument, but it can be used with the pipe.
- fre_free relies on an object called global_dat that has to be visible from the environment where the function is defined (here, the global environment).
You don't need enquo here, because you don't need to capture the environment of your variable. ensym is enough (it ensures that your var is treated as a symbol and is not evaluated). In the second step, you can use as_string to convert it to a string. For further reading, see the metaprogramming chapter in Advanced R.
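To see the ensym / as_string step in isolation, here is a minimal sketch on a toy data frame. The names toy and get_col are made up for illustration and are not part of the code in this answer; the only assumption is that the rlang package is installed.

```r
library(rlang)

# Toy data standing in for the survey data (hypothetical values)
toy <- data.frame(Q01 = c(1, 2, 2, NA), Q02 = c(3, 3, 1, 2))

# Hypothetical helper: returns the column named by an unquoted argument
get_col <- function(.data, var) {
  var <- rlang::ensym(var)        # capture the argument as a symbol, without evaluating it
  .data[[rlang::as_string(var)]]  # look the column up by its name as a string
}

get_col(toy, Q01)    # unquoted column name
get_col(toy, "Q01")  # ensym() also accepts a string and converts it to a symbol
```

Both calls should return the same column, which is why the same pattern lets the functions below be called with a bare column name.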
library(haven)
library(questionr)
library(dplyr)
library(tidyr)
# Load data
dat <- read_sav(url("http://staff.bath.ac.uk/pssiw/stats2/SAQ.sav"))
fre_pipe <- function(.data, var) {
  var <- rlang::ensym(var)
  abc <- questionr::na.rm(.data[, rlang::as_string(var)])
  abc <- questionr::freq(abc)
  abc <- cbind(Label = rownames(abc), abc)
  abc <- questionr::rename.variable(abc, "n", "Frequency")
  abc <- questionr::rename.variable(abc, "%", "Percent")
  abc <- tidyr::separate(abc, Label, into = c("Value", "Label"), sep = "] ")
  row.names(abc) <- NULL
  abc <- abc %>%
    dplyr::mutate(Value = gsub("[[:punct:]]", '', Value)) %>%
    dplyr::select(Label, Value, Frequency, Percent)
  abc
}
dat %>% fre_pipe(Q01)
#> Label Value Frequency Percent
#> 1 Strongly agree 1 270 10.5
#> 2 Agree 2 1338 52.0
#> 3 Neither 3 735 28.6
#> 4 Disagree 4 187 7.3
#> 5 Strongly disagree 5 41 1.6
#> 6 Not answered 9 0 0.0
fre_free <- function(var) {
  var <- rlang::ensym(var)
  abc <- questionr::na.rm(global_dat[, rlang::as_string(var)])
  abc <- questionr::freq(abc)
  abc <- cbind(Label = rownames(abc), abc)
  abc <- questionr::rename.variable(abc, "n", "Frequency")
  abc <- questionr::rename.variable(abc, "%", "Percent")
  abc <- tidyr::separate(abc, Label, into = c("Value", "Label"), sep = "] ")
  row.names(abc) <- NULL
  abc <- abc %>%
    dplyr::mutate(Value = gsub("[[:punct:]]", '', Value)) %>%
    dplyr::select(Label, Value, Frequency, Percent)
  abc
}
global_dat <- dat
fre_free(Q01)
#> Label Value Frequency Percent
#> 1 Strongly agree 1 270 10.5
#> 2 Agree 2 1338 52.0
#> 3 Neither 3 735 28.6
#> 4 Disagree 4 187 7.3
#> 5 Strongly disagree 5 41 1.6
#> 6 Not answered 9 0 0.0
Created on 2020-09-05 by the reprex package (v0.3.0)
I don't think that fre_free without the data argument is good style. If you're tired of always repeating the argument, maybe you want to apply your function repeatedly with lapply or map? Something like:
vector_with_column_names %>%
purrr::walk(~print(fre(dat = dat, var = .x)))
(But here the normal c(Q01, Q02) would not work, and you would either need to make a function to create vectors of symbols or use the column names.)
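As a rough sketch of the "use the column names" route, the loop could be written with plain strings and base R only. Here toy and count_values are hypothetical stand-ins for the real data and for a version of fre that takes the column name as a string; they are not the original function.

```r
# Toy data standing in for the survey data (hypothetical values)
toy <- data.frame(Q01 = c(1, 2, 2, NA), Q02 = c(3, 3, 1, 2))

# Hypothetical stand-in for fre(): takes the column name as a string
count_values <- function(dat, var) {
  # table() drops NA by default; [[ ]] looks the column up by name
  as.data.frame(table(Value = dat[[var]]), responseName = "Frequency")
}

# Column names as strings, so an ordinary character vector works
cols <- c("Q01", "Q02")

# One frequency table per column, named after the column it came from
results <- lapply(setNames(cols, cols), function(col) count_values(dat = toy, var = col))

results$Q01
```

The data argument is written once inside the anonymous function, so it does not have to be repeated for every column.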
Reputation: 798
Based on r2evans' comments, this worked:
fre <- function(dat, var) {
  abc <- questionr::na.rm(dat[[var]])
  abc <- questionr::freq(abc)
  abc <- cbind(Label = rownames(abc), abc)
  abc <- questionr::rename.variable(abc, "n", "Frequency")
  abc <- questionr::rename.variable(abc, "%", "Percent")
  abc <- tidyr::separate(abc, Label, into = c("Value", "Label"), sep = "] ")
  row.names(abc) <- NULL
  abc <- abc %>%
    dplyr::mutate(Value = gsub("[[:punct:]]", '', Value)) %>%
    dplyr::select(Label, Value, Frequency, Percent)
  abc
}
fre(dat, "Q01")
But I'm still looking for a way to avoid passing the data argument each time. It would also be a bonus to find a way to avoid the quotes ("").