Data Management in R

Question

I am trying to unite the separate columns Gradeprek through Grade12 into a single column called Grade. I am using the tidyr package, with this line of code:

unite(dta, "Grade",
          c(Gradeprek, 
            dta$Gradek, dta$Grade1, dta$Grade2,
            dta$Grade3, dta$Grade4, dta$Grade5,
            dta$Grade6, dta$Grade7, dta$Grade8,
            dta$Grade9, dta$Grade10, dta$Grade11,
            dta$Grade12),
      sep="")

However, I have been getting an error saying this:

error: All select() inputs must resolve to integer column positions. The following do not: * c(Gradeprek, dta$Gradek, dta$Grade1, dta$Grade2, dta$Grade3, dta$Grade4, dta$Grade5, dta$Grade6, ...

Penny for your thoughts on how I can resolve the situation.

Gregor Thomas · Accepted Answer

You are mixing and matching the two syntax options, unite() and unite_(); you need to pick one and stick with it. In both cases, do not use data$column: both functions take a data argument, so you don't need to re-specify which data frame your columns come from.

Option 1: NSE

The default, non-standard evaluation (NSE), means bare column names: no quotes, and no c().

unite(dta, Grade, Gradeprek, Gradek, Grade1, Grade2, Grade3, ..., 
    Grade12, sep = "")

There are tricks you can do with this. For example, if all your Grade columns are in this order next to each other in your data frame, you could do

unite(dta, Grade, Gradeprek:Grade12, sep = "")

You could also use starts_with("Grade") to get all columns that begin with that string. See ?unite and its link to ?select for more details.
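
As a minimal sketch of the two selection shortcuts above, here is a small made-up data frame (the dta below is hypothetical toy data with only three grade columns, not the asker's real data). Note that starts_with() ignores case by default, so it would also pick up a column named, say, gradelevel.

```r
library(tidyr)

# Hypothetical toy data: each row has a value in exactly one grade column
dta <- data.frame(
  id        = 1:3,
  Gradeprek = c("prek", "", ""),
  Gradek    = c("", "k", ""),
  Grade1    = c("", "", "1"),
  stringsAsFactors = FALSE
)

# Range selection: works because the grade columns sit next to each other
by_range <- unite(dta, Grade, Gradeprek:Grade1, sep = "")

# Helper selection: every column whose name begins with "Grade"
by_prefix <- unite(dta, Grade, starts_with("Grade"), sep = "")

by_prefix$Grade
```

Both calls drop the original grade columns and leave id plus the new Grade column; pass remove = FALSE if the originals should be kept.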

Option 2: Standard Evaluation

You can use unite_() for a standard-evaluating alternative, which expects column names in a character vector. This has the advantage in this case of letting you use paste0() to build column names in the order you want:

unite_(dta, col = "Grade", c("Gradeprek", "Gradek", paste0("Grade", 1:12)), sep = "")
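
In current tidyr releases unite_() is deprecated. A rough equivalent, sketched here on hypothetical toy data, is to keep the character vector and wrap it in all_of() inside a regular unite() call. The toy dta and the name grade_cols are made up for illustration, and only grades 1 and 2 are included to keep it short.

```r
library(tidyr)

# Hypothetical toy data with the grade columns deliberately out of order
dta <- data.frame(
  id        = 1L,
  Grade2    = "2",
  Gradeprek = "p",
  Grade1    = "1",
  Gradek    = "k",
  stringsAsFactors = FALSE
)

# Build the column names in the order they should be pasted together
grade_cols <- c("Gradeprek", "Gradek", paste0("Grade", 1:2))

# all_of() tells unite() to treat grade_cols as a vector of column names
united <- unite(dta, "Grade", all_of(grade_cols), sep = "")

united$Grade
```

all_of() raises an error if any name in the vector is missing from the data frame, which is usually what you want when the column list is built programmatically.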
