hpy

Reputation: 2161

R: Dynamically referencing column names when using "$" or "[:]" syntax?

Let's say in R I have a data frame (called df) with a bunch of columns containing integer data named "Var1foo", "Var2foo", and so on.

Now suppose I want to create a new column called sum1 that adds up everything between "Var3foo" and "Var6foo". I might do:

df$sum1 <- rowSums(subset(df, select = Var3foo:Var6foo))

Or, I might do something a bit more complicated and create a new column called foobar with apply() like so:

eenie = 3
meenie = 2
df$foobar <- apply(df, 1, function(x) if (sum(x[Var2foo:Var7foo]) == eenie & sum(x[1:Var3foo]) != meenie) 1 else 0)

The problem is I always have to explicitly write out the column names or index when referring to those columns. What if I want to refer to column "Varxfoo" where x <- 8 or "Varyfoo" where y <- 12?

What I mean is, I wouldn't be able to do df$paste0("Var", x, "foo") or sum(x[paste0("Var", x, "foo"):paste0("Var", y, "foo")]).

I also considered using dplyr::mutate() to create df$sum1 and df$foobar but it seems to also need explicit column (variable) names.

What should I do? Thanks!!

Upvotes: 0

Views: 671

Answers (2)

Consistency

Reputation: 2922

You could refer to the column by building its name as a string:

df[paste0("Var", x, "foo")]
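
To make the difference between the bracket forms concrete, here is a minimal sketch on a made-up data frame (the data and the value of x are assumptions for illustration). `$` does not evaluate its right-hand side, which is why `df$paste0(...)` fails, whereas `[` and `[[` accept any character string.

```r
# Toy data frame; column names follow the question's pattern.
df <- data.frame(Var1foo = 1:3, Var2foo = 4:6, Var3foo = 7:9)
x <- 2
col <- paste0("Var", x, "foo")   # "Var2foo"

one_col_df <- df[col]     # single brackets: a one-column data frame
col_vector <- df[[col]]   # double brackets: the column as a plain vector

# Assignment works the same way with a computed name.
df[[paste0("Var", x, "foo_doubled")]] <- df[[col]] * 2
```

Use `[[` when you need the values themselves (for arithmetic or comparisons) and `[` when you want to keep a data frame.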

If you do this often, a small helper function saves some typing:

int2name <- function(x, prefix = "", suffix = ""){
    paste0(prefix, x, suffix)
}

And then you can use:

df[int2name(2:4, prefix = "Var", suffix = "foo")]
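
Applied to the question's sum1 column, this might look like the sketch below. The data frame and the values of x and y are made up for illustration; only int2name() comes from this answer.

```r
# Helper from the answer above, repeated so this snippet runs on its own.
int2name <- function(x, prefix = "", suffix = "") {
  paste0(prefix, x, suffix)
}

# Toy data; values are made up for illustration.
df <- data.frame(Var1foo = 1:2, Var2foo = 3:4, Var3foo = 5:6, Var4foo = 7:8)
x <- 2
y <- 4

# Build the names Var2foo..Var4foo and sum those columns row by row.
cols <- int2name(x:y, prefix = "Var", suffix = "foo")
df$sum1 <- rowSums(df[cols])
```

Because the columns are selected by name, this does not depend on where they sit in the data frame.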

Upvotes: 1

Eldioo

Reputation: 522

A simple solution would be to reference the columns directly by position, with

sum(df[,x:y])

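Of course, this only works if the columns are in order, so that the column at position n is "Varnfoo". Note also that sum() returns a single total over all the selected cells; use rowSums() if you want one sum per row.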
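
If the column order cannot be relied on, one option is to look the positions up by name with match() first. This is a sketch on made-up data with the columns deliberately shuffled; the values of x and y are assumptions for illustration.

```r
# Toy data with the columns deliberately out of order.
df <- data.frame(Var3foo = c(1, 2), Var1foo = c(10, 20), Var2foo = c(100, 200))
x <- 1
y <- 2

# Look up where the named columns actually sit in df.
wanted <- paste0("Var", x:y, "foo")
pos <- match(wanted, names(df))   # Var1foo and Var2foo are columns 2 and 3 here

total <- sum(df[pos])             # one grand total over the selected columns
per_row <- rowSums(df[pos])       # one sum per row
```

match() returns NA for a name that is not present, so it is worth checking `anyNA(pos)` before indexing.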

Upvotes: 1
