Mobeus Zoom

Reputation: 608

Expand grid to generate new datasets in R

I have a data frame and want to produce a set of new datasets from it by combining transformations from Group X and Group Y:

#Group X
df1 <- df+1
df2 <- df-2
df3 <- df*3

#Group Y
df4 <- df*4
df5 <- df^5
df6 <- df/6

What I'd really like to do is use expand.grid to generate a new dataset for every combination of a Group X and a Group Y transformation. (The Group Y transformation is always applied after the Group X one.) These datasets should be stored in the global environment.

So the output would be the same as the result of

df14 <- (df+1)*4
df24 <- (df-2)*4
df34 <- (df*3)*4
df15 <- (df+1)^5
df25 <- (df-2)^5
df35 <- (df*3)^5
df16 <- (df+1)/6
df26 <- (df-2)/6
df36 <- (df*3)/6

How would I do this? (For example data you could use any numeric data frame, e.g. the numeric columns of iris.)

You can rewrite the Group X and Y transformations as functions if that helps:

#Group X
Fun1 <- function(x){return(x+1)}
Fun2 <- function(x){return(x-2)}
Fun3 <- function(x){return(x*3)}

#Group Y
Fun4 <- function(x){return(x*4)}
Fun5 <- function(x){return(x^5)}
Fun6 <- function(x){return(x/6)}

For the names of the datasets, something like df.Fun1.Fun4 would be good. (The df prefix should change to match the name of the data frame I supply, so for iris it would be iris.Fun1.Fun4.)

Upvotes: 0

Views: 70

Answers (1)

Gregor Thomas

Reputation: 145755

Write a function that does what you want. As a simplified case, this one adds a value and then multiplies:

# Looks up df in the calling environment, so df must exist before foo is called
foo = function(add, mult) {
  (df + add) * mult
}

Then use expand.grid on your desired values for add and mult and iterate over the rows of the result. Map is a convenient base R way to iterate over several vectors in parallel.

params = expand.grid(add = 1:3, mult = 4:6)

df = iris[1:6, 1:3] # numeric sample from iris

result = with(params, Map(foo, add = add, mult = mult))
names(result) = with(params, paste0("add ", add, ", mult ", mult))
result
# $`add 1, mult 4`
#   Sepal.Length Sepal.Width Petal.Length
# 1         24.4        18.0          9.6
# 2         23.6        16.0          9.6
# 3         22.8        16.8          9.2
# 4         22.4        16.4         10.0
# 5         24.0        18.4          9.6
# 6         25.6        19.6         10.8
# 
# $`add 2, mult 4`
#   Sepal.Length Sepal.Width Petal.Length
# 1         28.4        22.0         13.6
# 2         27.6        20.0         13.6
# 3         26.8        20.8         13.2
# 4         26.4        20.4         14.0
# 5         28.0        22.4         13.6
# 6         29.6        23.6         14.8
# 
# $`add 3, mult 4`
#   Sepal.Length Sepal.Width Petal.Length
# 1         32.4        26.0         17.6
# 2         31.6        24.0         17.6
# ...
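
To see why the results come out in that order, it can help to look at the parameter grid itself. The following is only a small sketch using the same add and mult values as above: expand.grid returns one row per combination, with the first argument varying fastest.

```r
# Same grid as above: every combination of add and mult
params <- expand.grid(add = 1:3, mult = 4:6)

nrow(params)  # 3 x 3 = 9 combinations
head(params, 4)
#   add mult
# 1   1    4
# 2   2    4
# 3   3    4
# 4   1    5
```

Because Map walks the add and mult columns in parallel, the i-th element of result corresponds to the i-th row of params.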

Adapted for functions rather than specific parameters:

#Group X
Fun1 <- function(x){return(x+1)}
Fun2 <- function(x){return(x-2)}
Fun3 <- function(x){return(x*3)}

#Group Y
Fun4 <- function(x){return(x*4)}
Fun5 <- function(x){return(x^5)}
Fun6 <- function(x){return(x/6)}

# Put the functions in named lists (anchored patterns avoid matching e.g. Fun10)
funs_x = mget(ls(pattern = "^Fun[1-3]$"))
funs_y = mget(ls(pattern = "^Fun[4-6]$"))

# iterate over list indices
indices = expand.grid(ind_x = seq_along(funs_x), ind_y = seq_along(funs_y))
result = with(indices, Map(function(ind_x, ind_y) funs_y[[ind_y]](funs_x[[ind_x]](df)), ind_x, ind_y))
names(result) = with(indices, paste("df", names(funs_x)[ind_x], names(funs_y)[ind_y], sep = "."))
result
# $df.Fun1.Fun4
#   Sepal.Length Sepal.Width Petal.Length
# 1         24.4        18.0          9.6
# 2         23.6        16.0          9.6
# 3         22.8        16.8          9.2
# 4         22.4        16.4         10.0
# 5         24.0        18.4          9.6
# 6         25.6        19.6         10.8
# 
# $df.Fun2.Fun4
#   Sepal.Length Sepal.Width Petal.Length
# 1         12.4         6.0         -2.4
# 2         11.6         4.0         -2.4
# 3         10.8         4.8         -2.8
# 4         10.4         4.4         -2.0
# 5         12.0         6.4         -2.4
# 6         13.6         7.6         -1.2
# 
# $df.Fun3.Fun4
#   Sepal.Length Sepal.Width Petal.Length
# 1         61.2        42.0         16.8
# 2         58.8        36.0         16.8
# ...
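
The question also asks for the datasets to live in the global environment and to be named after the supplied data frame. One possible sketch, not the only way to do it: wrap the steps above in a helper (apply_combos and iris_num are hypothetical names introduced here for illustration), take the prefix from deparse(substitute(...)), and copy the list elements out with list2env.

```r
# Group X and Group Y as named lists; the names become part of the dataset names
funs_x <- list(Fun1 = function(x) x + 1, Fun2 = function(x) x - 2, Fun3 = function(x) x * 3)
funs_y <- list(Fun4 = function(x) x * 4, Fun5 = function(x) x^5, Fun6 = function(x) x / 6)

# Hypothetical helper: apply every X-then-Y combination to `data`
apply_combos <- function(data, funs_x, funs_y) {
  # Name of the object passed in, e.g. "iris_num"
  prefix <- deparse(substitute(data))
  indices <- expand.grid(ind_x = seq_along(funs_x), ind_y = seq_along(funs_y))
  result <- Map(
    function(ind_x, ind_y) funs_y[[ind_y]](funs_x[[ind_x]](data)),
    indices$ind_x, indices$ind_y
  )
  names(result) <- paste(prefix, names(funs_x)[indices$ind_x],
                         names(funs_y)[indices$ind_y], sep = ".")
  result
}

iris_num <- iris[1:6, 1:3]  # numeric sample from iris
result <- apply_combos(iris_num, funs_x, funs_y)

# Copy each list element into the global environment as its own object
invisible(list2env(result, envir = globalenv()))
```

After this, objects such as iris_num.Fun1.Fun4 exist in the global environment. Keeping the datasets in the list is usually easier to work with, so the list2env step is only worth it if separate objects are really needed.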

Upvotes: 3
