Reputation: 77
I'm relatively new to R and am struggling with what is probably a very simple problem.
I have data with multiple columns named in a similar way. Here is some sample data:
df = data.frame(PPID = 1:50,
time1 = sample(c(0,1), 50, replace = TRUE),
time2 = sample(c(0,1), 50, replace = TRUE),
time3 = sample(c(0,1), 50, replace = TRUE),
condition1 = sample(c(0:3), 50, replace = TRUE),
condition2 = sample(c(0:3), 50, replace = TRUE))
In my actual data, I have many more columns: approximately 50 for time and 10 for condition.
I want to multiply each time column by each condition column. For the sample data this should give 6 extra columns: time1_condition1, time1_condition2, time2_condition1, time2_condition2, time3_condition1, time3_condition2.
I tried the solutions suggested in this thread, but they did not work (presumably because I didn't understand how mapply/apply work and did not make the appropriate changes). I got an error message that the longer argument is not a multiple of the length of the shorter.
Any help would be greatly appreciated!
Upvotes: 1
Views: 107
Reputation: 50738
Here is a tidyverse alternative:
library(tidyverse)
idx.time <- grep("time", names(df), value = TRUE)
idx.cond <- grep("condition", names(df), value = TRUE)
bind_cols(
  df,
  map_dfc(
    transpose(expand.grid(idx.time, idx.cond, stringsAsFactors = FALSE)),
    ~ setNames(data.frame(df[, .x$Var1] * df[, .x$Var2]),
               paste(.x$Var1, .x$Var2, sep = "_"))))
# PPID time1 time2 time3 condition1 condition2 time1_condition1
#1 1 1 0 1 3 0 3
#2 2 0 1 1 0 1 0
#3 3 0 1 1 0 2 0
#4 4 0 0 1 0 3 0
#5 5 0 0 0 0 3 0
#...
Explanation: expand.grid creates all pairwise combinations of idx.time and idx.cond. transpose turns a list/data.frame inside-out and returns a list, similar to apply(..., 1, as.list); map_dfc then operates on every element of that list and column-binds the results.
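To make the pairing step concrete, here is a minimal base-R sketch of the same idea on a hand-made two-row data frame. The names toy, time_names, cond_names, pairs and products are illustrative only and not part of the answer above; Map stands in for map_dfc here.

```r
# Tiny stand-in for the question's data (values chosen by hand for illustration)
toy <- data.frame(time1 = c(1, 0), time2 = c(0, 1),
                  condition1 = c(2, 3), condition2 = c(1, 0))

time_names <- grep("^time", names(toy), value = TRUE)
cond_names <- grep("^condition", names(toy), value = TRUE)

# One row per (time, condition) pair; the first column (Var1) varies fastest
pairs <- expand.grid(time_names, cond_names, stringsAsFactors = FALSE)

# Multiply each pair of columns, then name each result "<time>_<condition>"
products <- Map(function(t, cnd) toy[[t]] * toy[[cnd]], pairs$Var1, pairs$Var2)
names(products) <- paste(pairs$Var1, pairs$Var2, sep = "_")
products <- as.data.frame(products)
```

Because the names are built from the same rows of pairs that drive the multiplication, each label stays attached to the right product regardless of how many time or condition columns there are.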
Upvotes: 2
Reputation: 79348
Using
library(tidyverse)
a = df[grep("time",names(df))]
b = df[grep("condition",names(df))]
we can do:
map(a,~.x*b)%>%
bind_cols()%>%
set_names(paste(rep(names(a),each=ncol(b)),names(b),sep="_"))
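or we can
# cross2() varies its first argument fastest, so the names cycle through names(a) first
# (note: cross2() and lift() are deprecated as of purrr 1.0.0)
cross2(a, b) %>%
  map(lift(`*`)) %>%
  set_names(paste(names(a), rep(names(b), each = ncol(a)), sep = "_")) %>%
  data.frame()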
time1_condition1 time2_condition1 time3_condition1 time1_condition2 time2_condition2 time3_condition2
1 3 0 3 2 0 2
2 3 3 0 1 1 0
3 0 0 0 0 0 0
4 3 3 0 0 0 0
5 0 0 2 0 0 1
6 0 0 1 0 0 1
7 2 2 0 0 0 0
Upvotes: 1
Reputation: 389325
#Get the indices of all "time" columns
time_cols <- grep("^time", names(df))
#Get the indices of all "condition" columns
condition_cols <- grep("^condition", names(df))
#Multiply each "time" column by all the "condition" columns
# and create a new data frame
new_df <- do.call("cbind", lapply(df[time_cols], function(x) x * df[condition_cols]))
#Combine both data frames
complete_df <- cbind(df, new_df)
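We can also generate the column names using expand.grid
# condition varies fastest here, which matches the column order of new_df
name_grid <- expand.grid(names(df)[condition_cols], names(df)[time_cols])
new_names <- paste(name_grid$Var2, name_grid$Var1, sep = "_")
colnames(complete_df)[(ncol(df) + 1):ncol(complete_df)] <- new_names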
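As a possible shortcut, the cbind/lapply step already produces names of the form time1.condition1 when there is more than one condition column, so the first dot can simply be swapped for an underscore. The sketch below assumes that default naming and uses a hand-made toy data frame; toy, t_cols, c_cols and prod_df are illustrative names only.

```r
# Hand-made toy data, for illustration only
toy <- data.frame(time1 = c(1, 0), time2 = c(0, 1),
                  condition1 = c(2, 3), condition2 = c(1, 0))

t_cols <- grep("^time", names(toy))
c_cols <- grep("^condition", names(toy))

# Each list element is a data frame of products for one time column;
# cbind prefixes the inner names with the list name, e.g. "time1.condition1"
prod_df <- do.call("cbind", lapply(toy[t_cols], function(x) x * toy[c_cols]))

# Replace the first "." in each name with "_" to get "time1_condition1"
names(prod_df) <- sub(".", "_", names(prod_df), fixed = TRUE)
```

This avoids hard-coding column positions, but it relies on the original column names not containing a dot themselves.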
Upvotes: 2