tomas hujo
tomas hujo

Reputation: 153

Looping over all possible column combinations with some contstraints

xx is a sample data. it contains variables dep1,dep2,dep3,bet1,bet2,bet3. I want to select all possible 2 columns combinations but not the ones with the same name (except,number) . In this examples there are 9 such combos {dep1:bet1,dep1:bet2,dep1:bet3,dep2:bet1...........}

Below is code which I want to run for all combinations ( I did it just for one) also in last line I added a code to keep track which variables were included in calculations. I believe the regex will help to understand. help appreciated !

xx<-data.frame(id=1:10,
               category=c(rep("A",5),rep("B",5)),
               dep1=sample(1:5,10,replace = T),
               dep2=sample(1:5,10,replace = T),
               dep3=sample(1:5,10,replace = T),
               bet1=sample(1:5,10,replace = T),
               bet2=sample(1:5,10,replace = T),
               bet3=sample(1:5,10,replace = T))

xx%>%select(2,dep1,bet1)%>%
  mutate(vdep=if_else(dep1>3,1,0),
        vbet=if_else(bet1>3,1,0))%>%
  group_by(category)%>%
  summarise(vdep=mean(vdep),
            vbet=mean(vbet))%>%ungroup()%>%
  gather(variable,value,-category)%>%
  mutate(variable=as.factor(variable))%>%
  unite(variable,category,col = "new")%>%
  spread(new,value)%>%
  mutate(first="dep1",second="bet1")

Upvotes: 0

Views: 36

Answers (1)

Liman
Liman

Reputation: 1300

If I understand you correctly, something like the following should do it:

# the data 
xx<-data.frame(id=1:10,
               category=c(rep("A",5),rep("B",5)),
               dep1=sample(1:5,10,replace = T),
               dep2=sample(1:5,10,replace = T),
               dep3=sample(1:5,10,replace = T),
               bet1=sample(1:5,10,replace = T),
               bet2=sample(1:5,10,replace = T),
               bet3=sample(1:5,10,replace = T))

# Getting the column names with "dep" or "bet"
cols = names(xx)[grepl("dep|bet", names(xx))]
deps = cols[grepl("dep", cols)]
bets = cols[grepl("bet", cols)]

# Getting all possible combinations of these columns
comb = expand.grid(deps, bets)
comb

#   Var1 Var2
# 1 dep1 bet1
# 2 dep2 bet1
# 3 dep3 bet1
# 4 dep1 bet2
# 5 dep2 bet2
# 6 dep3 bet2
# 7 dep1 bet3
# 8 dep2 bet3
# 9 dep3 bet3

# Transposing the dataframe containing these combinations, so that
# we can directly use sapply / lapply on the columns latter
comb = data.frame(t(comb), stringsAsFactors = FALSE)

# For each combination, subset the dataframe xx
result = sapply(comb, function(x){
  xx[, x]
}, simplify = FALSE)

result

# $X1
#     dep1 bet1
# 1     1    5
# 2     1    5
# 3     2    2
# 4     2    2
# 5     1    5
# 6     3    3
# 7     1    1
# 8     2    2
# 9     3    2
# 10    1    5
# 
# $X2
#     dep2 bet1
# 1     1    5
# 2     2    5
# 3     4    2
# 4     5    2
# 5     1    5
# 6     5    3
# 7     2    1
# 8     1    2
# 9     4    2
# 10    4    5
# 
# $X3
#     dep3 bet1
# 1     3    5
# 2     2    5
# 3     4    2
# 4     3    2
# 5     3    5
# 6     2    3
# 7     1    1
# 8     4    2
# 9     5    2
# 10    5    5
# 
# $X4
#     dep1 bet2
# 1     1    5
# 2     1    1
# 3     2    1
# 4     2    2
# 5     1    2
# 6     3    2
# 7     1    3
# 8     2    3
# 9     3    5
# 10    1    1
# 
# $X5
#     dep2 bet2
# 1     1    5
# 2     2    1
# 3     4    1
# 4     5    2
# 5     1    2
# 6     5    2
# 7     2    3
# 8     1    3
# 9     4    5
# 10    4    1
# 
# $X6
#      dep3 bet2
# 1     3    5
# 2     2    1
# 3     4    1
# 4     3    2
# 5     3    2
# 6     2    2
# 7     1    3
# 8     4    3
# 9     5    5
# 10    5    1
# 
# $X7
#      dep1 bet3
# 1     1    3
# 2     1    2
# 3     2    5
# 4     2    1
# 5     1    3
# 6     3    2
# 7     1    4
# 8     2    1
# 9     3    1
# 10    1    3
# 
# $X8
#     dep2 bet3
# 1     1    3
# 2     2    2
# 3     4    5
# 4     5    1
# 5     1    3
# 6     5    2
# 7     2    4
# 8     1    1
# 9     4    1
# 10    4    3
# 
# $X9
#     dep3 bet3
# 1     3    3
# 2     2    2
# 3     4    5
# 4     3    1
# 5     3    3
# 6     2    2
# 7     1    4
# 8     4    1
# 9     5    1
# 10    5    3

Upvotes: 2

Related Questions