thehand0
thehand0

Reputation: 1163

All possible combinations of variable R

I have a list of variables and values for several subjects in a data frame (see below). I am using R/R Studio and would like to reformat the data to return every possible combination of variables, as well as their corresponding values for each subject. The actual dataset is much larger than this, so any way of doing it programmatically would be greatly appreciated.

Thankyou

My current data

Variable Subject Value
  V1       A      1
  V1       B      0
  V1       C      1
  V2       A      0
  V2       B      1
  V2       C      0
  V3       A      1
  V3       B      1
  V3       C      1

My desired output:

Subject Variable 1     Variable 2      Value 1      Value 2
   A        V1              V2            1            0
   A        V1              V3            1            1
   A        V2              V3            0            1   
   B        V1              V2            0            1
   B        V1              V3            0            1
   B        V2              V3            1            1
   C        V1              V2            1            0
   C        V1              V3            1            1
   C        V2              V3            0            1

Upvotes: 1

Views: 1144

Answers (2)

AnilGoyal
AnilGoyal

Reputation: 26218

A base R solution, in R version 4.1.0 using native pipes

df <- read.table(text = 'Variable Subject Value
  V1       A      1
  V1       B      0
  V1       C      1
  V2       A      0
  V2       B      1
  V2       C      0
  V3       A      1
  V3       B      1
  V3       C      1', header = T)


t(combn(unique(df$Variable), 2)) |> as.data.frame() |>
  setNames(c('Variable1', 'Variable2')) |>
  merge(df, by.x = 'Variable1', by.y = 'Variable') |>
  merge(df, by.x = c('Variable2', 'Subject'), by.y = c('Variable', 'Subject'), suffixes = c('1', '2'))

#>   Variable2 Subject Variable1 Value1 Value2
#> 1        V2       A        V1      1      0
#> 2        V2       B        V1      0      1
#> 3        V2       C        V1      1      0
#> 4        V3       A        V2      0      1
#> 5        V3       A        V1      1      1
#> 6        V3       B        V1      0      1
#> 7        V3       B        V2      1      1
#> 8        V3       C        V1      1      1
#> 9        V3       C        V2      0      1

check it with desired

t(combn(unique(df$Variable), 2)) |> as.data.frame() |>
  setNames(c('Variable1', 'Variable2')) |>
  merge(df, by.x = 'Variable1', by.y = 'Variable') |>
  merge(df, by.x = c('Variable2', 'Subject'), by.y = c('Variable', 'Subject'), suffixes = c('1', '2')) -> df_new

df_new[order(df_new$Subject),c(2,3,1,4,5)]  
#>   Subject Variable1 Variable2 Value1 Value2
#> 1       A        V1        V2      1      0
#> 4       A        V2        V3      0      1
#> 5       A        V1        V3      1      1
#> 2       B        V1        V2      0      1
#> 6       B        V1        V3      0      1
#> 7       B        V2        V3      1      1
#> 3       C        V1        V2      1      0
#> 8       C        V1        V3      1      1
#> 9       C        V2        V3      0      1

Created on 2021-05-19 by the reprex package (v2.0.0)

Upvotes: 2

Ronak Shah
Ronak Shah

Reputation: 388817

You can use combn to create all possible combinations of Variable and Value columns for each Subject.

library(dplyr)
library(tidyr)

create_data_frame <- function(var, val) {
  tab1 <- setNames(data.frame(t(combn(var, 2))), paste0('Variable', 1:2))
  tab2 <- setNames(data.frame(t(combn(val, 2))), paste0('Value', 1:2))
  cbind(tab1, tab2)
}

df %>%
  group_by(Subject) %>%
  summarise(data = list(create_data_frame(Variable, Value))) %>%
  unnest(col = data)

#  Subject Variable1 Variable2 Value1 Value2
#  <chr>   <chr>     <chr>      <int>  <int>
#1 A       V1        V2             1      0
#2 A       V1        V3             1      1
#3 A       V2        V3             0      1
#4 B       V1        V2             0      1
#5 B       V1        V3             0      1
#6 B       V2        V3             1      1
#7 C       V1        V2             1      0
#8 C       V1        V3             1      1
#9 C       V2        V3             0      1

Upvotes: 3

Related Questions