Wanderer

Reputation: 590

R Merge Tabulations of data from different files

I am using R to analyze multiple experiments whose results are stored in multiple CSV files. I run table() on each file to tabulate the data and get results like the following:

Tabulations of Combination1.csv
A  1000
B  50
C  200            
Tabulations of Combination2.csv
A 25
B 1500
D 30
Tabulations of Combination3.csv
B 19
C 500
E 2000

I want to build a table that combines these tabulations.

Combination A     B     C     D    E
c1          1000   50    200   N/A  N/A
c2          25    1500   N/A   30   N/A
c3          N/A    19    500   N/A  2000    

Upvotes: 1

Views: 66

Answers (2)

thothal

Reputation: 20329

Here's how I would do it using tidyr and plyr:

Data

c1 <- rep(LETTERS[1:3], c(1000, 50, 200))
c2 <- rep(LETTERS[c(1:2, 4)], c(25, 1500, 30))
c3 <- rep(LETTERS[c(2:3, 5)], c(19, 500, 2000))

Code

library(tidyr)
library(plyr)
allC <- list(c1 = c1, c2 = c2, c3 = c3)
# get all tables in data.frame format
ldply(names(allC), function(x) {
   tab <- table(allC[[x]]) 
   data.frame(Combination = x, element = names(tab), Freq = c(tab))
}) %>% spread(element, Freq)

#   Combination    A    B   C  D    E
# 1          c1 1000   50 200 NA   NA
# 2          c2   25 1500  NA 30   NA
# 3          c3   NA   19 500 NA 2000

Explanation

First, convert each table to a data.frame and add a Combination column holding the name of the list element it came from. Then use spread to turn each distinct element value into its own column; combinations that lack an element get NA.
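
In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). Below is a minimal sketch of the same long-then-wide approach, assuming a recent tidyr is installed. It uses base R's lapply() and rbind() in place of ldply(), and the names long and wide are only illustrative:

```r
library(tidyr)

# Same example data as above
c1 <- rep(LETTERS[1:3], c(1000, 50, 200))
c2 <- rep(LETTERS[c(1:2, 4)], c(25, 1500, 30))
c3 <- rep(LETTERS[c(2:3, 5)], c(19, 500, 2000))
allC <- list(c1 = c1, c2 = c2, c3 = c3)

# One long data.frame: a row per (Combination, element) pair
long <- do.call(rbind, lapply(names(allC), function(x) {
  tab <- table(allC[[x]])
  data.frame(Combination = x, element = names(tab), Freq = as.vector(tab))
}))

# Each element becomes a column; missing pairs are filled with NA
wide <- pivot_wider(long, names_from = element, values_from = Freq)
wide
```

If zeros are preferred over NA for absent elements, pivot_wider() accepts a values_fill argument, for example values_fill = 0.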

Upvotes: 1

Costin

Reputation: 3029

library(dplyr)
library(tidyr)

x <- table(c(rep("A", 1000), rep("B", 50), rep("C", 200)))
y <- table(c(rep("A", 25), rep("B", 1500), rep("D", 30)))
z <- table(c(rep("B", 19), rep("C", 500), rep("E", 2000)))

X <- data.frame(x) %>% spread(Var1, Freq)
Y <- data.frame(y) %>% spread(Var1, Freq)
Z <- data.frame(z) %>% spread(Var1, Freq)

X %>% full_join(Y) %>% full_join(Z) %>%
  mutate(Combination = paste0("c", seq(1,3)))

Result:

> X %>% full_join(Y) %>% full_join(Z) %>%
+ mutate(Combination = paste0("c", seq(1,3)))
Joining, by = c("A", "B")
Joining, by = c("B", "C")
     A    B   C  D    E Combination
1 1000   50 200 NA   NA          c1
2   25 1500  NA 30   NA          c2
3   NA   19 500 NA 2000          c3
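
One caveat with the pairwise full_join() above: with no by argument it joins on every shared column, so rows are matched on the counts themselves. That works here because no counts coincide, but two tables with identical counts in their shared columns would be merged into one row, and two tables with no category in common would make the join fail. A minimal base R sketch that avoids joining altogether, by aligning each table to the union of category names (tabs, lev, counts and wide are illustrative names, not part of the answer above):

```r
x <- table(c(rep("A", 1000), rep("B", 50), rep("C", 200)))
y <- table(c(rep("A", 25), rep("B", 1500), rep("D", 30)))
z <- table(c(rep("B", 19), rep("C", 500), rep("E", 2000)))
tabs <- list(c1 = x, c2 = y, c3 = z)

# Union of all category names across the tables
lev <- sort(unique(unlist(lapply(tabs, names))))

# Look up each category by name; match() gives NA where a table lacks it
counts <- vapply(tabs,
                 function(tab) as.vector(tab)[match(lev, names(tab))],
                 integer(length(lev)))

# vapply() returns categories x tables, so transpose to one row per table
wide <- data.frame(Combination = names(tabs), t(counts))
names(wide)[-1] <- lev
wide
```

This works for any number of tables: add them to the tabs list and the rest stays the same.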

Next time, please provide objects like x, y and z in the question so that the example is reproducible :)

Upvotes: 0
