Reputation: 590
I am using R to analyze multiple experiments whose results are stored in multiple CSV files. I run table()
on each file to tabulate the data and get results like the following:
Tabulations of Combination1.csv
A 1000
B 50
C 200
Tabulations of Combination2.csv
A 25
B 1500
D 30
Tabulations of Combination3.csv
B 19
C 500
E 2000
I want to build a single table that combines these tabulations:
Combination A B C D E
c1 1000 50 200 N/A N/A
c2 25 1500 N/A 30 N/A
c3 N/A 19 500 N/A 2000
Upvotes: 1
Views: 66
Reputation: 20329
Here's how I would do it using tidyr and plyr:
Data
c1 <- rep(LETTERS[1:3], c(1000, 50, 200))
c2 <- rep(LETTERS[c(1:2, 4)], c(25, 1500, 30))
c3 <- rep(LETTERS[c(2:3, 5)], c(19, 500, 2000))
Code
library(tidyr)
library(plyr)
allC <- list(c1 = c1, c2 = c2, c3 = c3)
# get all tables in data.frame format
ldply(names(allC), function(x) {
  tab <- table(allC[[x]])
  data.frame(Combination = x, element = names(tab), Freq = c(tab))
}) %>% spread(element, Freq)
# Combination A B C D E
# 1 c1 1000 50 200 NA NA
# 2 c2 25 1500 NA 30 NA
# 3 c3 NA 19 500 NA 2000
Explanation
First, convert each table to a data.frame and add a column holding the name of the combination it came from. Then use spread to turn the element values into columns, giving one row per combination; elements missing from a combination come out as NA.
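For comparison, the same two steps (tabulate each combination, then line the counts up by element) can be sketched in base R without any packages. This is only an illustrative sketch, not the method used above: it swaps spread for match() plus rbind(), and the names tabs, elements, aligned and combined are made up for the example.

```r
# Example data reproducing the counts from the question
allC <- list(
  c1 = rep(LETTERS[1:3], c(1000, 50, 200)),
  c2 = rep(LETTERS[c(1:2, 4)], c(25, 1500, 30)),
  c3 = rep(LETTERS[c(2:3, 5)], c(19, 500, 2000))
)

# Step 1: tabulate each combination separately
tabs <- lapply(allC, table)

# Step 2: collect every element that occurs in at least one table
elements <- sort(unique(unlist(lapply(tabs, names))))

# Step 3: align each table to the full element set; match() returns NA
# for elements a table does not contain, so those cells become NA
aligned <- lapply(tabs, function(tab) {
  setNames(as.vector(tab)[match(elements, names(tab))], elements)
})

# Step 4: stack the aligned count vectors, one row per combination
combined <- do.call(rbind, aligned)
combined
```

The result is an integer matrix with the combinations as row names rather than a data.frame with a Combination column; wrap it in as.data.frame() if a data frame is needed.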
Upvotes: 1
Reputation: 3029
library(dplyr)
library(tidyr)
x <- table(c(rep("A", 1000), rep("B", 50), rep("C", 200)))
y <- table(c(rep("A", 25), rep("B", 1500), rep("D", 30)))
z <- table(c(rep("B", 19), rep("C", 500), rep("E", 2000)))
X <- data.frame(x) %>% spread(Var1, Freq)
Y <- data.frame(y) %>% spread(Var1, Freq)
Z <- data.frame(z) %>% spread(Var1, Freq)
X %>% full_join(Y) %>% full_join(Z) %>%
  mutate(Combination = paste0("c", seq(1,3)))
Result:
> X %>% full_join(Y) %>% full_join(Z) %>%
+ mutate(Combination = paste0("c", seq(1,3)))
Joining, by = c("A", "B")
Joining, by = c("B", "C")
A B C D E Combination
1 1000 50 200 NA NA c1
2 25 1500 NA 30 NA c2
3 NA 19 500 NA 2000 c3
Next time, please provide the x, y and z objects so the example is reproducible :)
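One caveat with the chained full_join calls: they match rows on whatever columns the tables happen to share, so two combinations with identical counts in their shared columns would be merged into a single row. A possible alternative, sketched here under the assumption that dplyr is available, is bind_rows(), which stacks rows without matching on values. The helper table_to_row is hypothetical and written for this example; it is not part of dplyr or tidyr.

```r
library(dplyr)

# Same tabulations as in the answer above
x <- table(rep(c("A", "B", "C"), c(1000, 50, 200)))
y <- table(rep(c("A", "B", "D"), c(25, 1500, 30)))
z <- table(rep(c("B", "C", "E"), c(19, 500, 2000)))

# Hypothetical helper: turn a one-way table into a one-row data frame
# with one column per element
table_to_row <- function(tab) {
  as.data.frame(setNames(as.list(as.vector(tab)), names(tab)))
}

# bind_rows() stacks the rows, fills columns absent from a table with NA,
# and stores the list names in the column named by .id
combined <- bind_rows(
  list(c1 = table_to_row(x), c2 = table_to_row(y), c3 = table_to_row(z)),
  .id = "Combination"
)
combined
```

Because the combination labels come from the list names, this also avoids building them separately with paste0().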
Upvotes: 0