Reputation: 21
I am trying to split the replicates in my dataset into separate columns in an efficient way. I used this data as an example:
x <- data.frame(sample = c("AA", "AA", "BB", "BB", "CC", "CC"),
Gene = c("HSA-let1","HSA-let1","HSA-let1","HSA-let1","HSA-let1","HSA-let1"),
Cq = c(14.55, 14.45, 13.55, 13.45, 16.55, 16.45))
The problem is that the two replicates have the same values in "sample" and "Gene", so when I tried:
spread(x,Gene,Cq)
I get a duplicate identifiers error. I have tried the fix below, and it gives two values in one column separated by ",". This was almost successful, but I want them in separate columns:
x_test <- dcast(setDT(x), Gene ~ sample, value.var = 'Cq',
fun.aggregate = function(x) toString(unique(x)))
I also tried this tidyr solution, but I don't understand enough R to make it work:
x_test2 <- x %>%
gather(variable, value, -(Gene:Cq)) %>%
unite(temp, Cq, variable) %>%
spread(temp, value)
I want my dataset to look like this:
# Gene AA_1 AA_2 BB_1 BB_2 CC_1 CC_2
# HSA-let 14.55 14.45 13.55 13.45 16.55 16.45
Upvotes: 2
Views: 119
Reputation: 2612
You can change the sample column:
library(data.table)
setDT(x)[, sample := paste(sample, ifelse(!duplicated(sample), '1', '2'), sep = '_')]
dcast(x, ...~sample, value.var = 'Cq')
# Gene AA_1 AA_2 BB_1 BB_2 CC_1 CC_2
# 1: HSA-let1 14.55 14.45 13.55 13.45 16.55 16.45
Note: spread should be called as spread(x, sample, Cq).
If you have a different number of repeated values (not always 2), you can do:
x <- setDT(x)[order(sample),]
x[, sample := paste(sample, unlist(lapply(table(x$sample), function(x) 1:x)), sep = '_')]
dcast(x, ...~sample, value.var = 'Cq')
Beware that x should be sorted by sample.
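As an alternative sketch that does not depend on the row order, data.table's rowid() can number the replicates directly. It rebuilds the example data from the question so that it is self-contained, and it assumes replicates should be counted within each Gene/sample pair; the name wide is only illustrative.

```r
library(data.table)

# Rebuild the example data from the question.
x <- data.frame(sample = c("AA", "AA", "BB", "BB", "CC", "CC"),
                Gene = rep("HSA-let1", 6),
                Cq = c(14.55, 14.45, 13.55, 13.45, 16.55, 16.45),
                stringsAsFactors = FALSE)
setDT(x)

# rowid() counts the occurrences of each Gene/sample combination in order of
# appearance, so the table does not need to be sorted first.
x[, sample := paste(sample, rowid(Gene, sample), sep = "_")]

# One row per Gene, one column per numbered sample.
wide <- dcast(x, Gene ~ sample, value.var = "Cq")
print(wide)
```

Counting within Gene and sample, rather than sample alone, should keep the numbering at 1, 2, ... per gene if the real data contains more than one gene.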
Upvotes: 1
Reputation: 56159
Make the samples unique, then spread:
library(dplyr)
library(tidyr)
x %>%
group_by(sample) %>%
mutate(rn = row_number()) %>%
ungroup() %>%
mutate(sample = paste(sample, rn, sep = "_")) %>%
select(-rn) %>%
spread(key = sample, value = Cq)
# # A tibble: 1 x 7
# Gene AA_1 AA_2 BB_1 BB_2 CC_1 CC_2
# <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 HSA-let1 14.6 14.4 13.6 13.4 16.6 16.4
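The same idea can be sketched with pivot_wider(), which tidyr (1.0.0 and later) recommends over spread(). This is only an illustration: it rebuilds the question's data, groups by both Gene and sample (an assumption that matters only when there is more than one gene), and uses the hypothetical names rn and wide.

```r
library(dplyr)
library(tidyr)

# Rebuild the example data from the question.
x <- data.frame(sample = c("AA", "AA", "BB", "BB", "CC", "CC"),
                Gene = rep("HSA-let1", 6),
                Cq = c(14.55, 14.45, 13.55, 13.45, 16.55, 16.45),
                stringsAsFactors = FALSE)

wide <- x %>%
  group_by(Gene, sample) %>%
  mutate(rn = row_number()) %>%  # replicate number within each Gene/sample pair
  ungroup() %>%
  # Column names are built from sample and rn, joined by "_" (the default names_sep).
  pivot_wider(names_from = c(sample, rn), values_from = Cq)

print(wide)
```

Because names_from accepts several columns, the separate paste() and select() steps are not needed here.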
Upvotes: 2
Reputation: 301
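You can try this:
library(dplyr)
library(tidyr)
x %>% group_by(Gene, sample) %>%
# number the replicates within each Gene/sample pair
mutate(rn = seq(n())) %>%
ungroup() %>%
mutate(sample = paste(sample, rn, sep = "_"), rn = NULL) %>%
spread(sample, Cq)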
Upvotes: 0