Reputation: 11
I have a table named data0
Sample Gene Frequency
1 sample1 gene1 18
2 sample1 gene2 1
3 sample1 gene3 1
4 sample1 gene4 14
5 sample1 gene5 8
6 sample2 gene1 7
7 sample2 gene4 10
8 sample2 gene5 4
9 sample3 gene1 10
10 sample3 gene3 3
11 sample3 gene6 1
12 sample3 gene4 9
I need to create another table named data1 where the column names correspond to the values of one of the columns (Gene). The result should be:
Sample Gene1 Gene2 Gene3 Gene4 Gene5 Gene6
1 Sample 1 18 1 1 14 8 NA
2 Sample 2 7 NA NA 10 4 NA
3 Sample 3 10 NA 3 9 NA 1
Upvotes: 0
Views: 178
Reputation: 887068
We can use dcast from data.table:
library(data.table)
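# value.var is stated explicitly so dcast does not have to guess the value column
dcast(setDT(data0), Sample ~ Gene, value.var = "Frequency")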
# Sample gene1 gene2 gene3 gene4 gene5 gene6
#1: sample1 18 1 1 14 8 NA
#2: sample2 7 NA NA 10 4 NA
#3: sample3 10 NA 3 9 NA 1
Or with xtabs from base R (missing Sample/Gene combinations are shown as 0 rather than NA):
xtabs(Frequency ~ Sample + Gene, data0)
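If NA is needed for the absent combinations while staying in base R, tapply may be an alternative to a contingency table. The following is only a sketch, not part of the original answer: it rebuilds data0 from the question so that it runs on its own, and the intermediate name wide is made up for illustration.

```r
# Same data as in the question, rebuilt here so the snippet is self-contained
data0 <- data.frame(
  Sample = rep(c("sample1", "sample2", "sample3"), times = c(5, 3, 4)),
  Gene = c("gene1", "gene2", "gene3", "gene4", "gene5",
           "gene1", "gene4", "gene5",
           "gene1", "gene3", "gene6", "gene4"),
  Frequency = c(18L, 1L, 1L, 14L, 8L, 7L, 10L, 4L, 10L, 3L, 1L, 9L)
)

# tapply builds a Sample x Gene matrix; cells with no data stay NA
# ("wide" is a hypothetical name used only in this sketch)
wide <- tapply(data0$Frequency, data0[c("Sample", "Gene")], sum)

# Turn the matrix into a data frame with Sample as an ordinary column
data1 <- data.frame(Sample = rownames(wide), wide, check.names = FALSE)
rownames(data1) <- NULL
data1
```

This assumes each Sample/Gene pair occurs at most once; if pairs repeat, sum adds their frequencies together.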
Data:
data0 <- structure(list(Sample = c("sample1", "sample1", "sample1", "sample1",
"sample1", "sample2", "sample2", "sample2", "sample3", "sample3",
"sample3", "sample3"), Gene = c("gene1", "gene2", "gene3", "gene4",
"gene5", "gene1", "gene4", "gene5", "gene1", "gene3", "gene6",
"gene4"), Frequency = c(18L, 1L, 1L, 14L, 8L, 7L, 10L, 4L, 10L,
3L, 1L, 9L)), class = "data.frame", row.names = c("1", "2", "3",
"4", "5", "6", "7", "8", "9", "10", "11", "12"))
Upvotes: 0
Reputation: 24790
One approach is pivot_wider from tidyr:
library(tidyr)
data1 <- pivot_wider(data0, names_from = Gene, values_from = Frequency)
data1
# A tibble: 3 x 7
# Sample gene1 gene2 gene3 gene4 gene5 gene6
# <fct> <int> <int> <int> <int> <int> <int>
#1 sample1 18 1 1 14 8 NA
#2 sample2 7 NA NA 10 4 NA
#3 sample3 10 NA 3 9 NA 1
If you're really set on the column names, you could fix the Gene column first, with mutate:
library(dplyr)
data1 <- data0 %>%
  # capitalise the first letter of each gene name before pivoting
  mutate(Gene = paste0(toupper(substr(Gene, 1, 1)),
                       substr(Gene, 2, nchar(as.character(Gene))))) %>%
  pivot_wider(names_from = Gene, values_from = Frequency)
data1
# A tibble: 3 x 7
# Sample Gene1 Gene2 Gene3 Gene4 Gene5 Gene6
# <fct> <int> <int> <int> <int> <int> <int>
#1 sample1 18 1 1 14 8 NA
#2 sample2 7 NA NA 10 4 NA
#3 sample3 10 NA 3 9 NA 1
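Another option might be to pivot first and rename the columns afterwards, which leaves the Gene values themselves untouched. This is a sketch rather than the only way to do it: it assumes dplyr 1.0.0 or later for rename_with, and it rebuilds data0 from the question so that it runs on its own.

```r
library(tidyr)
library(dplyr)

# Same data as in the question, rebuilt here so the snippet is self-contained
data0 <- data.frame(
  Sample = rep(c("sample1", "sample2", "sample3"), times = c(5, 3, 4)),
  Gene = c("gene1", "gene2", "gene3", "gene4", "gene5",
           "gene1", "gene4", "gene5",
           "gene1", "gene3", "gene6", "gene4"),
  Frequency = c(18L, 1L, 1L, 14L, 8L, 7L, 10L, 4L, 10L, 3L, 1L, 9L)
)

data1 <- data0 %>%
  pivot_wider(names_from = Gene, values_from = Frequency) %>%
  # rename only the new gene columns: "gene1" becomes "Gene1", and so on
  rename_with(~ sub("^gene", "Gene", .x), .cols = starts_with("gene"))
data1
```

The sub pattern is anchored with ^ so that only a leading "gene" is changed, and the Sample column is never touched because it is not selected by starts_with("gene").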
Upvotes: 1