Reputation: 481
In a datatable as follow:
CHR POS ID indv1 indv2 indv3
1 31439_T_A ENSG00000232325 1 1 2
1 31504_G_A ENSG00000242474 2 1 1
1 34713_A_C ENSG00000242474 1 0 1
1 34918_C_T ENSG00000242474 1 0 1
1 35119_G_A ENSG00000239715 0 0 1
For each group of unique IDs in column ID
, I want to build a dataframe in which the rows are the indv1, indv2, ...
and the other columns are the corresponding values of all variants, column POS
.
UPDATE: The rows in output will be indvs and columns will be POSs. The output should be prepared separately for each group of IDs in ID
column since the number of variations in POS
could be different for each IDs so it might be a list.
Any hints or suggestion would be appreciated.
Upvotes: 0
Views: 60
Reputation: 79164
Update due to OP comment: Maybe this:
library(tidyr)
library(dplyr)
df %>%
group_by(ID) %>%
pivot_longer(
cols = starts_with("indv")
) %>%
pivot_wider(
names_from = POS,
values_from = value
)
CHR ID name `31439_T_A` `31504_G_A` `34713_A_C` `34918_C_T` `35119_G_A`
<int> <chr> <chr> <int> <int> <int> <int> <int>
1 1 ENSG00000232325 indv1 1 NA NA NA NA
2 1 ENSG00000232325 indv2 1 NA NA NA NA
3 1 ENSG00000232325 indv3 2 NA NA NA NA
4 1 ENSG00000242474 indv1 NA 2 1 1 NA
5 1 ENSG00000242474 indv2 NA 1 0 0 NA
6 1 ENSG00000242474 indv3 NA 1 1 1 NA
7 1 ENSG00000239715 indv1 NA NA NA NA 0
8 1 ENSG00000239715 indv2 NA NA NA NA 0
9 1 ENSG00000239715 indv3 NA NA NA NA 1
First answer:
We could use pivot_longer
:
library(tidyr)
library(dplyr)
df %>%
group_by(ID) %>%
pivot_longer(
cols = starts_with("indv")
)
CHR POS ID name value
<dbl> <chr> <chr> <chr> <dbl>
1 1 31439_T_A ENSG00000232325 indv1 1
2 1 31439_T_A ENSG00000232325 indv2 1
3 1 31439_T_A ENSG00000232325 indv3 2
4 1 31504_G_A ENSG00000242474 indv1 2
5 1 31504_G_A ENSG00000242474 indv2 1
6 1 31504_G_A ENSG00000242474 indv3 1
7 1 34713_A_C ENSG00000242474 indv1 1
8 1 34713_A_C ENSG00000242474 indv2 0
9 1 34713_A_C ENSG00000242474 indv3 1
10 1 34918_C_T ENSG00000242474 indv1 1
11 1 34918_C_T ENSG00000242474 indv2 0
12 1 34918_C_T ENSG00000242474 indv3 1
13 1 35119_G_A ENSG00000239715 indv1 0
14 1 35119_G_A ENSG00000239715 indv2 0
15 1 35119_G_A ENSG00000239715 indv3 1
Upvotes: 2