Reputation: 223
I am a new R user and I have the following problem:
I have data in two columns. The first column contains the markers and the second column contains the genotypes. Each genotype has, say, 4 markers. Thus, the name of genotype 1 appears 4 times next to its 4 markers, then genotype 2 follows with exactly the same 4 markers, and so on. But I want the markers in one column and each genotype in its own separate column, so I can compare the markers across the genotypes. I have no idea how to do it.
G1 has 4 markers, G2 has the same 4 markers etc.:
Marker Genotype
M1 G1
M2 G1
M3 G1
M4 G1
M1 G2
M2 G2
M3 G2
M4 G2
M1 G3
M2 G3
M3 G3
M4 G3
And I want R to do this:
Marker G1 G2 G3
M1 AA AA GG
M2 TT GG CC
M3 GG AA AA
M4 CC TT GG
That is, put each genotype in its own column so that comparing the markers is easy.
Does anyone have a bright idea of how that could work?
Thanks very much in advance. Marie
Upvotes: 2
Views: 911
Reputation: 6784
You want some sort of cast. For example
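library(reshape2)  # provides dcast()
# Long-format input: one row per Marker/Genotype pair
indata <- data.frame(Marker   = rep(c("M1","M2","M3","M4"), 3),
                     Genotype = rep(c("G1","G2","G3"), each=4),
                     value    = c("AA","TT","GG","CC","AA","GG","AA","TT","GG","CC","AA","GG"))
# Cast to wide format: one row per Marker, one column per Genotype
outdata <- dcast(indata, Marker ~ Genotype, value.var = "value")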
will take you from
> indata
Marker Genotype value
1 M1 G1 AA
2 M2 G1 TT
3 M3 G1 GG
4 M4 G1 CC
5 M1 G2 AA
6 M2 G2 GG
7 M3 G2 AA
8 M4 G2 TT
9 M1 G3 GG
10 M2 G3 CC
11 M3 G3 AA
12 M4 G3 GG
to
> outdata
Marker G1 G2 G3
1 M1 AA AA GG
2 M2 TT GG CC
3 M3 GG AA AA
4 M4 CC TT GG
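If installing reshape2 is not an option, the same long-to-wide step can probably be done with reshape() from base R. This is only a sketch under the assumption that the data look like the indata frame above (columns named Marker, Genotype and value); the name wide is just an illustrative choice, not something from the question.

```r
# Long-format input, same shape as `indata` in the answer above
indata <- data.frame(
  Marker   = rep(c("M1", "M2", "M3", "M4"), times = 3),
  Genotype = rep(c("G1", "G2", "G3"), each = 4),
  value    = c("AA", "TT", "GG", "CC", "AA", "GG",
               "AA", "TT", "GG", "CC", "AA", "GG"),
  stringsAsFactors = FALSE
)

# Base R reshape(): keep one row per Marker and spread Genotype into columns
wide <- reshape(indata,
                idvar     = "Marker",    # identifies a row in the wide table
                timevar   = "Genotype",  # its levels become the new columns
                v.names   = "value",     # the cell contents
                direction = "wide")

# reshape() names the new columns value.G1, value.G2, ...; drop the prefix
names(wide) <- sub("^value\\.", "", names(wide))
rownames(wide) <- NULL

print(wide)
```

Note that reshape() assumes each Marker/Genotype pair occurs only once; with duplicated pairs it keeps the first value and warns, so it is worth checking the input for duplicates first.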
Upvotes: 1