marie

Reputation: 223

Converting data: data in one column into several columns

I am a new R user and I have the following problem:

I have data in two columns. The first column contains the markers and the second column contains the genotypes. Each genotype has, say, 4 markers. Thus, the name of genotype 1 appears 4 times alongside its 4 markers, then genotype 2 follows with exactly the same 4 markers, and so on. But I want the markers in one column and the genotypes each in a separate column, so I can compare the markers across the genotypes. I have no idea how to do it.

G1 has 4 markers, G2 has the same 4 markers etc.:

Marker  Genotype
M1  G1
M2  G1
M3  G1
M4  G1
M1  G2
M2  G2
M3  G2
M4  G2
M1  G3
M2  G3
M3  G3
M4  G3

And I want R to do this:

Marker  G1  G2  G3
M1  AA  AA  GG
M2  TT  GG  CC
M3  GG  AA  AA
M4  CC  TT  GG

Put each genotype in one column so that the comparison of markers is very easy.

Does anyone have a bright idea of how that could work?

Thanks very much in advance. Marie

Upvotes: 2

Views: 911

Answers (1)

Henry

Reputation: 6784

You want some sort of cast. For example

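library(reshape2)  # provides dcast()

# Example data in long format: one row per Marker/Genotype pair
indata <- data.frame(  Marker = rep(c("M1","M2","M3","M4"), 3),
     Genotype = rep(c("G1","G2","G3"), each=4),
     value = c("AA","TT","GG","CC","AA","GG","AA","TT","GG","CC","AA","GG") )

# Cast to wide format: one row per Marker, one column per Genotype
outdata <- dcast(indata, Marker ~ Genotype, value.var = "value")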

will take you from

> indata
   Marker Genotype value
1      M1       G1    AA
2      M2       G1    TT
3      M3       G1    GG
4      M4       G1    CC
5      M1       G2    AA
6      M2       G2    GG
7      M3       G2    AA
8      M4       G2    TT
9      M1       G3    GG
10     M2       G3    CC
11     M3       G3    AA
12     M4       G3    GG

to

> outdata
  Marker G1 G2 G3
1     M1 AA AA GG
2     M2 TT GG CC
3     M3 GG AA AA
4     M4 CC TT GG
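
If installing reshape2 is not an option, the same long-to-wide step can probably be done with base R's reshape(). This is only a sketch and not part of the original answer. It assumes the same three-column layout as indata above, including the value column, which the question's two-column listing does not show. The name wide is just an illustrative choice.

```r
# Same example data as above; the 'value' column is an assumption
# about how the allele calls are stored.
indata <- data.frame(
  Marker   = rep(c("M1", "M2", "M3", "M4"), 3),
  Genotype = rep(c("G1", "G2", "G3"), each = 4),
  value    = c("AA", "TT", "GG", "CC", "AA", "GG", "AA", "TT", "GG", "CC", "AA", "GG")
)

# Long to wide: one row per Marker, one column per Genotype.
wide <- reshape(indata, idvar = "Marker", timevar = "Genotype", direction = "wide")

# reshape() names the new columns value.G1, value.G2, ...; strip that prefix
# and reset the row names so the result matches the layout asked for.
names(wide) <- sub("^value\\.", "", names(wide))
rownames(wide) <- NULL

print(wide)
```

With one column per genotype, comparing markers becomes a plain column comparison, for example wide$G1 == wide$G2.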

Upvotes: 1
