Zac

Reputation: 13

Reorganize matrix or data frame in R

I have a data frame in R where the rows are gene names and the columns are gene ontology IDs, so that it looks like this:

Gene     V1    V2    V3      
Gene 1   GO1   GO2   GO3      
Gene 2   GO2   
Gene 3   GO2   GO3   

I'm trying to rearrange it so that the rows are unique gene ontology IDs, and each gene that matches those IDs is in a separate column in that row:

GO    V1       V2       V3
GO1   Gene 1
GO2   Gene 1   Gene 2   Gene 3
GO3   Gene 1   Gene 3

I looked into reshape2, but it doesn't seem to be useful for this kind of reorganization. Is there a simple way to do this that I'm overlooking?

Thanks for the help!

Upvotes: 1

Views: 82

Answers (1)

akrun

Reputation: 887118

This can be done with melt/dcast from data.table. Convert the data.frame to a data.table with setDT(df1), reshape it into long format with melt, drop the rows where 'GO' is blank, and then dcast from long back to wide:

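library(data.table)
# setDT() converts df1 to a data.table by reference (df1 itself is modified).
# melt: wide -> long, one row per Gene/GO pair; [GO != ""] drops the blank cells;
# dcast: long -> wide, one row per GO term, blank-filled where no gene applies.
dcast(melt(setDT(df1), id.vars = "Gene", value.name = "GO")[GO != ""],
      GO ~ paste0("V", rowid(variable)), value.var = "Gene", fill = "")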
#    GO     V1     V2     V3
#1: GO1 Gene 1              
#2: GO2 Gene 1 Gene 2 Gene 3
#3: GO3 Gene 1 Gene 3       
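
For anyone who wants to run this end to end, here is a minimal sketch of the same melt/dcast idea split into separate steps. The df1 below is reconstructed from the table in the question and assumes the blank cells are empty strings; the names long and wide are just illustrative. This variant numbers the genes with rowid(GO) instead of rowid(variable), and sorts first, so each gene sharing a GO term should get its own column even when the rows carry different numbers of terms. Treat it as one possible approach rather than the only way to do it.

```r
library(data.table)

# Example input reconstructed from the question; blank cells are assumed
# to be empty strings ("") rather than NA.
df1 <- data.frame(
  Gene = c("Gene 1", "Gene 2", "Gene 3"),
  V1   = c("GO1", "GO2", "GO2"),
  V2   = c("GO2", "",    "GO3"),
  V3   = c("GO3", "",    ""),
  stringsAsFactors = FALSE
)

# Step 1: wide -> long, one row per (Gene, GO) pair.
# as.data.table() makes a copy, so df1 is left unchanged.
long <- melt(as.data.table(df1), id.vars = "Gene", value.name = "GO")

# Step 2: drop the padding cells, then sort so that genes appear in
# order within each GO term.
long <- long[!is.na(GO) & GO != ""][order(GO, Gene)]

# Step 3: long -> wide. rowid(GO) counts 1, 2, 3, ... within each GO term,
# which becomes the column label V1, V2, V3, ...
wide <- dcast(long, GO ~ paste0("V", rowid(GO)), value.var = "Gene", fill = "")
print(wide)
```

The order() call in step 2 is only there to keep the genes in a predictable order within each row; dropping it still gives one gene per cell, just possibly in a different column order.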

Upvotes: 2
