Giulia

Reputation: 3

From long to wide format with multiple variables in R

I have a table in long format like this:

gene  tissue tpm
  A   liver   5
  A   brain   2
  B   ovary   10
  B   brain   1
  C   brain   15
  C   liver   6

I'd like to convert it into a wider format:

gene tissue1 tissue2 tpm1 tpm2
  A  liver   brain    5    2
  B  ovary   brain    10   1
  C  brain   liver    15   6

I have tried with dcast and spread but I get this result:

gene  liver brain ovary
 A      5     2     NA
 B      NA    1     10
 C      6     15    NA

Which is NOT what I want.

Thank you!

Upvotes: 0

Views: 42

Answers (1)

Aliton Oliveira

Reputation: 1349

I am not aware of a function in R that solves this in a single call, but you can use a for loop to rearrange your data frame. Note that the loop assumes every gene has exactly two rows.

The code is presented below:

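data <- data.frame(gene = c("A", "A", "B", "B", "C", "C"),
                   tissue = c("liver", "brain", "ovary", "brain", "brain", "liver"),
                   tpm = c(5, 2, 10, 1, 15, 6),
                   stringsAsFactors = FALSE)  # keep strings as character in R < 4.0

gene.unique <- unique(data$gene)
n <- length(gene.unique)

# Create the output vectors first; assigning to tissue1[i] fails if tissue1 does not exist
tissue1 <- character(n)
tissue2 <- character(n)
tpm1 <- numeric(n)
tpm2 <- numeric(n)

# For each gene, take its first and second row (assumes exactly two rows per gene)
for (i in seq_along(gene.unique)) {
  genes.idx <- which(data$gene == gene.unique[i])
  tissue1[i] <- data$tissue[genes.idx[1]]
  tissue2[i] <- data$tissue[genes.idx[2]]
  tpm1[i] <- data$tpm[genes.idx[1]]
  tpm2[i] <- data$tpm[genes.idx[2]]
}

data.final <- data.frame(gene = gene.unique, tissue1, tissue2, tpm1, tpm2)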

  gene tissue1 tissue2 tpm1 tpm2
1    A   liver   brain    5    2
2    B   ovary   brain   10    1
3    C   brain   liver   15    6
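
If you would rather avoid the loop, here is a sketch of an alternative using base R's reshape(). It is not part of the approach above: it first adds a helper column that numbers the rows within each gene, then spreads tissue and tpm over that number. The column name idx is just a name chosen for this example, and the code assumes the row order within each gene decides which tissue becomes tissue1 and which tissue2.

```r
data <- data.frame(gene = c("A", "A", "B", "B", "C", "C"),
                   tissue = c("liver", "brain", "ovary", "brain", "brain", "liver"),
                   tpm = c(5, 2, 10, 1, 15, 6),
                   stringsAsFactors = FALSE)

# Helper column (hypothetical name): 1 for the first row of each gene, 2 for the second, ...
data$idx <- ave(seq_along(data$gene), data$gene, FUN = seq_along)

# Spread every remaining column (tissue, tpm) across the values of idx
wide <- reshape(data, idvar = "gene", timevar = "idx",
                direction = "wide", sep = "")

# reshape() interleaves the columns (tissue1, tpm1, tissue2, tpm2); reorder them
wide <- wide[, c("gene", "tissue1", "tissue2", "tpm1", "tpm2")]
rownames(wide) <- NULL  # drop the row names inherited from the long table

print(wide)
```

Unlike the loop, this should also cope with genes that have more than two rows (you would get tissue3, tpm3, and so on), although the column selection line would then need the extra names.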

I hope it helps you.

Upvotes: 0
