melatonin15

Reputation: 2279

Efficient way of changing all occurrences of a specific value in a column of a data frame in R

I have a huge data frame (28987853 rows) of the form

head(ratRawData)
  ratGene        ratReplicate alignment RNAtype
1     C4b Thymus_M_GSM1328751         2     REG
2    Rpl4 Thymus_M_GSM1328751         4     REG
3    Dntt Thymus_M_GSM1328751         3     DUP
4  Sptbn1 Thymus_M_GSM1328751         2     DUP
5  Ndufb7 Thymus_M_GSM1328751         2     REG
6 Ndufb10 Thymus_M_GSM1328751         2     REV

What I want to do is change every occurrence of DUP in RNAtype to REV. Since this data frame is quite big, I am wondering what an efficient way of doing this would be. Thanks in advance!

Upvotes: 1

Views: 60

Answers (1)

Roman Luštrik

Reputation: 70653

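I timed three approaches on simulated data with the same number of rows as yours: logical indexing with ==, matching with grepl, and relabelling the factor levels directly. Relabelling the levels was the fastest (about 1.2 s elapsed, versus 3.6 s and 5.2 s). Note that it relies on rna being a factor, whose levels sort alphabetically as DUP, REG, REV, so the replacement vector maps DUP to REV and leaves the other two unchanged.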

> set.seed(357)
> rat.raw.data <- data.frame(col1 = sample(letters, 28987853, replace = TRUE),
+                            col2 = sample(1:10, 28987853, replace = TRUE),
+                            col3 = sample(LETTERS, 28987853, replace = TRUE),
+                            rna = sample(c("REG", "DUP", "REV"), 28987853, replace = TRUE),
+                            stringsAsFactors = TRUE)  # needed in R >= 4.0.0 so rna is a factor
> 
> 
> dusty <- rat.raw.data
> system.time({dusty$rna[dusty$rna == "DUP"] <-  "REV"})
   user  system elapsed 
   3.37    0.24    3.64 
> 
> akrun <- rat.raw.data
> system.time({akrun$rna[grepl("DUP", akrun$rna)]<- "REV"})
   user  system elapsed 
   5.06    0.04    5.18 
> 
> roman <- rat.raw.data
> system.time({levels(roman$rna) <- c("REV", "REG", "REV")})
   user  system elapsed 
   1.08    0.13    1.20 
> head(dusty)
  col1 col2 col3 rna
1    c    3    P REV
2    b    7    B REG
3    h    6    T REV
4    f    3    H REV
5    q    6    F REG
6    m    9    F REV
> head(akrun)
  col1 col2 col3 rna
1    c    3    P REV
2    b    7    B REG
3    h    6    T REV
4    f    3    H REV
5    q    6    F REG
6    m    9    F REV
> head(roman)
  col1 col2 col3 rna
1    c    3    P REV
2    b    7    B REG
3    h    6    T REV
4    f    3    H REV
5    q    6    F REG
6    m    9    F REV
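
The levels assignment above is positional, so it depends on knowing the order of the levels. As a minimal sketch (the toy data frame and the variable names relabelled and as_char are made up for illustration, not taken from the question), the same relabelling can be done by name, and a plain character column can be handled with ordinary indexing:

```r
# Toy stand-in for the real data; only the rna column matters here.
toy <- data.frame(rna = c("REG", "DUP", "REV", "DUP", "REG"),
                  stringsAsFactors = TRUE)

# Factor column: rename the "DUP" level by name instead of by position.
# Giving two levels the same label merges them into one.
relabelled <- toy
levels(relabelled$rna)[levels(relabelled$rna) == "DUP"] <- "REV"

# Character column (the default from data.frame() in R >= 4.0.0):
# there are no levels to relabel, so use logical indexing instead.
as_char <- as.character(toy$rna)
as_char[as_char == "DUP"] <- "REV"

print(as.character(relabelled$rna))
# [1] "REG" "REV" "REV" "REV" "REG"
```

Relabelling by name only touches the short vector of levels, which is why the factor approach is cheap regardless of the number of rows. I have not timed this variant on the full-size data.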

Upvotes: 3
