littleworth

Reputation: 5169

How to convert percentage text into numeric using dplyr pipe?

I have the following tibble:

library(tidyverse)
dat <- structure(list(V1 = c("Number of input reads", "Uniquely mapped reads number", 
"Uniquely mapped reads %", "Average mapped length"), V2 = c("26265603", 
"13330431", "50.75%", "47.37")), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -4L))

It looks like this:

  V1                           V2      
  <chr>                        <chr>   
1 Number of input reads        26265603
2 Uniquely mapped reads number 13330431
3 Uniquely mapped reads %      50.75%  
4 Average mapped length        47.37 

What I want to do is convert the V2 column to numeric. The expected final result is this:

  V1                           V2      
  <chr>                        <dbl>   
1 Number of input reads        26265603
2 Uniquely mapped reads number 13330431
3 Uniquely mapped reads %      0.5075 
4 Average mapped length        47.37 

I tried this:

dat %>%
mutate(V2 = case_when(V1 == "Uniquely mapped reads %" ~ as.numeric(sub("%","",V2))/100, 
                        TRUE ~ as.numeric(V2)))

but it gives me a warning:

Warning message:
In eval_tidy(pair$rhs, env = default_env) : NAs introduced by coercion

What's the right way to do it?

Upvotes: 0

Views: 1242

Answers (1)

Ronak Shah

Reputation: 389155

This can be a bit convoluted with pipes, since we want to update only a few rows. In base R, we can first find the rows that have the specific string in V1 and then update only those V2 values.

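# Logical index of the rows whose V2 value is a percentage
inds <- dat$V1 == "Uniquely mapped reads %"
# Strip the "%" sign and rescale to a proportion
dat$V2[inds] <- as.numeric(sub("%", "", dat$V2[inds]))/100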

dat
# A tibble: 4 x 2
#  V1                           V2      
#  <chr>                        <chr>   
#1 Number of input reads        26265603
#2 Uniquely mapped reads number 13330431
#3 Uniquely mapped reads %      0.5075  
#4 Average mapped length        47.37 
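
Note that V2 is still <chr> in the output above: assigning numbers into part of a character vector coerces them back to character. A minimal sketch of the extra step that gets a <dbl> column, assuming a plain data.frame stand-in for the tibble so that it runs in base R alone:

```r
# Rebuild the example data as a plain data frame (assumption: base R only,
# used here in place of the tibble from the question)
dat <- data.frame(
  V1 = c("Number of input reads", "Uniquely mapped reads number",
         "Uniquely mapped reads %", "Average mapped length"),
  V2 = c("26265603", "13330431", "50.75%", "47.37"),
  stringsAsFactors = FALSE
)

# Same update as above: rescale only the percentage row
inds <- dat$V1 == "Uniquely mapped reads %"
dat$V2[inds] <- as.numeric(sub("%", "", dat$V2[inds])) / 100

# V2 is still character at this point; convert the whole column.
# Every remaining string is a plain number, so no NAs are introduced.
dat$V2 <- as.numeric(dat$V2)
```

After the last line, is.numeric(dat$V2) should be TRUE.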

One way to do this using pipes:

library(dplyr)

dat %>%
   # Divide by 100 on the percentage row and by 1 everywhere else
   mutate(V2 = as.numeric(sub("%", "", V2)) /
               (c(1, 100)[(V1 == "Uniquely mapped reads %") + 1]))
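
As for the warning in the question: case_when() evaluates every right-hand side on the whole column, so as.numeric(V2) is also run on "50.75%" and yields an NA there, even though that value is never used in the result. A possible alternative, sketched under the assumption that readr is available (it ships with the tidyverse) and that percentages can be recognised by a trailing "%" in V2 rather than by the label in V1:

```r
library(dplyr)
library(readr)

# Example data from the question
dat <- tibble(
  V1 = c("Number of input reads", "Uniquely mapped reads number",
         "Uniquely mapped reads %", "Average mapped length"),
  V2 = c("26265603", "13330431", "50.75%", "47.37")
)

res <- dat %>%
  # parse_number() drops the non-numeric "%" while parsing, so nothing is
  # coerced to NA; values that ended in "%" are then divided by 100
  mutate(V2 = parse_number(V2) / if_else(endsWith(V2, "%"), 100, 1))
```

Here res is only an illustrative name. Both references to V2 inside mutate() see the original character column, because the new V2 is assigned only after the whole expression has been evaluated.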

Upvotes: 1
