Frank Wang

Reputation: 1620

How to convert a percentage stored as a character string to numeric in R

My data contains values with percent signs (%) that I want to convert to numeric, but converting a percentage stored as a character string fails. For example, I want to convert "10%" into 10%, but

as.numeric("10%")

returns NA. Do you have any ideas?

Upvotes: 48

Views: 78418

Answers (6)

nanselm2

Reputation: 1497

I wanted to convert an entire column and combined the above answers.

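# Strip the percent sign, then convert the result to numeric
pct_to_number <- function(x) {
  x_replace_pct <- sub("%", "", x, fixed = TRUE)  # treat "%" as a literal character
  as.numeric(x_replace_pct)  # last expression, so it is returned visibly
}
df[["ColumnName"]] <- pct_to_number(df[["ColumnName"]])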

Upvotes: 5

Giora Simchoni

Reputation: 3689

If you're a tidyverse user (and actually also if not) there's now a parse_number function in the readr package:

readr::parse_number("10%")

The advantage is that it generalizes to other common string formats, such as:

parse_number("10.5%")
parse_number("$1,234.5")

Upvotes: 27

Ari B. Friedman

Reputation: 72759

Get rid of the extraneous characters first:

topct <- function(x) { as.numeric( sub("\\D*([0-9.]+)\\D*","\\1",x) )/100 }
my.data <- paste(seq(20)/2, "%", sep = "")
topct(my.data)
 [1] 0.005 0.010 0.015 0.020 0.025 0.030 0.035 0.040 0.045 0.050 0.055 0.060 0.065 0.070 0.075 0.080
[17] 0.085 0.090 0.095 0.100

(Thanks to Paul for the example data).

This function now handles leading non-numeric characters and trailing non-numeric characters, and it keeps the decimal point if one is present.
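One caveat with the pattern above: the leading \\D* also consumes a minus sign, so "-5%" comes back as 0.05. If negative percentages can occur in your data, a variant along these lines might help. This is only a sketch; topct_signed is a hypothetical name, not part of the original answer.

```r
# Hypothetical variant of topct() that keeps a leading minus sign.
# Assumes each string contains a single number, optionally signed.
topct_signed <- function(x) {
  # Skip any leading characters that are not digits, dots, or minus signs,
  # capture an optional "-" followed by the number, drop trailing non-digits.
  as.numeric(sub("[^0-9.-]*(-?[0-9.]+)\\D*", "\\1", x)) / 100
}

topct_signed(c("-5%", "12.5%", "$3%"))
```

As with the original function, the result is a proportion (divided by 100), not the bare number.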

Upvotes: 8

Galled

Reputation: 4206

Try with:

> x = "10%"
> as.numeric(substr(x,0,nchar(x)-1))
[1] 10

This also works with decimals:

> x = "10.1232%"
> as.numeric(substr(x,0,nchar(x)-1))
[1] 10.1232

This assumes that the % symbol is always the last character of the string.
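If some values might not end in %, dropping the last character unconditionally would cut off a digit instead. A possible guard, assuming a character vector as input (strip_trailing_pct is a hypothetical name, and endsWith needs R 3.3.0 or later):

```r
# Hypothetical helper: remove the last character only where it is "%".
# Assumes x is a character vector; NA entries are left untouched.
strip_trailing_pct <- function(x) {
  has_pct <- !is.na(x) & endsWith(x, "%")
  x[has_pct] <- substr(x[has_pct], 1, nchar(x[has_pct]) - 1)
  as.numeric(x)
}

strip_trailing_pct(c("10%", "10.1232%", "7", NA))
```

Values without a trailing % are passed to as.numeric unchanged, so "7" stays 7 rather than becoming NA.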

Upvotes: 2

Joshua Ulrich

Reputation: 176688

Remove the "%", convert to numeric, then divide by 100.

x <- c("10%","5%")
as.numeric(sub("%","",x))/100
# [1] 0.10 0.05

Upvotes: 34

Paul Hiemstra

Reputation: 60964

"10%" is by definition not a number, so NA is the correct result. You can convert a character vector containing such values to numeric like this:

percent_vec = paste(1:100, "%", sep = "")
as.numeric(sub("%", "", percent_vec))

This works by using sub to replace the % character with an empty string.
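For reuse, the same idea can be wrapped in a small function. The sketch below is one way to do it; pct_to_numeric and its as_proportion argument are hypothetical names introduced here, and it assumes at most one % per string.

```r
# Hypothetical wrapper around the sub() approach.
# as_proportion = TRUE divides by 100, so "10%" becomes 0.1 instead of 10.
pct_to_numeric <- function(x, as_proportion = FALSE) {
  # fixed = TRUE treats "%" as a literal character, not a regex
  cleaned <- trimws(sub("%", "", x, fixed = TRUE))
  # Entries that are still not numbers become NA; the warning is suppressed
  value <- suppressWarnings(as.numeric(cleaned))
  if (as_proportion) value / 100 else value
}

pct_to_numeric(c("10%", "5.5 %", "n/a"))
pct_to_numeric(c("10%", "5%"), as_proportion = TRUE)
```

Suppressing the coercion warning is a design choice for the sketch; drop suppressWarnings if you would rather be told about unparseable entries.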

Upvotes: 69
