HARJOT SINGH PARMAR

Reputation: 99

Removing a particular character from a column in R

I have a table called LOAN containing a column named RATE in which the observations are given as percentages, for example 14.49%. How can I edit the values in RATE so that the % is removed from every entry and I can use the plot function on the column? I tried using strsplit:

strsplit(LOAN$RATE,"%")

but got the error "non-character argument".

Upvotes: 9

Views: 76855

Answers (3)

odunayo12

Reputation: 585

This can be achieved using the mutate verb from dplyr (loaded as part of the tidyverse), which in my opinion is more readable. To illustrate, I create a dataset called LOAN with a RATE column that mimics the problem above.

library(tidyverse)
LOAN <- data.frame("SN" = 1:4, "Age" = c(21,47,68,33), 
                   "Name" = c("John", "Dora", "Ali", "Marvin"),
                   "RATE" = c('16%', "24.5%", "27.81%", "22.11%"), 
                   stringsAsFactors = FALSE)
head(LOAN)
  SN Age   Name   RATE
1  1  21   John    16%
2  2  47   Dora  24.5%
3  3  68    Ali 27.81%
4  4  33 Marvin 22.11%

In what follows, mutate alters the column content, gsub performs the substitution (of % with "") and as.numeric() converts the RATE column to numeric values, which keeps the data-cleaning flow easy to follow.

LOAN <- LOAN %>% mutate(RATE = as.numeric(gsub("%", "", RATE)))
head(LOAN)
  SN Age   Name  RATE
1  1  21   John 16.00
2  2  47   Dora 24.50
3  3  68    Ali 27.81
4  4  33 Marvin 22.11

Upvotes: 5

Mohamed Refa't

Reputation: 1

Try:

LOAN$RATE <- sapply(LOAN$RATE, function(x) gsub("%", "", x))

Upvotes: -1

IRTFM

Reputation: 263331

Items that appear to be character when printed but which R treats otherwise are generally objects of class factor. I'm also guessing that you are not going to be happy with the list output that strsplit will return. Try:

gsub( "%", "", as.character(LOAN$RATE) )
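
Since the goal is to plot the column, the cleaned strings still need to be converted to numbers. A minimal sketch, assuming RATE is stored as a factor (or character) column as in the question; the three sample values are made up for illustration:

```r
# Hypothetical stand-in for the asker's LOAN table
LOAN <- data.frame(RATE = factor(c("14.49%", "9.5%", "21%")))

# Make the factor-to-string step explicit, then drop the literal "%"
rate_chr <- gsub("%", "", as.character(LOAN$RATE), fixed = TRUE)

# Convert the cleaned strings to numeric so plot() can use them
LOAN$RATE <- as.numeric(rate_chr)

str(LOAN$RATE)
```

fixed = TRUE treats "%" as a plain string rather than a regular expression; it makes no difference for this pattern, but it states the intent.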

Factors that appear numeric can be a source of confusion as well:

> factor("14.9%")
[1] 14.9%
Levels: 14.9%
> as.character(factor("14.9%"))
[1] "14.9%"
> gsub("%", "", as.character(factor("14.9%")) )
[1] "14.9"

This is especially confusing since print.data.frame removes the quotes:

> data.frame(z=factor("14.9%"), zz=factor(14.9))
      z   zz
1 14.9% 14.9
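
One way to see what R actually holds is to check the column classes rather than the printed output. A small sketch using the same toy data frame (the variable names df, classes, codes and values are only for illustration):

```r
df <- data.frame(z = factor("14.9%"), zz = factor(14.9))

# Both columns are factors, even though zz prints like a number
classes <- sapply(df, class)
print(classes)

# as.numeric() on a factor returns the internal integer codes ...
codes <- as.numeric(df$zz)

# ... so go through as.character() to recover the printed value
values <- as.numeric(as.character(df$zz))

print(codes)
print(values)
```

This is why the as.character() step matters before any numeric conversion of a factor column.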

Upvotes: 11
