using gsub with a column on a dataframe

Question

I had a little problem with a column in my dataframe.

dat$Dx1
#[1]  F20.0 F13.2 F31.3 F33.1 F06.2 F34.0 F41.2 F31.7 F32.2 F41.0 F23.0
#[12] F33.1 F20.0 F41.0 F41.2 F34.1 F20.0 F20.0 F60.3 F32.1 F06.2 F20.0
#[23] F06.3 F41.2 F41.2 F20.0 F33.2 F42.2 F32.1 F20.0 F20.0 F20.0 F20.0

This is an example of the column. I want to erase all the "decimals" (they are not really decimals, because the values are characters) using gsub(), but when I write and run the code for each of ".1", ".2", and so on, like

dat$Dx1 <- gsub(".1","",dat$Dx1)

It makes a mess like:

 #[1] "F"   "3"   "F"   "3"   "6"   "4"   "F"   "F"   "F"   "F"   "3"  
 #[12] "3"   "F"   "F"   "F"   "4"   "F"   "F"   "F"   "F"   "6"   "F"  
 #[23] "6"   "F"   "F"   "F"   "3"   "F"   "F"   "F"   "F"   "F"   "F"

I just want to strip the suffix from every element, e.g. F20.0 >> F20.

Maybe I'm writing bad code. Can someone help me, please?

Rich Scriven · Accepted Answer

You will need to escape the . with either \\. or [.], because an unescaped . in a regular expression matches any single character. See ?regex. So the call becomes

sub("\\..*", "", dat$Dx1)

For example,

x <- c("F20.0", "F13.2", "F31.3", "F33.1")
sub("\\..*", "", x)
# [1] "F20" "F13" "F31" "F33"
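
To see why the unescaped pattern misbehaves, here is a small sketch that reuses the x vector from the example. The names unescaped, escaped, and bracketed are illustrative only and do not come from the question.

```r
x <- c("F20.0", "F13.2", "F31.3", "F33.1")

# Unescaped: "." matches ANY character, so ".1" also removes "F1" and "31"
unescaped <- gsub(".1", "", x)
unescaped
# [1] "F20.0" "3.2"   "F.3"   "F33"

# Escaped: "\\." matches a literal dot; ".*" then takes the rest of the string
escaped <- sub("\\..*", "", x)

# Character class: "[.]" is an equivalent way to match a literal dot
bracketed <- sub("[.].*", "", x)

escaped
# [1] "F20" "F13" "F31" "F33"
identical(escaped, bracketed)
# [1] TRUE
```

Running the unescaped pattern once for each of ".0", ".1", ".2", and so on compounds this effect, which is presumably how the column ended up as single characters.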

We can use sub() instead of gsub() since we are always matching the first (and only) occurrence of the dot, and .* then consumes the rest of the string.
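
As a rough check on a data frame column, the sketch below builds a hypothetical stand-in for dat (the real data is not shown in full) and assumes Dx1 might be stored as a factor, since the printout in the question has no quotes. The names with_sub and with_gsub are made up for the illustration.

```r
# Hypothetical stand-in for the asker's data; Dx1 is assumed to be a factor
dat <- data.frame(Dx1 = factor(c("F20.0", "F13.2", "F31.3", "F20.0")))

# Both functions accept a factor and return a character vector
with_sub  <- sub("\\..*", "", dat$Dx1)
with_gsub <- gsub("\\..*", "", dat$Dx1)

# With this pattern the two give the same result
identical(with_sub, with_gsub)
# [1] TRUE

dat$Dx1 <- with_sub   # the column is now a character vector
dat$Dx1
# [1] "F20" "F13" "F31" "F20"
```

If the column needs to remain a factor, wrapping the result in factor() afterwards would be one option.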
