rnorouzian

Reputation: 7517

Replace numeric values with a given replacement string in a data.frame in R

I have various data.frames, each of which can contain long numbers somewhere in its values. A and B are two real examples.

How can I replace every numeric element in column ct with a given replacement.name?

Reproducible R code and the desired output are below.

A <- data.frame(ct = c("C,0.839662447257384 - T,0.839662447257384", "No,C,0.44462447257384 - Yes,T,0444462447257384"))

B <- data.frame(ct = "0.822125181950509,C,Female - 0.822125181950509,T,Female")

replacement.name = "year"  # Put this in place of any numeric value in column `ct`


A.desired <- data.frame(ct = c("C,year - T,year", "No,C,year - Yes,T,year"))

B.desired <- data.frame(ct = "year,C,Female - year,T,Female")

Upvotes: 0

Views: 137

Answers (1)

akrun

Reputation: 887118

We can use gsub to match every run of digits and . characters and replace it with 'year':

A$ct <- gsub("[0-9.]+", "year", A$ct)
A$ct
#[1] "C,year - T,year"        "No,C,year - Yes,T,year"

B$ct <- gsub("[0-9.]+", "year", B$ct)
B$ct
#[1] "year,C,Female - year,T,Female"

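The solution above has a bug: because the character class includes ., a . that appears anywhere else in the string is replaced as well. To avoid that, require digits on both sides of a literal dot (note that this matches decimal numbers only):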

gsub("[0-9]+\\.[0-9]+", "year", B$ct)
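To make the difference between the two patterns concrete, here is a small sketch. The input string `x` is hypothetical (it is not from the question); it simply contains a period that is not part of a number.

```r
# Hypothetical input: decimal numbers plus a period that is not part of a number
x <- "0.82,C,St. Mary - 0.82,T,St. Mary"

# Character-class pattern: any run of digits and/or dots is replaced,
# so the lone "." after "St" is replaced too
loose <- gsub("[0-9.]+", "year", x)
loose
#[1] "year,C,Styear Mary - year,T,Styear Mary"

# Stricter pattern: digits, a literal dot, then digits
strict <- gsub("[0-9]+\\.[0-9]+", "year", x)
strict
#[1] "year,C,St. Mary - year,T,St. Mary"
```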

If this needs to be applied to multiple datasets, we can wrap it in a function:

f1 <- function(dat, colnm, replstr) {
  # Replace every decimal number in column `colnm` with `replstr`
  dat[[colnm]] <- gsub("[0-9]+\\.[0-9]+", replstr, dat[[colnm]])
  dat
}

f1(A, 'ct', 'year')
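One caveat: the second row of A in the question contains 0444462447257384, which has no decimal point, so a pattern that requires a dot leaves it unchanged. If such values should also become 'year', one possible variant makes the fractional part optional. The name `f2` below is only an illustration, and this is a sketch that assumes every run of digits in the column is meant to be replaced; it would also replace digits embedded in labels such as "Grade1".

```r
# Data from the question (stringsAsFactors = FALSE keeps `ct` as character
# on R versions before 4.0)
A <- data.frame(ct = c("C,0.839662447257384 - T,0.839662447257384",
                       "No,C,0.44462447257384 - Yes,T,0444462447257384"),
                stringsAsFactors = FALSE)
B <- data.frame(ct = "0.822125181950509,C,Female - 0.822125181950509,T,Female",
                stringsAsFactors = FALSE)

# Hypothetical variant of f1: digits, optionally followed by a dot and more digits
f2 <- function(dat, colnm, replstr) {
  dat[[colnm]] <- gsub("[0-9]+(\\.[0-9]+)?", replstr, dat[[colnm]])
  dat
}

f2(A, "ct", "year")$ct
#[1] "C,year - T,year"        "No,C,year - Yes,T,year"

f2(B, "ct", "year")$ct
#[1] "year,C,Female - year,T,Female"
```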

Upvotes: 1
