Replace specific characters in a variable in data frame in R

Question

I want to replace every ",", "-", ")", "(" and space with "." in the variable DMA.NAME of the example data frame. I tried the approaches from these three posts, but all of them failed:

Replacing column values in data frame, not included in list

R replace all particular values in a data frame

Replace characters from a column of a data frame R

Approach 1

> shouldbecomeperiod <- c$DMA.NAME %in% c("-", ",", " ", "(", ")")
c$DMA.NAME[shouldbecomeperiod] <- "."

Approach 2

> removetext <- c("-", ",", " ", "(", ")")
c$DMA.NAME <- gsub(removetext, ".", c$DMA.NAME)
c$DMA.NAME <- gsub(removetext, ".", c$DMA.NAME, fixed = TRUE)

Warning message:
In gsub(removetext, ".", c$DMA.NAME) :
  argument 'pattern' has length > 1 and only the first element will be used

Approach 3

> c[c == c(" ", ",", "(", ")", "-")] <- "."

Sample data frame

> df
DMA.CODE                  DATE                   DMA.NAME       count
111         22 8/14/2014 12:00:00 AM               Columbus, OH     1
112         23 7/15/2014 12:00:00 AM Orlando-Daytona Bch-Melbrn     1
79          18 7/30/2014 12:00:00 AM        Boston (Manchester)     1
99          22 8/20/2014 12:00:00 AM               Columbus, OH     1
112.1       23 7/15/2014 12:00:00 AM Orlando-Daytona Bch-Melbrn     1
208         27 7/31/2014 12:00:00 AM       Minneapolis-St. Paul     1

I know the problem: gsub uses only the first element of pattern. The other two approaches check whether an entire value equals one of the characters, instead of searching within each value for those characters.
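To make that diagnosis concrete, here is a minimal sketch on a hypothetical two-value vector (x and chars are illustrative names, not taken from the data frame above). It passes chars[1] explicitly, which is what the warning says gsub falls back to when given a longer pattern.

```r
# Hypothetical sample values, standing in for DMA.NAME
x <- c("Columbus, OH", "Orlando-Daytona Bch-Melbrn")
chars <- c("-", ",", " ", "(", ")")

# %in% compares each whole string against the set, so no element matches
whole_match <- x %in% chars

# Using only the first element ("-") replaces hyphens and nothing else
first_only <- gsub(chars[1], ".", x, fixed = TRUE)

print(whole_match)
print(first_only)
```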

nrussell · Accepted Answer

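You can use the POSIX character classes [:punct:] and [:space:] inside a bracket expression ([...]). Note that [:punct:] matches every punctuation character, not only the ones you listed, and the trailing + collapses each run of matches into a single period: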

df <- data.frame(
  DMA.NAME = c(
    "Columbus, OH",
    "Orlando-Daytona Bch-Melbrn",
    "Boston (Manchester)",
    "Columbus, OH",
    "Orlando-Daytona Bch-Melbrn",
    "Minneapolis-St. Paul"),
  stringsAsFactors = FALSE)
##
> gsub("[[:punct:][:space:]]+", ".", df$DMA.NAME)
[1] "Columbus.OH"                "Orlando.Daytona.Bch.Melbrn" "Boston.Manchester."         "Columbus.OH"               
[5] "Orlando.Daytona.Bch.Melbrn" "Minneapolis.St.Paul"
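If only the five characters from the question should be replaced, one for one, a bracket expression listing exactly those characters may be closer to what was asked. The following is a sketch under that assumption, using a small stand-in data frame. The hyphen is placed first inside the brackets so that it is read literally rather than as a range.

```r
# Stand-in data frame with the distinct DMA.NAME values from the question
df <- data.frame(
  DMA.NAME = c(
    "Columbus, OH",
    "Orlando-Daytona Bch-Melbrn",
    "Boston (Manchester)",
    "Minneapolis-St. Paul"),
  stringsAsFactors = FALSE)

# Replace each "-", ",", "(", ")" or space with one "."; no "+", so runs are
# not collapsed, and existing periods are left alone
df$DMA.NAME <- gsub("[-,() ]", ".", df$DMA.NAME)

print(df$DMA.NAME)
```

With this pattern "Columbus, OH" becomes "Columbus..OH", because the comma and the space are each replaced. Appending + to the bracket expression would collapse such runs into a single period.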
