kputschko

Reputation: 816

How to standardize strings in R

I have a series of files from different time periods. The files have similar column names, but the values are formatted differently.

a <- c("123-OldStyle", "123.Old Style", "(123) Old Style")

I can do this manually with dplyr::recode, I suppose, but I can't get recode to work with a named vector of replacements.

dplyr::recode(a, "123-OldStyle" = "123 New Style")

vec_recode <- c("123-OldStyle" = "123 New Style", "123.Old Style" = "123 New Style", "(123) Old Style" = "123 New Style")

I could do this with a long ifelse, but I'd rather not do this all manually.

Upvotes: 1

Views: 409

Answers (2)

Duck

Reputation: 39613

You could also clean the string in this way:

#Code
gsub('OldStyle','Old Style',gsub('[[:punct:] ]+',' ',a))

Output:

[1] "123 Old Style"  "123 Old Style"  " 123 Old Style"
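Note that the third element above keeps a leading space, because the opening parenthesis is itself replaced by a space. A minimal sketch of one way to deal with that, assuming base R's trimws is acceptable here (the name cleaned is only for illustration):

```r
a <- c("123-OldStyle", "123.Old Style", "(123) Old Style")

# Collapse each run of punctuation/spaces into one space,
# then strip leading and trailing whitespace
cleaned <- trimws(gsub("[[:punct:] ]+", " ", a))

# Insert the missing space in "OldStyle"
cleaned <- gsub("OldStyle", "Old Style", cleaned, fixed = TRUE)

cleaned
#[1] "123 Old Style" "123 Old Style" "123 Old Style"
```

Trimming before the second gsub is not essential; the order only matters if the literal replacement could itself add whitespace at the ends.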

Upvotes: 2

akrun

Reputation: 887831

We can splice the named vector into recode with !!!, so that each name/value pair is passed as a separate old = new argument:

dplyr::recode(a, !!! vec_recode)

Output:

#[1] "123 New Style" "123 New Style" "123 New Style"

Also, in base R, since vec_recode is a named vector, we can index it with [ using the values of a as names:

unname(vec_recode[a])
#[1] "123 New Style" "123 New Style" "123 New Style"
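One caveat with the base R lookup: a value that has no matching name in vec_recode comes back as NA. A minimal sketch of a fallback that keeps the original value in that case (the vector b and the value "456 Other" are hypothetical, added only to show the behaviour):

```r
vec_recode <- c("123-OldStyle"    = "123 New Style",
                "123.Old Style"   = "123 New Style",
                "(123) Old Style" = "123 New Style")

# Hypothetical input containing one value that is not in vec_recode
b <- c("123-OldStyle", "456 Other")

out <- unname(vec_recode[b])   # unmatched names give NA
miss <- is.na(out)
out[miss] <- b[miss]           # fall back to the original value

out
#[1] "123 New Style" "456 Other"
```

As far as I know, dplyr::recode already leaves unmatched values unchanged when the input and the replacements are both character, so this extra step should only be needed for the base R version.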

Upvotes: 2
