Brad Cannell

Reputation: 3200

setNames suffix to prefix

I have a dataset that includes a bunch of variables with various suffixes that I want to make into prefixes. The dataset also includes some variables without any suffixes. Something like:

df <- data.frame(
  home_loc   = rnorm(5),
  work_loc   = rnorm(5),
  x1         = rnorm(5),
  walk_act   = rnorm(5),
  bike_act   = rnorm(5),
  x2         = rnorm(5),
  happy_yest = rnorm(5),
  sad_yest   = rnorm(5)
)

I was able to come up with the following solution:

suff_to_pre <- function(x, suffix, prefix) {
  for (i in seq_along(names(x))) {
    if (grepl(suffix, names(x)[i])) {
      names(x)[i] <- sub(suffix, "", names(x)[i])
      names(x)[i] <- paste0(prefix, names(x)[i])
    }
  }
  names(x)
}

names(df) <- suff_to_pre(df, suffix = "_loc", prefix = "loc_")
names(df) <- suff_to_pre(df, suffix = "_act", prefix = "act_")
names(df) <- suff_to_pre(df, suffix = "_yest", prefix = "yest_")

names(df)
[1] "loc_home" "loc_work" "x1" "act_walk" "act_bike" "x2" "yest_happy"
[8] "yest_sad"

But I’m not very satisfied with it. Specifically, I would really like a way to get the same result using dplyr. I found this and this, which got me to:

library(dplyr)

a <- df %>%
  select(ends_with("_loc")) %>%
  setNames(sub("_loc", "", names(.))) %>%
  setNames(paste0("loc_", names(.)))

b <- df %>%
  select(ends_with("_act")) %>%
  setNames(sub("_act", "", names(.))) %>%
  setNames(paste0("act_", names(.)))

c <- df %>%
  select(ends_with("_yest")) %>%
  setNames(sub("_yest", "", names(.))) %>%
  setNames(paste0("yest_", names(.)))

df <- cbind(
  select(df, x1, x2), a, b, c
)

This is obviously not ideal. I was hoping someone could suggest a more elegant solution using dplyr.

Edit
@docendo discimus and @zx8754 gave really helpful answers, but I should have been more explicit. I also have variables whose names include underscores but do not end in a suffix that I want to turn into a prefix.

For example (see free_time):

df <- data.frame(
      home_loc   = rnorm(5),
      work_loc   = rnorm(5),
      x_1        = rnorm(5),
      walk_act   = rnorm(5),
      bike_act   = rnorm(5),
      x_2        = rnorm(5),
      happy_yest = rnorm(5),
      sad_yest   = rnorm(5),
      free_time  = rnorm(5)
)

Upvotes: 2

Views: 935

Answers (2)

talat

Reputation: 70286

A single sub call should be enough:

sub("^(.*)_(.*)$", "\\2_\\1", names(df))
#[1] "loc_home"   "loc_work"   "x1"         "act_walk"   "act_bike"   "x2"         "yest_happy" "yest_sad" 

And of course to change the names, assign it back:

names(df) <- sub("^(.*)_(.*)$", "\\2_\\1", names(df))

And in a dplyr-pipe you could use setNames:

df %>% setNames(sub("^(.*)_(.*)$", "\\2_\\1", names(.)))

The pattern "^(.*)_(.*)$" creates two capturing groups, one before the underscore and one after it (the first group is greedy, so a name with several underscores is split at the last one). In the replacement "\\2_\\1" we tell R to emit the second group first, then an underscore, and finally the first group, which turns suffixes into prefixes. If an entry does not match the pattern, i.e. it contains no underscore, it is left unchanged.
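To make the capture-group behaviour concrete, here is a minimal sketch in base R. The vector nms is made up for illustration, and "a_b_c" is a hypothetical name with two underscores that does not appear in the question's data.

```r
# Illustrative names only; "a_b_c" is a made-up case with two underscores.
nms <- c("home_loc", "x1", "a_b_c")

# Group 1 is greedy, so it takes everything up to the LAST underscore;
# group 2 takes what follows it. Names with no underscore do not match.
swapped <- sub("^(.*)_(.*)$", "\\2_\\1", nms)
print(swapped)
```

A name without an underscore comes back untouched, while a name with several underscores has only its last piece moved to the front.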

Update after Question update:

For the slightly more complicated case, you can do the following:

1) store all suffixes that need to be changed to prefixes:

suf <- c("act", "loc", "yest")

2) create a regular expression pattern based on the suffixes:

pat <- paste0("^(.*)_(", paste(suf, collapse = "|"), ")$")
pat
#[1] "^(.*)_(act|loc|yest)$"

3) proceed as before:

sub(pat, "\\2_\\1", names(df))
# [1] "loc_home"   "loc_work"   "x_1"        "act_walk"   "act_bike"   "x_2"        "yest_happy" "yest_sad"   "free_time" 

or

df %>% setNames(sub(pat, "\\2_\\1", names(.)))
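The three steps above could also be wrapped in a small function. This is only a sketch: suffix_to_prefix is a hypothetical helper name, not something from the answer, and it assumes the suffixes contain no regular-expression metacharacters.

```r
# Hypothetical helper (not from the answer): moves any of the listed
# suffixes to the front of a name and leaves every other name unchanged.
# Assumes the suffixes contain no regex metacharacters.
suffix_to_prefix <- function(nms, suffixes) {
  pat <- paste0("^(.*)_(", paste(suffixes, collapse = "|"), ")$")
  sub(pat, "\\2_\\1", nms)
}

# Illustrative subset of the names from the updated question.
nms <- c("home_loc", "x_1", "walk_act", "sad_yest", "free_time")
renamed <- suffix_to_prefix(nms, suffixes = c("act", "loc", "yest"))
print(renamed)
```

It would be applied with names(df) <- suffix_to_prefix(names(df), suf). With dplyr 1.0.0 or later, the same function should also work inside a pipe as rename_with(suffix_to_prefix, suffixes = suf), since rename_with() passes extra arguments on to the renaming function.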

Upvotes: 4

akrun

Reputation: 887231

We can use str_replace from stringr. The idea is to capture the patterns as groups, i.e. within (...). The first capture group ([^_]*) matches zero or more characters that are not _; it is followed by a literal _ and then by another capture group ([^_]*). In the replacement we just switch the backreferences.

 library(stringr)
 names(df) <- str_replace(names(df), "^([^_]*)_([^_]*)$", "\\2_\\1")
 names(df)
 #[1] "loc_home"   "loc_work"   "x1"         "act_walk" 
 #[5] "act_bike"   "x2"         "yest_happy" "yest_sad"  
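As a quick check of what this pattern does and does not match, here is a small sketch. It uses base R's sub() rather than str_replace so that it needs no packages (the pattern itself is the one from the answer), and the names in nms are illustrative, with "a_b_c" being a made-up case.

```r
# Illustrative names; "a_b_c" is hypothetical, the others follow the question.
nms <- c("home_loc", "x_1", "free_time", "a_b_c")

# [^_]* cannot cross an underscore, so only names with exactly one
# underscore match; every such name is swapped, whatever its ending.
res <- sub("^([^_]*)_([^_]*)$", "\\2_\\1", nms)
print(res)
```

Because the pattern swaps every name with a single underscore, names such as x_1 and free_time from the updated question would be changed too, so a suffix list is needed if those should stay as they are.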

If we need to use this with pipes:

library(magrittr)
df %<>%
    setNames(str_replace(names(.), "^([^_]*)_([^_]*)$", "\\2_\\1"))

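Or without using any regex:

# split on a literal "_", reverse the pieces, and glue them back together;
# vapply always returns a character vector, even if every name splits in two
vapply(strsplit(names(df), "_", fixed = TRUE),
       function(parts) paste(rev(parts), collapse = "_"),
       character(1))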

Upvotes: 1
