Reputation: 3568
In this SO post the accepted answer shows how to remove a prefix from a subset of column names. I will reproduce the toy data and solution and then get to my issue. Note that I have altered the toy data by adding a suffix (_end) to two of the variables.
df <- data.frame(ATH_V1 = rnorm(10), ATH_V2_end = rnorm(10), ATH_V3_end = rnorm(10), ATH_V4 = rnorm(10), ATH_V5 = rnorm(10), ATH_V6 = rnorm(10), ATH_V7 = rnorm(10))
df
# ATH_V1 ATH_V2_end ATH_V3_end ATH_V4 ATH_V5 ATH_V6 ATH_V7
# 1 -1.5520380 1.16782520 -0.3628090 1.5238728 -1.1660806 -1.01416226 -0.95163564
# 2 0.6270134 1.63810443 0.2199733 -0.6175186 -1.8909463 -0.23913125 -0.70650296
# 3 -0.7462879 0.08504734 0.6506818 -0.5436457 1.3369322 1.69883194 -1.07623124
# 4 0.3196569 0.95782069 -0.3454795 -1.7485607 2.3896003 1.24958489 -0.73316675
# 5 -0.8820414 -2.01739089 -0.5881156 1.2725712 1.4251221 0.56213069 -0.47188011
# 6 -0.5534390 1.48974625 -0.2532402 -1.2333677 1.6690452 -0.48178503 0.30727117
# 7 -0.4637729 -1.13762829 1.3072153 1.0082090 -1.7958189 -1.37604307 -0.08900913
# 8 -0.3878013 -1.09693619 -0.9022672 0.1809460 -1.0303186 0.54576930 -0.64634653
# 9 -0.9553941 0.91495814 -0.2993733 -0.5860527 -0.5623538 -0.24521585 0.21297231
# 10 2.2891475 0.05568124 -0.1718192 0.4249103 2.6009601 0.06357305 0.47794076
I would like to remove the ATH_ prefix ONLY from the columns that end with _end.
Now the solution in the original post proposed the following code, where we specify the column names we want to operate on in a vector within rename_at and then remove the ATH_ prefix via the str_remove function, like so:
df %>% rename_at(c("ATH_V2_end", "ATH_V3_end"), ~ .x %>% str_remove("^ATH_"))
# ATH_V1 V2_end V3_end ATH_V4 ATH_V5 ATH_V6 ATH_V7
# 1 1.14822123 -0.6285561 0.52458507 -0.63906454 1.1401342 -1.6559726 0.41732258
# 2 0.07519307 2.0090135 0.13440368 1.24337727 -0.2906335 -0.1349698 1.45647898
# 3 -0.87465492 -1.8766134 -0.17119197 -1.22701678 -0.7603659 0.1015543 -1.06211069
# 4 1.01402581 -0.4744169 0.78326842 -0.02910686 0.1548202 1.0042147 -0.23739832
# 5 1.00613252 -1.5701097 1.64415870 0.86733910 0.1558727 0.3011537 0.05700506
# 6 -1.01416351 -1.7687648 -0.13999833 -1.01482747 -0.5732621 -0.2504362 2.20762232
# 7 1.00861721 0.7494679 0.08853307 1.46402775 -0.1153655 0.8427913 -1.16114455
# 8 0.28117809 -0.6669487 -0.50816389 -0.12875270 0.7798111 -0.3937148 -1.30894602
# 9 -0.23092640 2.8516271 -1.36959691 -0.39303227 1.9862182 1.2378769 -1.66039502
# 10 0.65034202 0.9009923 0.58264859 0.50931251 1.7284268 1.8420746 -0.71894637
However, the dplyr help states that rename_at has been superseded by rename_with, and that you can use the powerful select helper functions to choose a subset of columns.
So I would like to remove the ATH_ prefix ONLY from the columns that end with _end, using the ends_with() function within rename_with() and tidyverse grammar.
I tried
df %>%
  select(ends_with("_end")) %>%
  rename_with(str_remove(string = ~.x,
                         pattern = "^ATH_"))
and
df %>%
  rename_with(cols = ends_with("_end"),
              .fn = str_remove(string = ~.x,
                               pattern = "^ATH_"))
And got the same error
Error in `rename_with()`:
! Can't convert `.fn`, a character vector, to a function.
Any help much appreciated
Reputation: 35554
You put the ~ symbol in the wrong place. It has to come before the call to str_remove, not inside it:
df %>%
  rename_with(.cols = ends_with("_end"),
              .fn = ~ str_remove(string = .x, pattern = "^ATH_"))
ATH_V1 V2_end V3_end ATH_V4 ATH_V5 ATH_V6 ATH_V7
1 1.50743299 -0.445307241 0.8299688 0.17539549 -0.1327284 -0.3396151 0.51307888
2 -1.41938708 0.778638127 -0.2813838 -0.32856970 0.1652872 -0.3049578 0.94609307
3 0.67968358 -1.424279034 0.4743970 0.07742006 0.1302074 0.2824700 -0.62150878
4 1.37265457 0.626442526 -0.9043668 -1.26182381 -2.0965678 1.5024311 -0.13721899
5 1.56945505 -0.808444575 -0.6629072 -1.05412193 2.2763880 -2.0970344 -1.67471537
6 -1.33771537 1.610411569 0.3740234 1.08666291 0.4914622 0.2749874 3.37133643
7 -0.02463483 -0.008389356 0.7068729 -0.03796850 0.3389535 0.9763993 -0.34287204
8 0.31237309 0.011720063 0.1572582 -0.17382867 0.3284980 0.2716920 -0.07771273
9 -1.20628787 -0.654695991 -0.3015155 0.32320577 2.1091207 -0.2484013 -1.46188370
10 -0.56686265 -0.279659749 0.1913190 -1.58601761 -0.3031979 -1.2062704 -0.26730244
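To illustrate why the position of the tilde matters, here is a minimal sketch. It assumes that rename_with converts a formula into a function roughly the way rlang::as_function does; strip_prefix is a made-up name for illustration. With the tilde inside the call, str_remove runs immediately and its result, rather than a function, is handed to .fn, which is consistent with the error message in the question.

```r
library(stringr)

# rlang is installed as a dependency of dplyr.
# as_function() turns a one-sided formula into a function whose first argument is .x
strip_prefix <- rlang::as_function(~ str_remove(.x, "^ATH_"))

# The formula has become something callable ...
is.function(strip_prefix)

# ... and it strips the prefix from whatever names it is given
strip_prefix(c("ATH_V2_end", "ATH_V3_end"))
# [1] "V2_end" "V3_end"
```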
A more concise expression is
df %>%
  rename_with(~ str_remove(.x, "^ATH_"), ends_with("_end"))
and even
df %>%
  rename_with(str_remove, ends_with("_end"), "^ATH_")
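As a quick sanity check that the three spellings are interchangeable, the sketch below runs each one on a small stand-in data frame (integer columns instead of rnorm, since only the names matter here) and compares the resulting names. The object names res_named, res_formula and res_bare are just illustrative.

```r
library(dplyr)
library(stringr)

# Small stand-in for the toy data; the values are irrelevant, only the names matter
df <- data.frame(ATH_V1 = 1:3, ATH_V2_end = 4:6, ATH_V3_end = 7:9, ATH_V4 = 10:12)

# 1. Named arguments with a formula
res_named <- df %>%
  rename_with(.cols = ends_with("_end"),
              .fn = ~ str_remove(string = .x, pattern = "^ATH_"))

# 2. Positional arguments with a formula
res_formula <- df %>%
  rename_with(~ str_remove(.x, "^ATH_"), ends_with("_end"))

# 3. Bare function; "^ATH_" is passed on to str_remove() through ...
res_bare <- df %>%
  rename_with(str_remove, ends_with("_end"), "^ATH_")

names(res_named)
# [1] "ATH_V1" "V2_end" "V3_end" "ATH_V4"

identical(names(res_named), names(res_formula))
identical(names(res_formula), names(res_bare))
```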
Reputation: 18714
If you use select to pick the columns, only those columns are kept; all the other columns will no longer be a part of the data frame. You're on the right track, though.
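A small sketch of what select does here, using a cut-down stand-in for the toy data (the object name only_end is just for illustration):

```r
library(dplyr)

# Cut-down stand-in for the toy data
df <- data.frame(ATH_V1 = 1:3, ATH_V2_end = 4:6, ATH_V3_end = 7:9, ATH_V4 = 10:12)

# select() keeps only the matching columns; everything else is gone downstream
only_end <- df %>% select(ends_with("_end"))

names(only_end)
# [1] "ATH_V2_end" "ATH_V3_end"

ncol(df)        # 4 columns in the original
ncol(only_end)  # 2 columns after select()
```

So even once the renaming itself works, piping through select first would leave you with just the two _end columns.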
If you don't use the tilde with .x to represent the dynamic field name, you have to write out function() explicitly.
For example, you can use the tilde, like this:
rename_with(df, .cols = ends_with("_end"),
            ~ gsub("^ATH_", "", .x))
Or you can designate a variable name of your choice, instead of .x, and use function(), like this:
rename_with(df, .cols = ends_with("_end"),
            .fn = function(frenchFries) {
              gsub("^ATH_", "", frenchFries)
            })
You can use names() to test your work before you change the object. The names() function wasn't really intended for piping, but with a bit of finesse, it does the job.
rename_with(df, .cols = ends_with("_end"),
            .fn = function(frenchFries) {
              gsub("^ATH_", "", frenchFries)
            }) %>% {names(.)}
# [1] "ATH_V1" "V2_end" "V3_end" "ATH_V4" "ATH_V5" "ATH_V6" "ATH_V7"
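For what it's worth, a plain names() at the end of the pipe seems to give the same result as the braces version, at least with the magrittr pipe that dplyr re-exports. A minimal sketch on a cut-down stand-in for the toy data (via_braces and via_plain are illustrative names):

```r
library(dplyr)

# Cut-down stand-in for the toy data
df <- data.frame(ATH_V1 = 1:3, ATH_V2_end = 4:6, ATH_V3_end = 7:9, ATH_V4 = 10:12)

# Braces version, as in the example above
via_braces <- rename_with(df, .cols = ends_with("_end"),
                          ~ gsub("^ATH_", "", .x)) %>% {names(.)}

# Plain call at the end of the pipe
via_plain <- rename_with(df, .cols = ends_with("_end"),
                         ~ gsub("^ATH_", "", .x)) %>% names()

via_plain
# [1] "ATH_V1" "V2_end" "V3_end" "ATH_V4"

identical(via_braces, via_plain)
```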
In R, very few packages modify objects in place, so you have to assign the result back to an object to actually change it.
df <- rename_with(df, .cols = ends_with("_end"),
                  ~ gsub("^ATH_", "", .x))
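A short sketch of the difference, on a two-column stand-in for the toy data (renamed, before and after are illustrative names):

```r
library(dplyr)

# Two-column stand-in for the toy data
df <- data.frame(ATH_V1 = 1:3, ATH_V2_end = 4:6)

# Without assigning back to df, rename_with() returns a modified copy
renamed <- rename_with(df, .cols = ends_with("_end"), ~ gsub("^ATH_", "", .x))

before <- names(df)
before
# [1] "ATH_V1"     "ATH_V2_end"   (df itself is untouched)

# Assigning the result back to df is what actually changes it
df <- rename_with(df, .cols = ends_with("_end"), ~ gsub("^ATH_", "", .x))

after <- names(df)
after
# [1] "ATH_V1" "V2_end"
```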