Reputation: 165
I have a data frame with more than 1000 columns that are named like this:
df <- data.frame(ABC.efg.Basketball_seasonxx = c(0, 3),
                 HIJK.LM.Baseball_season33 = c(5, 9))
  ABC.efg.Basketball_seasonxx HIJK.LM.Baseball_season33
1                           0                         5
2                           3                         9
Desired output:
  Basketball Baseball
1          0        5
2          3        9
Using dplyr, I want to rename each column to "Basketball" or "Baseball" whenever its name contains that string, regardless of what other characters or symbols surround it.
Upvotes: 2
Views: 636
Reputation: 887951
We may use sub to capture the word (\\w+) between the . and the _, and replace the whole name with the backreference (\\1) to the captured word:
sub(".*\\.(\\w+)_.*", "\\1", names(df))
[1] "Basketball" "Baseball"
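To make the renaming stick, the result has to be assigned back to names(df). A minimal sketch, assuming the question's data plus a hypothetical extra column id that does not follow the naming scheme; sub returns its input unchanged when the pattern does not match, so such columns keep their names:

```r
# Example data from the question, plus a hypothetical column `id`
# that contains neither a "." nor a "_".
df <- data.frame(ABC.efg.Basketball_seasonxx = c(0, 3),
                 HIJK.LM.Baseball_season33 = c(5, 9),
                 id = c(1, 2))

# Capture the word between a "." and the following "_";
# names without a match are returned as they are.
new_names <- sub(".*\\.(\\w+)_.*", "\\1", names(df))

# Assign the cleaned names back to the data frame.
names(df) <- new_names
names(df)
```

Note that \\w also matches an underscore, so if the part after the sport name contained further underscores, the capture would run up to the last one.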
Or with stringr (the group argument requires stringr 1.5.0 or later):
library(stringr)
str_extract(names(df), "\\.(\\w+)_", group = 1)
[1] "Basketball" "Baseball"
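Or, if only these specific words are needed, use a single pattern in which the optional k and t let it match both "Basketball" and "Baseball":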
names(df) <- str_extract(names(df), "Bask?et?ball")
df
  Basketball Baseball
1          0        5
2          3        9
Or, to rename only the columns whose names contain 'Basketball' or 'Baseball':
library(dplyr)
df %>%
  rename_with(~ str_extract(.x, "Basketball|Baseball"),
              matches("Basketball|Baseball"))
Output:
  Basketball Baseball
1          0        5
2          3        9
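One caveat with more than 1000 columns: several of them will probably reduce to the same name, and as far as I know rename_with refuses duplicate names. A rough base R sketch, assuming a hypothetical second Basketball column (XY.Basketball_season01 is made up for illustration), that keeps the names distinct with make.unique:

```r
# Question's data plus a hypothetical second Basketball column.
df <- data.frame(ABC.efg.Basketball_seasonxx = c(0, 3),
                 XY.Basketball_season01 = c(1, 4),
                 HIJK.LM.Baseball_season33 = c(5, 9))

pattern <- "Basketball|Baseball"

# Logical index of the columns whose names contain either word.
hits <- grepl(pattern, names(df))

# Extract the matched word from each of those names.
short <- regmatches(names(df)[hits], regexpr(pattern, names(df)[hits]))

# make.unique() appends .1, .2, ... to repeated names.
names(df)[hits] <- make.unique(short)
names(df)
```

Wrapping the str_extract call inside rename_with in make.unique should work the same way, but I have not run it on a wide data frame.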
Upvotes: 1