Reputation: 1036

R Remove everything before certain character in column names

I have data like this:

    data<-structure(list(record_id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), 
    fracture1_medial___1 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1), fracture1_medial___2 = c(0, 
    0, 0, 0, 0, 0, 1, 0, 0, 0), fracture1_medial___5 = c(0, 0, 
    0, 0, 0, 0, 0, 0, 1, 0), fracture1_lateral___1 = c(0, 0, 
    0, 0, 0, 0, 0, 0, 0, 1), fracture1_lateral___2 = c(0, 0, 
    0, 0, 0, 0, 1, 0, 0, 0), fracture1_lateral___3 = c(0, 0, 
    0, 0, 0, 0, 0, 0, 0, 0), fracture1_lateral___4 = c(0, 0, 
    0, 0, 0, 0, 0, 0, 0, 0)), class = c("spec_tbl_df", "tbl_df",
    "tbl", "data.frame"), row.names = c(NA, -10L), spec = structure(list(
    cols = list(record_id = structure(list(), class = c("collector_double", 
    "collector")), fracture1_medial___1 = structure(list(), class = c("collector_double", 
    "collector")), fracture1_medial___2 = structure(list(), class = c("collector_double", 
    "collector")), fracture1_medial___5 = structure(list(), class = c("collector_double", 
    "collector")), fracture1_lateral___1 = structure(list(), class = c("collector_double", 
    "collector")), fracture1_lateral___2 = structure(list(), class = c("collector_double", 
    "collector")), fracture1_lateral___3 = structure(list(), class = c("collector_double", 
    "collector")), fracture1_lateral___4 = structure(list(), class = c("collector_double", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
    "collector")), skip = 1L), class = "col_spec"))

As you can see, the columns have names like "fracture1_lateral___2". I'd like to remove everything up to and including the first underscore, so I'd be left with column names like "lateral___2". This should not apply to the first column, "record_id".

The real data has many more columns so it'd be ideal if I didn't have to write out each individual name. Tidyverse preferable but flexible. Thank you!

Upvotes: 0

Answers (3)

user2110417

Reputation:

You can try:

    names(data) = gsub(pattern = ".*.1_", replacement = "", x = names(data))
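
Note that this pattern only matches when the prefix ends in `1_`. A small check of that behaviour, as a sketch: the vector `nms` is a hypothetical stand-in for `names(data)`, and its third entry (`fracture2_medial___1`) is made up for illustration and does not appear in the question's data.

```r
# Hypothetical names: the first two follow the question's data, the third is made up.
nms <- c("record_id", "fracture1_medial___1", "fracture2_medial___1")

# Same pattern as above, applied to the stand-in vector.
stripped <- gsub(pattern = ".*.1_", replacement = "", x = nms)
print(stripped)

# "record_id" and "fracture2_medial___1" contain no "1_", so they are left unchanged;
# only "fracture1_medial___1" is shortened to "medial___1".
```

So if the real data has prefixes such as `fracture2_`, a pattern that does not hard-code the `1` would probably be needed.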

Upvotes: 0

Karthik S

Reputation: 11548

Does this work:

    > library(dplyr)
    > data %>% setNames(gsub('.*_([ml].*__\\d)', '\\1', names(.)))
    # A tibble: 10 x 8
       record_id medial___1 medial___2 medial___5 lateral___1 lateral___2 lateral___3 lateral___4
           <dbl>      <dbl>      <dbl>      <dbl>       <dbl>       <dbl>       <dbl>       <dbl>
     1         1          0          0          0           0           0           0           0
     2         2          0          0          0           0           0           0           0
     3         3          0          0          0           0           0           0           0
     4         4          0          0          0           0           0           0           0
     5         5          0          0          0           0           0           0           0
     6         6          0          0          0           0           0           0           0
     7         7          0          1          0           0           1           0           0
     8         8          0          0          0           0           0           0           0
     9         9          0          0          1           0           0           0           0
    10        10          1          0          0           1           0           0           0
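
Since the question asks for a tidyverse approach, here is a minimal sketch using `dplyr::rename_with()` (available from dplyr 1.0.0). It assumes the prefix is everything up to the first underscore rather than relying on the names starting with `m` or `l`. The tibble `df` is a small hypothetical stand-in for `data`.

```r
library(dplyr)

# Hypothetical stand-in for the question's data, with the same naming scheme.
df <- tibble(
  record_id = 1:2,
  fracture1_medial___1 = c(0, 1),
  fracture1_lateral___2 = c(1, 0)
)

# Rename every column except record_id:
# "^[^_]*_" matches from the start of the name up to and including the first underscore.
renamed <- df %>%
  rename_with(~ sub("^[^_]*_", "", .x), .cols = -record_id)

print(names(renamed))
```

The `.cols = -record_id` selection keeps the ID column out of the renaming, so the pattern does not need to special-case it.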

Upvotes: 1

Gregor Thomas

Reputation: 146249

    names(data)[-1] = sub(".*?_", "", names(data)[-1])
    names(data)
    # [1] "record_id"   "medial___1"  "medial___2"  "medial___5"  "lateral___1" "lateral___2" "lateral___3" "lateral___4"
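
The `?` in `.*?_` makes the match lazy, so it stops at the first underscore instead of the last. A short sketch of the difference on a single name taken from the question (the variable names `x`, `lazy` and `greedy` are only for illustration):

```r
x <- "fracture1_lateral___2"

# Lazy: ".*?" matches as little as possible, so only "fracture1_" is removed.
lazy <- sub(".*?_", "", x)

# Greedy: ".*" matches as much as possible, so everything up to the last underscore is removed.
greedy <- sub(".*_", "", x)

print(lazy)    # "lateral___2"
print(greedy)  # "2"
```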

Upvotes: 1
