How to delete the first and last word in a column?

Question

I am trying to delete the first word and the last word in the column CCGName in R, using only tidyverse. Each value in CCGName consists of the word "NHS", then the city name, then "CCG". I want to get rid of the words "NHS" and "CCG". Is there a way to do this with tidyverse only?

This is my sample of data:

structure(list(SiteType = c(111, 111, 111, 111, 111, 111, 111, 
111, 111, 111), `Call Date` = c("18/03/2020", "18/03/2020", "18/03/2020", 
"18/03/2020", "18/03/2020", "18/03/2020", "18/03/2020", "18/03/2020", 
"18/03/2020", "18/03/2020"), Gender = c("Female", "Female", "Female", 
"Female", "Female", "Female", "Female", "Female", "Female", "Female"
), AgeBand = c("0-18 years", "0-18 years", "0-18 years", "0-18 years", 
"0-18 years", "0-18 years", "0-18 years", "0-18 years", "0-18 years", 
"0-18 years"), CCGCode = c("E38000004", "E38000009", "E38000020", 
"E38000023", "E38000029", "E38000010", "E38000030", "E38000035", 
"E38000008", "E38000025"), CCGName = c("NHS Barking and Dagenham CCG", 
"NHS Bath and North East Somerset CCG", "NHS Brent CCG", "NHS Bromley CCG", 
"NHS Canterbury and Coastal CCG", "NHS Bedfordshire CCG", "NHS Castle Point and Rochford CCG", 
"NHS City and Hackney CCG", "NHS Bassetlaw CCG", "NHS Calderdale CCG"
), `April20 mapped CCGCode` = c("E38000004", "E38000231", "E38000020", 
"E38000244", "E38000237", "E38000010", "E38000030", "E38000035", 
"E38000008", "E38000025"), `April20 mapped CCGName` = c("NHS Barking and Dagenham CCG", 
"NHS Bath and North East Somerset, Swindon and Wiltshire CCG", 
"NHS Brent CCG", "NHS South East London CCG", "NHS Kent and Medway CCG", 
"NHS Bedfordshire CCG", "NHS Castle Point and Rochford CCG", 
"NHS City and Hackney CCG", "NHS Bassetlaw CCG", "NHS Calderdale CCG"
), TriageCount = c(35, 9, 21, 11, 11, 27, 12, 12, 6, 9)), row.names = c(NA, 
-10L), class = c("tbl_df", "tbl", "data.frame"))

Duck · Accepted Answer

You can also try:

library(dplyr)
#Code
# Remove every occurrence of "NHS" or "CCG", then trim the leftover whitespace
df <- df %>% mutate(CCGName = trimws(gsub('NHS|CCG', '', CCGName)))

Output:

df$CCGName
 [1] "Barking and Dagenham"         "Bath and North East Somerset"
 [3] "Brent"                        "Bromley"                     
 [5] "Canterbury and Coastal"       "Bedfordshire"                
 [7] "Castle Point and Rochford"    "City and Hackney"            
 [9] "Bassetlaw"                    "Calderdale"

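You can also get the same output with the following code (many thanks and credit to @BenBolker):

#Code 2
library(stringr)
# str_remove_all() is needed to strip both the leading "NHS" and the trailing "CCG"
df <- df %>% mutate(CCGName = str_remove_all(CCGName, "^NHS\\s+|\\s+CCG$"))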
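
If the first and last words are not known in advance (the more general case in the title), a word-based pattern may be an option. The following is only a sketch: the vector `ccg` and the name `result` are made up for illustration, with values copied from the sample data, and it assumes words are separated by whitespace.

```r
library(stringr)

# Hypothetical sample values, copied from the CCGName column in the question
ccg <- c("NHS Barking and Dagenham CCG", "NHS Brent CCG")

# "^\\S+\\s+" matches the first word plus the whitespace after it;
# "\\s+\\S+$" matches the whitespace plus the last word.
# str_remove_all() removes both matches, not just the first one.
result <- str_remove_all(ccg, "^\\S+\\s+|\\s+\\S+$")

print(result)
# [1] "Barking and Dagenham" "Brent"
```

Inside a pipeline the same call could be used as mutate(CCGName = str_remove_all(CCGName, "^\\S+\\s+|\\s+\\S+$")). Note that a value with only two words would keep its last word, and a single-word value would be left unchanged.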
