Reputation: 427
I want to do a simple replacement using the gsub() function in R. See the example:
#I want:
Huiswaard 2 Oost
Huiswaard 1 Zuid
Huiswaard 2 West
#To become:
Huiswaard-2-Oost
Huiswaard-1-Oost
Huiswaard-2-Oost
By means of the magnificent method of trial & error I tried this:
data <- gsub('Huiswaard\\s.\\s>*', "Huiswaard-.-", df)
data <- gsub('Huiswaard\\s.\\s>*', "Huiswaard-.*-", df)
data <- gsub('Huiswaard\\s.\\s>*', "Huiswaard-(.)-", df)
data <- gsub('Huiswaard\\s.\\s>*', "Huiswaard-\\(\\)-", df)
None of these work. I end up with output like:
Huiswaard-.-West
Does anyone have an idea of how to use gsub to skip a character in the replacement argument?
Upvotes: 0
Views: 632
Reputation: 2283
In regex you can group with parentheses and back-reference the captured group with \\1:
data <- gsub('Huiswaard\\s(\\d)\\s>*', "Huiswaard-\\1-", df)
data
[1] "Huiswaard-2-Oost" "Huiswaard-1-Zuid" "Huiswaard-2-West"
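To see why the attempts in the question fail, here is a minimal, self-contained sketch. It assumes the input is a plain character vector (called `x` here purely for illustration; the question's `df` may be something else). A `.` in the replacement string is a literal dot, not a wildcard, so the digit only survives if it is captured and re-inserted:

```r
# Assumption: the data is a plain character vector; `x` is a hypothetical name
x <- c("Huiswaard 2 Oost", "Huiswaard 1 Zuid", "Huiswaard 2 West")

# "." in the replacement is taken literally, so the digit is lost
literal <- gsub("Huiswaard\\s.\\s", "Huiswaard-.-", x)

# Wrapping the digit in ( ) captures it; \\1 puts it back in the replacement
kept <- gsub("Huiswaard\\s(\\d)\\s", "Huiswaard-\\1-", x)

print(literal)
print(kept)
```

Special characters such as `.` or `*` only act as wildcards in the pattern argument; in the replacement, only back-references like `\\1` are interpreted.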
If you want to change the suffix as well, you can match the second word with \\w+,
which matches one or more word characters after the space, and replace it:
data <- gsub('Huiswaard\\s(\\d)\\s\\w+', "Huiswaard-\\1-Oost", df)
data
[1] "Huiswaard-2-Oost" "Huiswaard-1-Oost" "Huiswaard-2-Oost"
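If the prefix is not always "Huiswaard", the same idea extends to more than one capture group. The sketch below is only an illustration under that assumption (the vector name `x` and the anchored pattern are mine, not from the question). It also shows a simpler alternative, not a back-reference at all, for the case where only the spaces need to become hyphens:

```r
# Assumption: a plain character vector; `x` is a hypothetical name
x <- c("Huiswaard 2 Oost", "Huiswaard 1 Zuid", "Huiswaard 2 West")

# Two groups: \\1 is the first word, \\2 is the digit; the last word is replaced
fixed_suffix <- sub("^(\\w+)\\s(\\d)\\s\\w+$", "\\1-\\2-Oost", x)

# Alternative technique: replace every run of whitespace with a hyphen
hyphenated <- gsub("\\s+", "-", x)

print(fixed_suffix)
print(hyphenated)
```

Groups are numbered by the order of their opening parentheses, so `\\2` refers to the second `(`.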
I use this cheat sheet to help me understand regular expressions: https://www.rstudio.com/wp-content/uploads/2016/09/RegExCheatsheet.pdf
Upvotes: 3