Luke

Reputation: 4929

Replace specific characters within strings

I would like to remove specific characters from strings within a vector, similar to the Find and Replace feature in Excel.

Here are the data I start with:

group <- data.frame(c("12357e", "12575e", "197e18", "e18947")

I start with just the first column; I want to produce the second column by removing the e's:

group       group.no.e
12357e      12357
12575e      12575
197e18      19718
e18947      18947

Upvotes: 345

Views: 792405

Answers (8)

ypa y yhm

Reputation: 219

You can use gsub or stringr.

Or, this:

library (magrittr); 

#' @author y.ypa.yhm
#' @license agpl-3.0
#' 

char.apart = 
function (str) str %>% nchar %>% {.+1} %>% seq %>% sample(1) %>% intToUtf8 %>% 
    {if (! (. %in% strsplit(str,"")[[1]])) . else char.apart (str)} ;

strtr = `%strtr%` = 
function (old, new) 
function (strs) (\ (rchar) strs %>% 
    paste0 (rchar) %>% strsplit(old) %>% 
    lapply (\ (s) s %>% paste (collapse = new)) %>% 
    unlist %>% substr(., 0, nchar(.) - 1) %>% 
    `names<-` (strs) 
    ) (old %>% char.apart) ;

#' @examples
#' 
#' `c("aaa bbb CCC ddd bb CC", "bb CC eee 1bb CCC PPP") %>% ("bb CC" %strtr% "tt TT")`
#' 
#' should out: 
#'   aaa bbb CCC ddd bb CC   bb CC eee 1bb CCC PPP 
#' "aaa btt TTC ddd tt TT" "tt TT eee 1tt TTC PPP"
#' 

Use it like this:

c("12357e"
, "12575e"
, "197e18"
, "e18947") %>% 
    
    ("e" %strtr% "")

Output:

 12357e  12575e  197e18  e18947 
"12357" "12575" "19718" "18947"

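This approach needs no libraries beyond magrittr, and the magrittr pipe could be swapped for the native pipe with minor rewriting of the `.` placeholders. Note that strsplit() treats `old` as a regular expression by default, so regex metacharacters in `old` are not matched literally unless fixed = TRUE is passed.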
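
For comparison, a similar literal (non-regex) replacement that keeps the inputs as names can be sketched in base R with gsub(fixed = TRUE) and setNames(). The helper name replace_literal is made up for this illustration and is not part of the code above:

```r
# Hypothetical helper: literal replacement, with the result named by the inputs
replace_literal <- function(strs, old, new) {
  setNames(gsub(old, new, strs, fixed = TRUE), strs)
}

res <- replace_literal(c("12357e", "12575e", "197e18", "e18947"), "e", "")
print(res)
```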


Tested on webR REPL app

Upvotes: 0

user2110417

Reputation:

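chartr translates characters one-for-one, so it can swap one character for another but cannot delete it: with an empty replacement, chartr("e", "", ...) fails because 'old' is longer than 'new'. For example, to replace every e with an underscore:

group$group.no.e <- chartr("e", "_", group$group)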

Upvotes: 2

Anya Sti

Reputation: 141

> library(stringr)  # str_replace_all() comes from stringr, not stringi
> group <- c('12357e', '12575e', '12575e', ' 197e18',  'e18947')
> pattern <- "e"
> replacement <- ""
> group <- str_replace_all(group, pattern, replacement)  # str_replace() would change only the first match
> group
[1] "12357"  "12575"  "12575"  " 19718" "18947"

Upvotes: 0

Alexander Borochkin

Reputation: 4611

You do not need to create a data frame from a vector of strings if you only want to replace some characters in it. Regular expressions are a good choice for this, as already mentioned by @Andrie and @Dirk Eddelbuettel.

Be careful when replacing characters that have a special meaning in regular expressions, such as dots: an unescaped dot matches any character, so it must be escaped or wrapped in a character class, as in the example below:

ctr_names <- c("Czech.Republic","New.Zealand","Great.Britain")
gsub("[.]", " ", ctr_names)

This produces:

[1] "Czech Republic" "New Zealand"    "Great Britain" 
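
As an alternative to the character class, the dot can also be matched literally with fixed = TRUE or by escaping it. A minimal sketch reusing ctr_names from above; the result variable names are only illustrative:

```r
ctr_names <- c("Czech.Republic", "New.Zealand", "Great.Britain")

# An unescaped "." is a regex wildcard, so every character gets replaced
wildcard <- gsub(".", " ", ctr_names)

# fixed = TRUE treats the pattern as a literal string
literal <- gsub(".", " ", ctr_names, fixed = TRUE)

# Escaping the dot also works; the backslash is doubled inside an R string
escaped <- gsub("\\.", " ", ctr_names)

print(literal)
```

The fixed = TRUE form is often the simplest choice when no regex features are needed at all.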

Upvotes: 25

MERose

Reputation: 4421

Use the stringi package:

require(stringi)

group<-data.frame(c("12357e", "12575e", "197e18", "e18947"))
stri_replace_all(group[,1], "", fixed="e")
[1] "12357" "12575" "19718" "18947"

Upvotes: 7

Megatron

Reputation: 17089

Summarizing 2 ways to replace strings:

group<-data.frame(group=c("12357e", "12575e", "197e18", "e18947"))

1) Use gsub

group$group.no.e <- gsub("e", "", group$group)

2) Use the stringr package

library(stringr)
group$group.no.e <- str_replace_all(group$group, "e", "")

Both produce the desired output:

   group group.no.e
1 12357e      12357
2 12575e      12575
3 197e18      19718
4 e18947      18947

Upvotes: 41

Dirk is no longer here

Reputation: 368191

Regular expressions are your friends:

R> ## also adds missing ')' and sets column name
R> group <- data.frame(group=c("12357e", "12575e", "197e18", "e18947"))
R> group
   group
1 12357e
2 12575e
3 197e18
4 e18947

Now use gsub() with the simplest possible replacement, an empty string:

R> group$groupNoE <- gsub("e", "", group$group)
R> group
   group groupNoE
1 12357e    12357
2 12575e    12575
3 197e18    19718
4 e18947    18947
R> 

Upvotes: 58

Andrie

Reputation: 179398

With a regular expression and the function gsub():

group <- c("12357e", "12575e", "197e18", "e18947")
group
[1] "12357e" "12575e" "197e18" "e18947"

gsub("e", "", group)
[1] "12357" "12575" "19718" "18947"

What gsub does here is to replace each occurrence of "e" with an empty string "".


See ?regexp or ?gsub for more help.
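
To illustrate what "each occurrence" means: sub() replaces only the first match, whereas gsub() replaces all of them. A small sketch with a made-up string that contains two e's (the string and variable names are not from the question):

```r
x <- "e197e18"                    # hypothetical input with two e's

first_only  <- sub("e", "", x)    # removes only the first e
all_matches <- gsub("e", "", x)   # removes every e

print(c(first_only, all_matches))
```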

Upvotes: 493
