Reputation: 11
I have a lot of records like "LISTA CIVICA | blablabla". They are stored as character strings in the column "partito". I need to cut off the "| bla bla bla" part so that all of these records read just "LISTA CIVICA".
I tried this code, but it does not work:
gsub(pattern="",replacement = "LISTA CIVICA",ammcom$partito)
Upvotes: 0
Views: 71
Reputation: 11
A friend of mine found out how to fix my problem.
# Positions of the values that start with "LISTA"
idx <- grep("^LISTA", ammcom$partito)
length(idx)  # 92033 in my data
# Overwrite them; the single string is recycled, so rep() is not needed
ammcom$partito[idx] <- "LISTA CIVICA"
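One caveat with this approach: the pattern ^LISTA overwrites every value that starts with LISTA, not only the "LISTA CIVICA | ..." ones. A minimal sketch of a narrower match follows, assuming a toy data frame whose values are invented for illustration (only the names ammcom and partito come from the question):

```r
# Toy stand-in for the real data; the values are invented for illustration
ammcom <- data.frame(
  partito = c("LISTA CIVICA | INSIEME PER ALBERA",
              "LISTA DEI CITTADINI",
              "LISTA CIVICA | blablabla"),
  stringsAsFactors = FALSE
)

# TRUE only for values that start with "LISTA CIVICA" followed by a |
is_civica <- grepl("^LISTA CIVICA\\s*\\|", ammcom$partito)

# A length-one replacement is recycled over all selected rows
ammcom$partito[is_civica] <- "LISTA CIVICA"
ammcom$partito
```

Here the second value is left alone because it has no "LISTA CIVICA |" prefix.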
Upvotes: 1
Reputation: 11128
Another way could be to use stringr, here with an atomic group, (?>...), in the pattern:
library(stringr)
trimws(str_replace_all(text,"\\|(?>.*)",""))
or, more simply, with a plain greedy match:
trimws(str_replace_all(text,"\\|.*",""))
Output:
> trimws(str_replace_all(text,"\\|.*",""))
[1] "LISTA CIVICA" "LISTA CIVICA"
Input data:
text = c("LISTA CIVICA | INSIEME PER ALBERA","LISTA CIVICA | bla blabla")
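If stringr is not available, the same two patterns should also work in base R. This is a sketch rather than part of the original answer: it swaps in sub(), and it assumes perl = TRUE so that the PCRE engine handles the atomic group.

```r
text <- c("LISTA CIVICA | INSIEME PER ALBERA", "LISTA CIVICA | bla blabla")

# perl = TRUE switches to PCRE, which understands the atomic group (?>...)
res_atomic <- trimws(sub("\\|(?>.*)", "", text, perl = TRUE))

# The plain greedy pattern needs no special engine
res_plain <- trimws(sub("\\|.*", "", text))

# Both patterns remove the same text for this input
identical(res_atomic, res_plain)
```

For this input the atomic group makes no difference to the result, so the plain pattern is the easier one to read.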
Upvotes: 2
Reputation: 887891
We can use sub to match zero or more spaces (\\s*) followed by the | (escaped, because | is the regex metacharacter for OR) and any remaining characters (.*), and replace the match with an empty string (""):
sub("\\s*\\|.*", "", str1)
#[1] "LISTA CIVICA" "LISTA CIVICA"
Another option is regmatches with regexpr, which extracts everything before the first |:
trimws(regmatches(str1, regexpr("^[^|]+", str1)))
#[1] "LISTA CIVICA" "LISTA CIVICA"
Data:
str1 <- c("LISTA CIVICA | INSIEME PER ALBERA", "LISTA CIVICA | blablabla")
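Applied to the column from the question, the sub() call might look like the sketch below. The data frame is a made-up stand-in (only the names ammcom and partito come from the question); it also shows that values without a | pass through unchanged.

```r
# Toy stand-in for the real data; the values are invented for illustration
ammcom <- data.frame(
  partito = c("LISTA CIVICA | INSIEME PER ALBERA",
              "PARTITO DEMOCRATICO",
              "LISTA CIVICA | blablabla"),
  stringsAsFactors = FALSE
)

# Strip optional spaces, the |, and everything after it;
# sub() returns strings without a match as they are
ammcom$partito <- sub("\\s*\\|.*", "", ammcom$partito)
ammcom$partito
```

Because sub() is vectorised, no loop or index is needed, and the result is assigned back to the column in one step.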
Upvotes: 2
Reputation: 2146
If the string you want to keep is always the part before the |, you can also split the string around | and retain only the first element:
str1 <- "LISTA CIVICA | INSIEME PER ALBERA"
# Keep the first piece of each split; trimws() drops the space left before the |
trimws(unlist(lapply(strsplit(str1, "\\|"), function(x) x[[1]])))
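The same idea extends to a whole vector. The sketch below is a variation, not the code above: it assumes fixed = TRUE so the | needs no escaping, uses vapply() so the result is guaranteed to be a character vector, and adds an invented third value with no | to show that it is kept as is.

```r
str1 <- c("LISTA CIVICA | INSIEME PER ALBERA",
          "LISTA CIVICA | blablabla",
          "PARTITO DEMOCRATICO")  # invented value with no | in it

# fixed = TRUE treats "|" as a literal character, not as regex OR
parts <- strsplit(str1, "|", fixed = TRUE)

# Take the first piece of each split and trim the surrounding whitespace
first <- trimws(vapply(parts, function(x) x[1], character(1)))
first
```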
Upvotes: 0