user288609

Reputation: 13025

How to extract substring from a string?

There are some strings which show the following pattern

ABC, DEF.JHI
AB,DE.(JH)

Each string has three sections, separated by a , and a . in that order. The last character can be either an ordinary character or something like ). I would like to extract the last section. For example, from the two strings above I would like to get:

JHI
(JH)

Is there a way to do that in R?

Upvotes: 0

Views: 231

Answers (4)

user20650

Reputation: 25854

Riffing on @josliber's answer, you could remove the part of the string before the .:

str1 <- c("ABC, DEF.JHI","AB,DE.(JH)")

gsub(".*\\.", "", str1)
# [1] "JHI"  "(JH)"

EDIT

If your third element is not always preceded by a ., you can strip everything up to the last , or . to extract the final part:

str1 <- c("ABC, DEF.JHI","AB,DE.(JH)", "ABC.DE, (JH)")

gsub(".*[,.]", "" , str1)
# [1] "JHI"   "(JH)"  " (JH)"
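
If the leading space in the last result is unwanted, one option is to let the pattern consume any whitespace after the separator as well. This is only a sketch, and it assumes the final part never itself contains a , or a . (the question does not say either way). The name last_part is just illustrative.

```r
str1 <- c("ABC, DEF.JHI", "AB,DE.(JH)", "ABC.DE, (JH)")

# Greedy .* runs up to the last , or . in each string;
# [[:space:]]* then swallows any spaces that follow the separator.
last_part <- sub(".*[,.][[:space:]]*", "", str1)
last_part
```

Calling trimws() on the original gsub() result should have the same effect.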

Upvotes: 1

Tyler Rinker

Reputation: 109874

Here's another possibility:

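sapply(strsplit(str1, "\\.\\(|\\.|\\)"), "[[", 2)
# [1] "JHI" "JH"

Note that this pattern also splits on the parentheses, so the second result is JH rather than (JH).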

Upvotes: 1

josliber

Reputation: 44320

You can just split on the . using strsplit and extract the second element.

str1 <- c("ABC, DEF.JHI","AB,DE.(JH)")
unlist(lapply(strsplit(str1, "\\."), "[", 2))
# [1] "JHI"  "(JH)"
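
A possible variant takes the last piece of each split instead of the second one. That may be more robust if an earlier section could itself contain a ., which is an assumption here since the question does not show such a case. The names parts and last_part are illustrative only.

```r
str1 <- c("ABC, DEF.JHI", "AB,DE.(JH)")

# fixed = TRUE treats "." as a literal character, so no escaping is needed
parts <- strsplit(str1, ".", fixed = TRUE)

# Take the final piece of each split; vapply guarantees a character vector
last_part <- vapply(parts, function(p) p[length(p)], character(1))
last_part
```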

Upvotes: 1

akrun

Reputation: 887118

library(stringr)
str1 <- c("ABC, DEF.JHI","AB,DE.(JH)")
# stringr's regex engine supports lookbehind directly; the older perl() wrapper is not needed
str_extract(str1, "(?<=\\.).*")
# [1] "JHI"  "(JH)"

(?<=\\.) is a lookbehind that matches the position right after a literal ., and .* then matches all the characters that follow.
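
The same lookbehind idea can be sketched in base R without stringr, using regexpr() with perl = TRUE together with regmatches(). The variable names m and last_part are illustrative. One caveat: regmatches() drops elements that have no match, so the result can be shorter than the input if some strings contain no . at all.

```r
str1 <- c("ABC, DEF.JHI", "AB,DE.(JH)")

# perl = TRUE enables lookbehind; regexpr() finds the first match per string
m <- regexpr("(?<=\\.).*", str1, perl = TRUE)

# regmatches() pulls out the matched text for each string that matched
last_part <- regmatches(str1, m)
last_part
```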

Upvotes: 1
