Nirvik Banerjee

Reputation: 335

Return the first occurrence of a character in a string

I have been trying to extract the portion of a string that follows the first ^ sign. For example, the string looks like abc^28092015^def^1234, and I need to extract 28092015, which is sandwiched between the first two ^ signs.

So I need to extract the 8 characters after the first ^ sign. I have been trying to get the position of the first ^ sign and then use it as an argument to the substr function.

I tried to use this:

x <- "abc^28092015^def^1234"
rev(gregexpr("\\^", x)[[1]])[1]

This follows the answer discussed here.

But it continues to return the last position. Can anyone please help me out?

Upvotes: 2

Views: 4291

Answers (5)

james jelo4kul

Reputation: 829

It would be simpler to split the string on ^. But if you still want a regex pattern, you can try this:

^\S+\^(\d+)(?=\^)

Then match group 1.

OUTPUT

28092015

Upvotes: 1

akrun

Reputation: 887981

Another option is stri_extract_first_regex from the stringi package:

library(stringi)
stri_extract_first_regex(str1, '(?<=\\^)\\d+(?=\\^)')
#[1] "28092015"

If any characters (not only digits) may appear between the two ^ signs:

stri_extract(str1, regex='(?<=\\^)[^^]+')
#[1] "28092015"

data

str1 <- 'abc^28092015^def^1234'

Upvotes: 2

Avinash Raj

Reputation: 174874

I would use sub.

x <- "abc^28092015^def^1234"
sub("^.*?\\^(.*?)\\^.*", "\\1", x)
# [1] "28092015"

Since ^ is a special character in regex, you need to escape it in order to match a literal ^ symbol.
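To illustrate the escaping point, here is a minimal sketch using the string from the question. The variable names are only illustrative, and regexpr is used here simply because it reports the position of the first match.

```r
x <- "abc^28092015^def^1234"

# Unescaped, ^ is the start-of-string anchor: it matches at position 1
# with a zero-length match, not at the literal caret.
anchor_pos <- as.integer(regexpr("^", x))

# Escaped, \\^ matches the literal caret character.
escaped_pos <- as.integer(regexpr("\\^", x))

# fixed = TRUE turns off regex interpretation, so no escape is needed.
fixed_pos <- as.integer(regexpr("^", x, fixed = TRUE))

print(c(anchor_pos, escaped_pos, fixed_pos))
```

Here the unescaped pattern reports position 1, while the escaped and fixed versions both report position 4, where the first ^ actually sits.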

or

Split on ^ and take the second element.

strsplit(x, "^", fixed = TRUE)[[1]][2]
# [1] "28092015"

or

You may also use gsub.

gsub("^.*?\\^|\\^.*", "", x, perl = TRUE)
# [1] "28092015"

Upvotes: 4

nrussell

Reputation: 18612

Here's one option with base R:

x <- "abc^28092015^def^1234"
m <- regexpr("(?<=\\^)(.+?)(?=\\^)", x, perl = TRUE)
regmatches(x, m)
#[1] "28092015"
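For the position-based approach the question was attempting, a rough base R sketch follows. It assumes the wanted field is always exactly 8 characters long, as the question states, and the variable names are only illustrative. It also shows why the rev(gregexpr(...)) attempt returns the last position: gregexpr returns all match positions in order, and rev puts the last one first.

```r
x <- "abc^28092015^def^1234"

# gregexpr() returns every match position, in order of occurrence.
all_pos <- as.integer(gregexpr("^", x, fixed = TRUE)[[1]])

# rev() reverses that vector, so its first element is the LAST caret.
last_pos <- rev(all_pos)[1]

# regexpr() returns only the first match, which is what is needed here.
first_pos <- as.integer(regexpr("^", x, fixed = TRUE))

# Assumption: the field after the first caret is exactly 8 characters.
result <- substr(x, first_pos + 1, first_pos + 8)
print(result)
```

If the field length can vary, the regex or strsplit approaches in the other answers are probably a safer choice than a fixed-width substr.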

Upvotes: 3

Ronak Shah

Reputation: 389335

x <- 'abc^28092015^def^1234'
library(qdapRegex)
unlist(rm_between(x, '^', '^', extract=TRUE))[1]
# [1] "28092015"

Upvotes: 1
