Slug Pue

Reputation: 256

Optionally matching symbol using lookahead assertion

Let's say I have some strings like these:

strings <- c("Robert: my name is Robert", "Michael my name is Michael", "Jack: I like turtles")

I would like to make a regular-expression query which returns exactly

[1] "Robert" "Michael"

That is, I want the names that are followed by "my name", with any colon after the name removed. My attempt was:

regmatches(strings, regexpr(".*(?=:* my name)", strings, perl = TRUE))

Here I try to optionally match the colon sign by writing

(?=:* my name)

However, in this case the colon sign doesn't seem to get caught by the lookahead assertion (or does it?), and I get instead

[1] "Robert:" "Michael"

Is there some way to change the expression inside the lookahead assertion (or outside it, for that matter) in order to remove the colon from the results? Full code:

strings <- c("Robert: my name is Robert", "Michael my name is Michael",
             "Jack: I like turtles")
regmatches(strings, regexpr(".*(?=:* my name)", strings, perl = TRUE))

Upvotes: 2

Views: 50

Answers (1)

Kamuffel

Reputation: 642

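The colon ends up in your match because .* is greedy and :* is allowed to match zero colons, so the longest match that still satisfies the lookahead is "Robert:". Requiring the last matched character to be something other than a colon omits it: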

.*[^:](?=:* my name)

A demo can be found here:

http://rubular.com/r/RrQAhhIs1j
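For completeness, here is a small sketch of how the pattern above could be plugged back into the R code from the question. The variable names greedy and lazy are just illustrative, and the second pattern (a lazy .*? in place of the [^:] guard) is an alternative I am assuming is acceptable for this input, not something taken from the demo:

```r
strings <- c("Robert: my name is Robert",
             "Michael my name is Michael",
             "Jack: I like turtles")

# Pattern from this answer: [^:] forces the match to end on a non-colon,
# so the greedy .* has to back off one character before the colon.
greedy <- regmatches(strings,
                     regexpr(".*[^:](?=:* my name)", strings, perl = TRUE))

# Alternative sketch (an assumption, not part of the demo): a lazy .*?
# stops at the first position where the lookahead succeeds, i.e. before
# the optional colon.
lazy <- regmatches(strings,
                   regexpr(".*?(?=:* my name)", strings, perl = TRUE))

print(greedy)
print(lazy)
```

Both calls should return "Robert" "Michael" for these three strings; the "Jack" string has no match and is dropped by regmatches. Note that the lazy variant could return an empty string if an input started directly with " my name", which the [^:] version cannot do.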

Upvotes: 2
