user976991

Reputation: 411

R regular expression matching with reserved characters

I want to process the rows of my data frame. Each row contains a value with this pattern:

x = "RPA4|RP1-117P191"

and I want this

 RPA4

That is, the pipe and everything after it should be removed.

I tried gsub, attempting to capture only the first part of the match:

 gsub("^(\\.+)|*$", "\\1", x)

but I got the same string back. Could you help me, please?

Thanks in advance

Upvotes: 1

Views: 107

Answers (1)

user1981275

Reputation: 13372

Try this:

gsub("\\|.*", "", x)

This replaces the pipe and everything after it with an empty string.

You used \\.+, which matches one or more literal . characters rather than any character. Also, .+ is greedy: it matches as much as possible, so if a string contained more than one pipe it would run up to the last one; use the lazy .+? to stop at the first pipe instead. Finally, the pipe character | means "or" in a regular expression, so you need to escape it as \\| to match the actual character.
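A small sketch of those three points, in case it helps to see them side by side. The input y with two pipes is a made-up example, not from the question, and perl = TRUE is an addition here so that the lazy quantifier follows PCRE semantics:

```r
x <- "RPA4|RP1-117P191"

# "\\." matches only a literal dot, and "RPA4" contains none
has_dot <- grepl("\\.", "RPA4")

# Escaped as "\\|", the pattern matches the pipe character itself
has_pipe <- grepl("\\|", x)

# Hypothetical input with two pipes, to contrast greedy and lazy matching
y <- "A|B|C"
greedy <- sub("^(.+)\\|.*$", "\\1", y, perl = TRUE)   # runs up to the last pipe
lazy   <- sub("^(.+?)\\|.*$", "\\1", y, perl = TRUE)  # stops at the first pipe

print(c(greedy = greedy, lazy = lazy))
```

With a single pipe, as in x, both the greedy and the lazy version give the same result; the difference only shows up when there are several pipes.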

Another solution (closer to your attempt) could be:

gsub("^(.+?)\\|.+", "\\1", x)
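Since the question mentions a data frame, here is a minimal sketch of applying the first pattern to a whole column. The data frame df, its column name gene, and the extra rows are assumptions made up for illustration:

```r
# Hypothetical data frame; only the first value comes from the question
df <- data.frame(
  gene = c("RPA4|RP1-117P191", "TP53|XYZ", "NOPIPE"),
  stringsAsFactors = FALSE
)

# sub() and gsub() are vectorised, so the whole column is cleaned in one call.
# Values without a pipe are left unchanged.
df$gene <- sub("\\|.*", "", df$gene)

print(df$gene)
```

sub() is enough here because the pattern already consumes the rest of the string, so there can be at most one match per value; gsub() gives the same result.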

Upvotes: 2
