Deepak

Reputation: 55

How to do a regular expression match and replacement in R only for strings that contain letters as well as digits?

I have a dataset with values like "00MOC00281" and also values like "000001". I would like to remove the leading zeroes only from values such as "00MOC00281", which should become "MOC00281", while "000001" should remain as it is.

I am trying to use gsub in R as shown below:

Command: gsub("^0{2}(*[A-Z])", "", "00MOC0012B")

Output : "OC0012B"

Any help appreciated.

Upvotes: 1

Views: 51

Answers (1)

acylam

Reputation: 18701

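We can use a positive lookahead. This regex matches the leading zeros only if they are followed by an uppercase letter ("M" in this example). Since lookarounds are zero-length assertions, the letter is not part of the match and is left in place. Lookaheads need the PCRE engine, hence perl = TRUE: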

sub("^0+(?=[A-Z])", "", c("00MOC0012B", "000001"), perl = TRUE)

# [1] "MOC0012B" "000001"
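If you would rather avoid lookarounds, one possible alternative is to capture the first letter and put it back with a backreference. This is only a minimal sketch: the vector x is made up for illustration, and the values "0A1" and "MOC00281" are not from the question. It assumes, like the regex above, that the character after the zeros is an uppercase ASCII letter.

```r
# Hypothetical sample values; only the first two come from the question.
x <- c("00MOC00281", "000001", "0A1", "MOC00281")

# Match the leading zeros plus the first uppercase letter, then write the
# captured letter back with \\1. No lookahead is used, so the default
# regex engine is enough and perl = TRUE is not required.
stripped <- sub("^0+([A-Z])", "\\1", x)

print(stripped)
# [1] "MOC00281" "000001"   "A1"       "MOC00281"
```

Strings made up only of digits never satisfy the [A-Z] part, so they are returned unchanged, as are strings that do not start with a zero.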

Upvotes: 3
