chipsin

Reputation: 675

Replacing numbers from alphanumeric strings

I have a large dataset with two kinds of labels. The first has the form 'numeric_alphanumeric_alpha' and the second has the form 'alphanumeric_alpha'. I need to strip the numeric prefix from the first kind so that it matches the second. I know how to remove numbers from alphanumeric data (as below), but this would also remove numbers that I need.

gsub('[0-9]+', '', x)

Below is an example of the two different labels I encounter

c('12345_F24R2_ABC', 'r87R2_DEFG')

Below is the desired output

c('F24R2_ABC', 'r87R2_DEFG')

Upvotes: 1

Views: 73

Answers (2)

TarJae

Reputation: 79286

Your code, slightly modified:

^[0-9]* ..... matches zero or more digits at the start of the string

\\_ ..... matches the underscore (the escape is optional; a plain _ works as well)

gsub('^[0-9]*\\_', '', x)
[1] "F24R2_ABC"  "r87R2_DEFG"
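
One thing to watch, shown here as a small sketch: because `*` makes the digits optional, this pattern also strips a bare leading underscore, whereas `+` requires at least one digit. The label '_XYZ' is a made-up input for illustration and does not come from the question.

```r
# '_XYZ' is a hypothetical label added to show the difference
labels <- c('12345_F24R2_ABC', 'r87R2_DEFG', '_XYZ')

# Zero or more leading digits, then an underscore
star <- sub('^[0-9]*_', '', labels)

# One or more leading digits, then an underscore
plus <- sub('^[0-9]+_', '', labels)

print(star)  # "F24R2_ABC" "r87R2_DEFG" "XYZ"
print(plus)  # "F24R2_ABC" "r87R2_DEFG" "_XYZ"
```

For the labels in the question both variants give the same result; the choice only matters if some labels can start with an underscore.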

Upvotes: 1

benson23

Reputation: 19142

A simple regex can do it. ^ anchors the match to the start of the string, \\d matches any digit, and + means one or more occurrences.

gsub("^\\d+_", "", c('12345_F24R2_ABC', 'r87R2_DEFG'), perl = TRUE)

[1] "F24R2_ABC"  "r87R2_DEFG"
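
If this needs to be applied in several places, the same idea could be wrapped in a small helper. This is only a sketch: the name `strip_numeric_prefix` and the extra label 'ABC_123_X' are made up for illustration. Since the pattern is anchored with ^, it can match at most once per string, so `sub()` is sufficient here.

```r
# Hypothetical helper: drop a leading run of digits plus the underscore after it
strip_numeric_prefix <- function(labels) {
  sub('^[0-9]+_', '', labels)
}

# The two labels from the question, plus a made-up one with digits in the middle
x <- c('12345_F24R2_ABC', 'r87R2_DEFG', 'ABC_123_X')

result <- strip_numeric_prefix(x)
print(result)  # "F24R2_ABC" "r87R2_DEFG" "ABC_123_X"
```

Digits that are not at the very start of the label (such as the 24 in F24R2 or the 123 in the made-up label) are left alone.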

Upvotes: 2
