user113156

Reputation: 7107

Extracting numbers from a column "attached" to characters

I have data of the form:

+PRODUCT NAME 6CT 144cl
+NEW PRODUCT NAME 72cl

The strings contain characters, spaces and numbers. I am only interested in the numbers 144 and 72, but when I use grep to extract just the numbers I also get the 6 from the first row.

How can I extract only 144cl and 72cl (that is, every number "attached" to the string cl, with no white space between the number and cl)?

Upvotes: 1

Views: 76

Answers (3)

Onyambu

Reputation: 79228

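It is advisable to check that the characters immediately after the digits are indeed cl. Note that (?>cl) in the patterns below is an atomic group, which consumes the cl; a true lookahead would be written (?=cl).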

Capturing all including the cl part:

x <- c("+PRODUCT NAME 6CT 144cl", "+NEW PRODUCT NAME 72cl")
sub('.*\\s((\\d+)(?>cl)).*', '\\1', x, perl = TRUE)
[1] "144cl" "72cl"

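If a line contains more than one value with cl, neither sub nor gsub with this pattern will return them all: the greedy leading .* means only the last one is kept. To get every match, use gregexpr with regmatches instead.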
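The multiple-match case can be sketched as follows. This is a minimal base R illustration, not part of the original answer; the input string y is a made-up example with two cl values on one line.

```r
# Hypothetical input: one line carrying two "cl" values.
y <- "+PRODUCT NAME 6CT 144cl 72cl"

# The sub() pattern from above keeps only the last value,
# because the leading .* is greedy.
last_only <- sub('.*\\s((\\d+)(?>cl)).*', '\\1', y, perl = TRUE)
last_only
# [1] "72cl"

# gregexpr() locates every match and regmatches() extracts them.
# (?=cl) is a lookahead: "cl" must follow the digits but is not returned.
all_cl <- regmatches(y, gregexpr("\\d+(?=cl)", y, perl = TRUE))[[1]]
all_cl
# [1] "144" "72"
```

regmatches returns a list with one element per input string, so for a vector of lines each element holds that line's matches.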

If you don't need the cl part but only the numbers, drop the outer capturing parentheses so that the group contains only the digits:

sub('.*\\s(\\d+)(?>cl).*', '\\1', x, perl = TRUE)
[1] "144" "72"

Upvotes: 1

Shenglin Chen

Reputation: 4554

Try this, which captures the digits that follow the last space in the string:

string <- "+PRODUCT NAME 6CT 144cl"
gsub('.* (\\d+).*$', '\\1', string)
[1] "144"

Upvotes: 1

Scarabee

Reputation: 5704

You can use stringr and a positive lookahead:

stringr::str_extract("+PRODUCT NAME 6CT 144cl", "\\d+(?=cl)")
# [1] "144"
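Since the question is about a column, the same lookahead can be applied to a whole vector. The sketch below uses base R rather than stringr to stay dependency-free; the data frame df, its column product, and the third row without a cl value are assumptions made up for illustration.

```r
# Hypothetical data frame; the column name `product` is an assumption.
df <- data.frame(
  product = c("+PRODUCT NAME 6CT 144cl",
              "+NEW PRODUCT NAME 72cl",
              "+NO VOLUME HERE"),
  stringsAsFactors = FALSE
)

# regexpr() returns -1 for rows without a match; regmatches() silently
# drops those rows, so pre-fill with NA and assign only where a match exists.
m <- regexpr("\\d+(?=cl)", df$product, perl = TRUE)
cl <- rep(NA_character_, nrow(df))
cl[m != -1] <- regmatches(df$product, m)

# Convert the extracted digits to numbers.
df$cl <- as.numeric(cl)
df$cl
# 144, 72, NA (NA where no digits are attached to "cl")
```

Pre-filling with NA keeps the result the same length as the column, which matters when the values are assigned back into the data frame.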

Upvotes: 2
