Reputation: 7107
I have data in the form:
+PRODUCT NAME 6CT 144cl
+NEW PRODUCT NAME 72cl
which contains letters, spaces and numbers. I am only interested in the numbers 144 and 72. However, when I use grep and extract just the numbers, I also extract the number 6 from the first row.
How can I extract just 144cl and 72cl (or all numbers "attached" to the cl string, i.e. with no white space between 144 and cl)?
Upvotes: 1
Views: 76
Reputation: 79228
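It is advisable to check that the characters right after the digits are indeed cl. The patterns here do so with (?>cl), an atomic group that matches and consumes cl (a true lookahead would be written (?=cl)).
Capturing everything including the cl part: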
x <- c("+PRODUCT NAME 6CT 144cl", "+NEW PRODUCT NAME 72cl")
sub('.*\\s((\\d+)(?>cl)).*', '\\1', x, perl = TRUE)
[1] "144cl" "72cl"
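If a line contains more than one value with cl, this pattern returns only the last one, even with gsub instead of sub, because the leading and trailing .* consume the whole line in a single match. To get every value, extract the matches directly, e.g. with gregexpr and regmatches.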
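A minimal sketch of pulling out every cl value per line with base R's gregexpr and regmatches. The first and third strings are made-up inputs for illustration, and the pattern uses a lookahead, (?=cl), rather than the pattern from this answer:

```r
# Hypothetical input: one line with two "cl" values, one with a single value,
# and one with no "cl" value at all
x <- c("+MIXED PACK 6CT 144cl 72cl", "+NEW PRODUCT NAME 72cl", "+NO VOLUME 6CT")

# gregexpr() locates every run of digits that is directly followed by "cl";
# regmatches() then returns the matched text, one list element per input string
m <- regmatches(x, gregexpr("\\d+(?=cl)", x, perl = TRUE))
m

# Convert each element to numeric if the values are needed as numbers
lapply(m, as.numeric)
```

The result is a list because each line can yield a different number of values; a line without any match gives an empty character vector.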
If you don't need the cl part but only the numbers, the capturing parentheses should enclose only the digits:
sub('.*\\s(\\d+)(?>cl).*', '\\1', x, perl = TRUE)
[1] "144" "72"
Upvotes: 1
Reputation: 4554
Try this:
string<-"+PRODUCT NAME 6CT 144cl"
gsub('.* (\\d+).*$','\\1',string)
[1] "144"
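Note that this pattern picks the last number preceded by a space, whether or not cl follows it. A hedged variation that ties the capture to cl is sketched below; the reordered string is a hypothetical input used only to show the difference:

```r
x <- c("+PRODUCT NAME 6CT 144cl", "+NEW PRODUCT NAME 72cl")

# Requiring a literal "cl" right after the digits keeps "6CT" out of the match
with_cl <- sub(".* (\\d+)cl.*$", "\\1", x)
with_cl

# Hypothetical line where the pack count comes after the volume
reordered <- "+PRODUCT NAME 144cl 6CT"
original_pattern <- gsub(".* (\\d+).*$", "\\1", reordered)   # picks up the 6
amended_pattern  <- sub(".* (\\d+)cl.*$", "\\1", reordered)  # picks up the 144
```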
Upvotes: 1
Reputation: 5704
You can use stringr and a positive lookahead:
stringr::str_extract("+PRODUCT NAME 6CT 144cl", "\\d+(?=cl)")
# [1] "144"
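If you would rather avoid the extra package, the same lookahead can be used in base R. This is a sketch using regexpr with perl = TRUE and regmatches, applied to both strings from the question:

```r
x <- c("+PRODUCT NAME 6CT 144cl", "+NEW PRODUCT NAME 72cl")

# regexpr() finds the first run of digits directly followed by "cl" in each string;
# regmatches() returns the matched text
digits <- regmatches(x, regexpr("\\d+(?=cl)", x, perl = TRUE))
digits

# Convert to numeric if needed
volumes <- as.numeric(digits)
```

One difference to keep in mind: regmatches drops strings that have no match, whereas stringr::str_extract returns NA for them, so the result can be shorter than the input.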
Upvotes: 2