Ashwin

Reputation: 141

R : Extract a Specific Number out of a String

I have a vector as below

data <- c("6X75ML","24X37.5ML (KKK)", "6X2X75ML", "168X5CL (UUU)")

Here I want to extract the first number before the "X" for each of the elements. When an element contains two "X"s, e.g. "6X2X75ML", the two numbers should be multiplied, giving 12 (6 multiplied by 2).

Expected output:

6, 24, 12, 168

Thank you for the help...

Upvotes: 1

Views: 120

Answers (4)

user31264

Reputation: 6727

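# position of the first "X" in each string
ind <- regexpr("X", data)
# number before the first "X"
val <- as.integer(substr(data, 1, ind - 1))
# remainder of each string after the first "X"
data2 <- substring(data, ind + 1)
# look for a second "<digits>X" group in the remainder
ind2 <- regexpr("[0-9]+X", data2)
# only a group that starts the remainder counts as a second multiplier
second <- ind2 == 1
if (any(second)) {
    val2 <- as.integer(substr(data2[second], 1, attr(ind2, "match.length")[second] - 1))
    val[second] <- val[second] * val2
}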

Upvotes: 1

akrun

Reputation: 886938

We can also use str_extract_all from stringr:

library(stringr)
sapply(str_extract_all(data, "\\d+(?=X)"), function(x) prod(as.numeric(x)))
#[1]   6  24  12 168

Upvotes: 1

lmo

Reputation: 38500

Here is a method using base R:

dataList <- strsplit(data, split="X")
sapply(dataList, function(x) Reduce("*", as.numeric(head(x, -1))))
[1]   6  24  12 168

strsplit breaks up each string along "X". The resulting list is fed to sapply, which then performs an operation on all but the final element of each vector in the list. The operation is to convert the elements to numeric and then multiply them. The final element is dropped using head(x, -1).

As @zheyuan-li comments, prod can fill in for Reduce and will probably be a bit faster:

sapply(dataList, function(x) prod(as.numeric(head(x, -1))))
[1]   6  24  12 168
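
To make the steps described above easier to follow, here is a minimal sketch that prints the intermediate values for the third element. The variable names (pieces, multipliers, counts) are only illustrative, and fixed = TRUE and vapply are swapped in here as optional choices; they are not part of the original answer.

```r
data <- c("6X75ML", "24X37.5ML (KKK)", "6X2X75ML", "168X5CL (UUU)")

# Step 1: split each string on "X"; fixed = TRUE treats "X" as a literal
pieces <- strsplit(data, split = "X", fixed = TRUE)
pieces[[3]]
# [1] "6"    "2"    "75ML"

# Step 2: drop the last piece (the size), keeping only the multipliers
multipliers <- head(pieces[[3]], -1)
multipliers
# [1] "6" "2"

# Step 3: convert to numeric and multiply
prod(as.numeric(multipliers))
# [1] 12

# All elements at once; vapply checks that each result is a single numeric
counts <- vapply(pieces, function(x) prod(as.numeric(head(x, -1))), numeric(1))
counts
# [1]   6  24  12 168
```

Using vapply instead of sapply only guarantees the shape of the result; the computed values are the same.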

Upvotes: 3

digEmAll

Reputation: 57210

Here's a possible solution using regular expressions:

data <- c("6X75ML","24X37.5ML (KKK)", "6X2X75ML", "168X5CL (UUU)")

# this regular expression finds any group of digits followed
# by an upper-case 'X' in each string and returns a list of the matches
tokens <- regmatches(data,gregexpr('[[:digit:]]+(?=X)',data,perl=TRUE))

res <- sapply(tokens,function(x)prod(as.numeric(x)))
> res
[1]   6  24  12 168
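
One edge case may be worth keeping in mind with the lookahead approach: a string containing no "X" yields zero matches, and prod(numeric(0)) is 1, so such a string would silently report 1. The sketch below wraps the same regular expression in a helper that returns NA instead. The name pack_count and the choice of NA are assumptions made for illustration, not something the question asks for.

```r
# pack_count() is a hypothetical helper name, not part of any package
pack_count <- function(x) {
  # digits immediately followed by an upper-case "X" (Perl-style lookahead)
  tokens <- regmatches(x, gregexpr("[[:digit:]]+(?=X)", x, perl = TRUE))
  vapply(tokens, function(tok) {
    # prod(numeric(0)) is 1, so return NA explicitly when nothing matched
    if (length(tok) == 0) NA_real_ else prod(as.numeric(tok))
  }, numeric(1))
}

pack_count(c("6X2X75ML", "75ML", "168X5CL (UUU)"))
# [1]  12  NA 168
```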

Upvotes: 4
