Reputation: 185
I am looking to return a specific group in the previous row via regex.
Suppose I have the following data, and the goal is to extract the value 90 based on the tag that appears in the line after it.
QTY+66:90:PCE
SCC+2
DTM+45:20200416:15
QTY+66:60:PCE
SCC+3
DTM+35:20210614:2
If I were to target the value 90, I'd have to look for the SCC+2 tag, and if I were to look for the value 60, it would be the SCC+3 tag.
I got this far in an attempt to return the value 90:
(?<=^QTY\+66:)(\d+)(.*\n.*SCC\+2.*)
but it seems convoluted, and I fail to extract only Group 1. Here is the link to regex101. I am using R for the actual application. Thanks for the help!
Upvotes: 2
Views: 196
Reputation: 627344
You can use
(?<=:)\d+(?=[^\d\r\n]*[\r\n]+.*SCC\+2)
See the regex demo. Details:
(?<=:) - a : must occur immediately to the left of the current location
\d+ - one or more digits
(?=[^\d\r\n]*[\r\n]+.*SCC\+2) - immediately to the right, there must be:
[^\d\r\n]* - any zero or more chars other than digits, CR and LF
[\r\n]+ - one or more CR or LF chars
.*SCC\+2 - any text on a line up to the rightmost occurrence of SCC+2.
In R, you can use
library(stringr)
str_extract(vec, "(?<=:)\\d+(?=[^\\d\r\n]*[\r\n]+.*SCC\\+2)")
And a couple of base R approaches with sub:
sub(".*?\\+\\d+:(\\d+)[^\r\n]*[\r\n]+[^\r\n]*SCC\\+2.*", "\\1", vec)
sub("(?s).*?\\+\\d+:(\\d+)(?-s).*\\R.*SCC\\+2(?s).*", "\\1", vec, perl=TRUE)
See regex 1 demo and regex 2 demo.
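On the "extract only Group 1" part of the question, one option in base R is to read the capture group directly with regexec() and regmatches(). The following is a minimal sketch rather than part of the approaches above: the simplified pattern and the variable names m and group1 are illustrative assumptions. One difference worth noting is that sub() returns its input unchanged when nothing matches, whereas this form gives NA.

```r
# Sample input, same as the data in the question.
vec <- "QTY+66:90:PCE\nSCC+2\nDTM+45:20200416:15\nQTY+66:60:PCE\nSCC+3\nDTM+35:20210614:2"

# regexec() reports the positions of the full match and of each capture group;
# regmatches() turns those positions into strings.
# Illustrative pattern: QTY+66:, the captured digits, the rest of that line,
# a line break (\R), then SCC+2 somewhere on the next line.
m <- regmatches(vec, regexec("QTY\\+66:(\\d+)[^\\r\\n]*\\R.*SCC\\+2", vec, perl = TRUE))

# m[[1]][1] is the whole match, m[[1]][2] is capture group 1.
group1 <- m[[1]][2]
print(group1)
# [1] "90"
```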
See the R demo online:
vec <- "QTY+66:90:PCE\nSCC+2\nDTM+45:20200416:15\nQTY+66:60:PCE\nSCC+3\nDTM+35:20210614:2"
sub(".*?\\+\\d+:(\\d+)[^\r\n]*[\r\n]+[^\r\n]*SCC\\+2.*", "\\1", vec)
sub("(?s).*?\\+\\d+:(\\d+)(?-s).*\\R.*SCC\\+2(?s).*", "\\1", vec, perl=TRUE)
library(stringr)
str_extract(vec, "(?<=:)\\d+(?=[^\\d\r\n]*[\r\n]+.*SCC\\+2)")
All yield [1] "90".
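Since the question also mentions fetching 60 via SCC+3, the lookaround pattern can be parameterized on the SCC number. This is only a sketch under stated assumptions: qty_before_scc is a hypothetical helper name, it uses base R with perl = TRUE instead of stringr, and the trailing \b is an addition so that SCC+2 does not also match a tag such as SCC+20.

```r
# Hypothetical helper (not from the original answer): returns the number that
# follows a colon on the line directly above the given SCC tag.
qty_before_scc <- function(x, scc) {
  # (?<=:)        a colon immediately to the left
  # \d+           the digits to return
  # (?=...)       rest of the line has no digits, then a line break,
  #               then SCC+<scc> on the next line, ending at a word boundary
  pattern <- sprintf("(?<=:)\\d+(?=[^\\d\\r\\n]*\\R.*SCC\\+%d\\b)", as.integer(scc))
  # regexpr() finds the first match; regmatches() extracts it
  # (character(0) is returned when there is no match).
  regmatches(x, regexpr(pattern, x, perl = TRUE))
}

vec <- "QTY+66:90:PCE\nSCC+2\nDTM+45:20200416:15\nQTY+66:60:PCE\nSCC+3\nDTM+35:20210614:2"
qty_before_scc(vec, 2)
# [1] "90"
qty_before_scc(vec, 3)
# [1] "60"
```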
Upvotes: 1