Reputation: 1088
I have been provided with two example input strings:
"Russia has entered the WWII in [A] [B] after german invasion"
"Russia has entered the WWII in September 1941 after german invasion"
There can be any characters before, after and between the [A] and [B] in the first string and there could be additional placeholders e.g. [C] [D] etc. Each placeholder can only occur once.
How can I use regex to match "September" and "1941"?
I need to match each placeholder in a single regex, not multiple steps.
My thoughts on a solution
I'm guessing the solution will be something like:
'Match everything in string 2 after everything before [A] in string 1 and before everything after [A] in string 1'.
I figured out (.*(:?\[A\])) and ((:?\[A\]).*) to get the text before and after the [A] in the first string, but I can't figure out how to apply that to the second string. Perhaps I need to concatenate the two strings with some sort of delimiter and look at either side of the delimiter?
Upvotes: 1
Views: 2956
Reputation: 16089
If I understood your question correctly, you would like to match the fragments around [A] and [B] in the first string and use them to find the corresponding values in the second string. You can do this in two steps.
First, extract the fragments around [A] and [B] with the following regular expression: ^(.*?)\[A\](.*?)\[B\](.*?)$. The three captured groups (each pair of round brackets forms a group) are the text before [A], the text between [A] and [B], and the text after [B].
Second, build a new regular expression out of those three fragments. The implementation differs between programming languages. In JavaScript, the match array can be used to create the new regular expression like this: new RegExp(matches1[1] + '(.*?)' + matches1[2] + '(.*?)' + matches1[3]). If the fragments can contain regex metacharacters such as . or (, escape them before concatenating. Matching this new expression against the second string gives you the two values in its capture groups.
Here, the example is implemented in JavaScript:
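// Escape regex metacharacters so the fragments are matched literally.
function escapeRegExp(s) { return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); }
var text1 = "Russia has entered the WWII in [A] [B] after german invasion";
// Step 1: capture the three literal fragments around [A] and [B].
var regex1 = /^(.*?)\[A\](.*?)\[B\](.*?)$/;
var matches1 = text1.match(regex1);
var text2 = "Russia has entered the WWII in September 1941 after german invasion";
// Step 2: rebuild an anchored pattern with a capture group in place of each placeholder.
var regex2 = new RegExp('^' + escapeRegExp(matches1[1]) + '(.*?)' + escapeRegExp(matches1[2]) + '(.*?)' + escapeRegExp(matches1[3]) + '$');
var matches2 = text2.match(regex2);
console.log(matches2[1]); // September
console.log(matches2[2]); // 1941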
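The question also mentions further placeholders such as [C] and [D]. The same two-step idea could be generalized roughly as in the sketch below. It assumes every placeholder is a single uppercase letter in square brackets; the names escapeRegExp and extractPlaceholders are made up for this illustration and are not part of any library.

```javascript
// Escape characters that have a special meaning in a regular expression,
// so the literal parts of the template are matched as plain text.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: turn a template with placeholders like [A], [B], [C]
// into one anchored regex and return the matched values keyed by name.
// Assumption: placeholders are single uppercase letters in square brackets.
function extractPlaceholders(template, input) {
  const names = [];
  // split() with a capturing group keeps the captured placeholder names,
  // so even indexes are literal text and odd indexes are names.
  const parts = template.split(/\[([A-Z])\]/);
  let source = "^";
  parts.forEach((part, i) => {
    if (i % 2 === 0) {
      source += escapeRegExp(part); // literal fragment
    } else {
      names.push(part);             // placeholder name
      source += "(.*?)";            // lazy capture group for its value
    }
  });
  source += "$";

  const match = input.match(new RegExp(source));
  if (match === null) {
    return null; // the input does not fit the template
  }
  const result = {};
  names.forEach((name, i) => {
    result[name] = match[i + 1];
  });
  return result;
}

const values = extractPlaceholders(
  "Russia has entered the WWII in [A] [B] after german invasion",
  "Russia has entered the WWII in September 1941 after german invasion"
);
console.log(values.A, values.B); // September 1941
```

Because the groups are lazy, two placeholders separated only by a space split at the first space, so a value that itself contains a space would end up divided differently than intended.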
Upvotes: 1