Reputation:
I think the title says it all. I'm trying to get groups and concatenate them together.
I have this text:
GPX 10.802.123/3843 1 - IDENTIFIER 48
And I want this output:
IDENTIFIER 10.802.123/3843-48
To be explicit: I want to capture one group before this word and one after it, then concatenate both, using only regex. Is this possible?
I can already extract the 48 like this:
var text = 'GPX 10.802.123/3843 1 - IDENTIFIER 48';
var reg = new RegExp('IDENTIFIER' + '.*?(\\d\\S*)', 'i');
var match = reg.exec(text);
Output:
48
Can it be done?
I'm offering 200 points.
Upvotes: 4
Views: 125
Reputation: 89547
You can use split too:
var text = 'GPX 10.802.123/3843 1 - IDENTIFIER 48';
var parts = text.split(/\s+/);
if (parts[4] == 'IDENTIFIER') {
  var result = parts[4] + ' ' + parts[1] + '-' + parts[5];
  console.log(result);
}
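The split approach above hard-codes the keyword at index 4. A minimal sketch that looks the keyword up instead is shown here; `joinAroundKeyword` is a hypothetical helper name, and it still assumes the number to keep is always the second token.

```javascript
// Hypothetical helper (not part of the answer above): find the keyword among
// the whitespace-separated tokens instead of assuming it sits at index 4.
// Assumption: the number to keep is always the second token (index 1).
function joinAroundKeyword(text, keyword) {
  var parts = text.trim().split(/\s+/);
  var i = parts.indexOf(keyword);
  // The keyword must come after the number and be followed by one more token.
  if (i < 2 || i + 1 >= parts.length) {
    return null;
  }
  return parts[i] + ' ' + parts[1] + '-' + parts[i + 1];
}

console.log(joinAroundKeyword('GPX 10.802.123/3843 1 - IDENTIFIER 48', 'IDENTIFIER'));
// prints "IDENTIFIER 10.802.123/3843-48"
```

Note that `indexOf` compares tokens exactly, so unlike the regex answers this sketch is case-sensitive.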
Upvotes: 0
Reputation: 67968
^\s*\S+\s*\b(\d+(?:[./]\d+)+)\b.*?-.*?\b(\S+)\b\s*(\d+)\s*$
You can try this. Replace with $2 $1-$3. See the demo:
https://regex101.com/r/sS2dM8/38
var re = /^\s*\S+\s*\b(\d+(?:[.\/]\d+)+)\b.*?-.*?\b(\S+)\b\s*(\d+)\s*$/gm;
var str = 'GPX 10.802.123/3843 1 - IDENTIFIER 48';
var subst = '$2 $1-$3';
var result = str.replace(re, subst);
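Because the pattern carries the g and m flags, it can also rewrite several lines in one call. The sketch below illustrates that; the second and third input lines are made-up sample data, not taken from the question. Lines that do not match are left unchanged.

```javascript
var re = /^\s*\S+\s*\b(\d+(?:[.\/]\d+)+)\b.*?-.*?\b(\S+)\b\s*(\d+)\s*$/gm;

// Only the first line comes from the question; the other two are invented
// here to show a second match and a line that does not match.
var lines = [
  'GPX 10.802.123/3843 1 - IDENTIFIER 48',
  'ABC 1.2/3 7 - IDENTIFIER 9',
  'no match here'
].join('\n');

// With the m flag, ^ and $ anchor at each line, so every matching line
// is replaced independently.
var rewritten = lines.replace(re, '$2 $1-$3');
console.log(rewritten);
// IDENTIFIER 10.802.123/3843-48
// IDENTIFIER 1.2/3-9
// no match here
```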
Upvotes: 1
Reputation: 12239
You must precisely define the groups that you want to extract before and after the word. If you define the group before the word as four or more non-whitespace characters, and the group after the word as one or more non-whitespace characters, you can use the following regular expression.
var re = new RegExp('(\\S{4,})\\s+(?:\\S{1,3}\\s+)*?' + word + '.*?(\\S+)', 'i');
var groups = re.exec(text);
if (groups !== null) {
  var result = groups[1] + groups[2];
}
Let me break down the regular expression. Note that we have to escape the backslashes because we're writing a regular expression inside a string.
(\\S{4,}) captures a group of four or more non-whitespace characters
\\s+ matches one or more whitespace characters
(?: indicates the start of a non-capturing group
\\S{1,3} matches one to three non-whitespace characters
\\s+ matches one or more whitespace characters
)*? makes the non-capturing group match zero or more times, as few times as possible
word matches whatever was in the variable word when the regular expression was compiled
.*? matches any character zero or more times, as few times as possible
(\\S+) captures one or more non-whitespace characters
'i' flag makes this a case-insensitive regular expression
Observe that our use of the ? modifier allows us to capture the nearest groups before and after the word.
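Since word is concatenated straight into the pattern, any regex metacharacters in it (such as . or +) would be interpreted rather than matched literally. Here is a minimal sketch of one way to guard against that; `escapeRegExp` and `nearestGroups` are hypothetical helper names introduced for illustration, not part of the original answer.

```javascript
// Hypothetical helper: escape regex metacharacters so the search word is
// matched literally (e.g. "ID." should not match "IDX").
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Same pattern as above, but with the word escaped first.
// Returns [groupBefore, groupAfter], or null when there is no match.
function nearestGroups(word, text) {
  var re = new RegExp(
    '(\\S{4,})\\s+(?:\\S{1,3}\\s+)*?' + escapeRegExp(word) + '.*?(\\S+)', 'i');
  var groups = re.exec(text);
  return groups === null ? null : [groups[1], groups[2]];
}

console.log(nearestGroups('IDENTIFIER', 'GPX 10.802.123/3843 1 - IDENTIFIER 48'));
// prints [ '10.802.123/3843', '48' ]
```

Returning the two groups separately leaves it to the caller to decide how to join them, for example with a '-' in between.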
You can match the regular expression globally in the text by adding the g flag. The snippet below demonstrates how to extract all matches.
function forward_and_backward(word, text) {
  var re = new RegExp('(\\S{4,})\\s+(?:\\S{1,3}\\s+)*?' + word + '.*?(\\S+)', 'ig');
  // Find all matches and make an array of results.
  var results = [];
  while (true) {
    var groups = re.exec(text);
    if (groups === null) {
      return results;
    }
    var result = groups[1] + groups[2];
    results.push(result);
  }
}
var sampleText = " GPX 10.802.123/3843- 1 -- IDENTIFIER 48 A BC 444.2345.1.1/99x 28 - - Identifier 580 X Y Z 9.22.16.1043/73+ 0 *** identifier 6800";
var results = forward_and_backward('IDENTIFIER', sampleText);
for (var i = 0; i < results.length; ++i) {
  document.write('result ' + i + ': "' + results[i] + '"<br><br>');
}
body {
font-family: monospace;
}
Upvotes: 3
Reputation: 174696
This is possible with the replace function.
var s = 'GPX 10.802.123/3843 1 - IDENTIFIER 48';
var result = s.replace(/.*?(\S+)\s+\d+\s*-\s*(IDENTIFIER)\s*(\d+).*/, "$2 $1-$3");
//=> "IDENTIFIER 10.802.123/3843-48"
Upvotes: 1
Reputation: 784958
You can do:
var text = 'GPX 10.802.123/3843 1 - IDENTIFIER 48';
var match = /GPX\s+(.+?) \d .*?(IDENTIFIER).*?(\d\S*)/i.exec(text);
var output = match[2] + ' ' + match[1] + '-' + match[3];
//=> "IDENTIFIER 10.802.123/3843-48"
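One caveat with the snippet above: exec returns null when the text does not match, so indexing into match would throw. A minimal sketch of a guarded version follows; the function name `rebuild` is made up for illustration.

```javascript
// Hypothetical wrapper around the pattern above that returns null
// instead of throwing when the input does not match.
function rebuild(text) {
  var match = /GPX\s+(.+?) \d .*?(IDENTIFIER).*?(\d\S*)/i.exec(text);
  if (match === null) {
    return null;
  }
  return match[2] + ' ' + match[1] + '-' + match[3];
}

console.log(rebuild('GPX 10.802.123/3843 1 - IDENTIFIER 48'));
//=> "IDENTIFIER 10.802.123/3843-48"
console.log(rebuild('no keyword here'));
//=> null
```

Because of the i flag, the keyword is returned in whatever case it has in the input, so 'identifier' stays lowercase in the output.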
Upvotes: 3