Reputation: 17879
When I run
/(a)/g.exec('a a a ').length
I get
2
but I thought it should return
3
because there are 3 a's in the string, not 2!
Why is that?
I want to be able to search for all occurrences of a pattern in a string with RegEx and iterate over them.
FWIW: I'm using node.js
Upvotes: 40
Views: 54555
Reputation: 7677
Here's another solution:
let inputa = `"a" 'b' "c"`
let inputb = `"d" 'e' "f"`
let rx = /["'](.*?)["']/g
function *execAll(expr, input) {
while(true) {
const current = expr.exec(input)
if (!current) {
break
}
yield current
}
}
for (const current of execAll(rx, inputa)) {
console.log(current)
}
console.log(Array.from(execAll(rx, inputb)))
Upvotes: 0
Reputation: 1591
Expanding on @ajaykools's answer, you can just paste the following into a RegExpExtensions.ts
file
declare global {
interface RegExp {
execAll(string: string): RegExpExecArray[];
}
}
RegExp.prototype.execAll = function (
this: RegExp,
string: string,
): RegExpExecArray[] {
const results: RegExpExecArray[] = [];
let result: RegExpExecArray | null;
const regex = new RegExp(this, "g");
while ((result = regex.exec(string))) {
results.push(result);
}
return results;
};
export {};
Then you only need to add import "[...]/RegExpExtensions"
once atop your index.ts
and then you should be able to just do the following anywhere in your code:
const matches = /pattern/.execAll("some string")
Upvotes: 0
Reputation: 685
A while loop can help you:
x = 'a a a a';
y = new RegExp(/a/g);
while(null != (z=y.exec(x))) {
console.log(z); // output: object
console.log(z[0]); // output: "a"
}
If you add a counter, you also get the number of matches:
x = 'a a a a';
counter = 0;
y = new RegExp(/a/g);
while(null != (z=y.exec(x))) {
console.log(z); // output: object
console.log(z[0]); // output: "a"
counter++;
}
console.log(counter); // output: 4
This is quite safe: if nothing matches, the loop body never runs and counter stays 0.
The main intention is to show how a RegExp can be used in a loop to get every matched value from a string.
Upvotes: 28
Reputation: 2674
<string>.matchAll() into an Array
If you want to use a regular expression to match (with capture groups) as an array, and chain Array methods on the result, all you have to do is spread (...) <string>.matchAll() into a []:
const mapped = [...'a b c'.matchAll(/(?<letter>[abc])/g)]
console.log(mapped)
// Outputs: (3) [Array(2), Array(2), Array(2)]
Now I can do whatever on the destructured capture group(s):
const mapped = [...'a b c'.matchAll(/(?<letter>[abc])/g)]
const onlyAorC = mapped.filter(({groups:{letter}}) => ['a','c'].includes(letter))
console.log(onlyAorC)
// Outputs: (2) [Array(2), Array(2)]
Upvotes: 1
Reputation: 1
If you want to iterate through the matches of a regex without using while, you can use replace.
Example:
const foo = 'a a a';
foo.replace(/(a)/g, (...regex) => {
// Do something for each match found; regex holds the callback arguments (match, groups, offset, input)
console.log(regex);
});
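To make the callback arguments easier to work with, here is a minimal sketch that collects them into plain objects. The helper name collectWithReplace is hypothetical (it is not a built-in), and the sketch assumes the regex carries the g flag so the callback fires once per match.

```javascript
// Hypothetical helper (not a built-in): gather match details via the replace callback.
function collectWithReplace(str, regex) {
  const found = [];
  str.replace(regex, (match, ...rest) => {
    // rest is: capture groups..., offset, whole string[, named-groups object].
    // Captures are strings or undefined, so the first number is the offset.
    const offsetPos = rest.findIndex((value) => typeof value === 'number');
    found.push({
      match,
      captures: rest.slice(0, offsetPos),
      index: rest[offsetPos],
    });
    return match; // the return value only affects the discarded result string
  });
  return found;
}

const replaceHits = collectWithReplace('a a a', /(a)/g);
console.log(replaceHits.map((hit) => hit.index)); // [ 0, 2, 4 ]
```

Keep in mind that replace still builds a new string that is thrown away here, so this is more of a trick than a recommended pattern.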
Upvotes: 0
Reputation: 11619
Encapsulated into a utility function:
const regexExecAll = (str: string, regex: RegExp) => {
let lastMatch: RegExpExecArray | null;
const matches: RegExpExecArray[] = [];
while ((lastMatch = regex.exec(str))) {
matches.push(lastMatch);
if (!regex.global) break;
}
return matches;
};
Usage:
const matches = regexExecAll("a a a", /(a)/g);
console.log(matches);
Output:
[
[ 'a', 'a', index: 0, input: 'a a a', groups: undefined ],
[ 'a', 'a', index: 2, input: 'a a a', groups: undefined ],
[ 'a', 'a', index: 4, input: 'a a a', groups: undefined ]
]
Upvotes: 0
Reputation: 13698
There are several answers already, but they are unnecessarily complicated. An explicit comparison of the result against null is excessive, because exec always returns either an array or null.
let text = `How much wood would a woodchuck chuck if a woodchuck could chuck wood?`;
let re = /wood/g;
let lastMatch;
while (lastMatch = re.exec(text)) {
console.log(lastMatch);
console.log(re.lastIndex);
// Avoid infinite loop
if(!re.global) break;
}
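You can move the infinite loop guard into the loop condition. Note that this version prints nothing at all for a non-global regex, whereas the loop above still prints its single match.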
while (re.global && (lastMatch = re.exec(text))) {
console.log(lastMatch);
console.log(re.lastIndex);
}
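There is a second way such a loop can run forever: a pattern that can match the empty string (for example /a*/g) leaves lastIndex where it was, so exec keeps returning the same empty match. Below is a rough sketch of a loop that covers both cases. The name execAllSafe is made up for illustration, and resetting lastIndex at the start is my own choice, not something the loops above do.

```javascript
// Hypothetical helper: collect all exec() results without looping forever.
function execAllSafe(str, regex) {
  if (!regex.global) {
    // A non-global regex always restarts from the beginning: at most one match.
    const single = regex.exec(str);
    return single ? [single] : [];
  }
  regex.lastIndex = 0; // assumption: always scan from the beginning
  const out = [];
  let m;
  while ((m = regex.exec(str))) {
    out.push(m);
    // An empty match does not move lastIndex, so step past it manually.
    // (Simplification: with the u flag this should advance by a full code point.)
    if (m[0] === '') regex.lastIndex++;
  }
  return out;
}

const emptyFriendly = execAllSafe('baab', /a*/g);
console.log(emptyFriendly.map((m) => m[0])); // [ '', 'aa', '', '' ]
```

String.prototype.match and matchAll perform this advance internally, which is one reason to prefer them when you do not need manual control over lastIndex.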
Upvotes: 2
Reputation: 14328
For your example, .match()
is your best option. However, if you do need subgroups, you can make a generator function.
function* execAll(str, regex) {
if (!regex.global) {
// Without the global flag, exec() never advances, so the loop below would never end.
throw new TypeError('RegExp must have the global flag to retrieve multiple results.');
}
let match;
while (match = regex.exec(str)) {
yield match;
}
}
const matches = execAll('a abbbbb no match ab', /\b(a)(b+)?\b/g);
for (const match of matches) {
console.log(JSON.stringify(match));
let otherProps = {};
for (const [key, value] of Object.entries(match)) {
if (isNaN(Number(key))) {
otherProps[key] = value;
}
}
console.log(otherProps);
}
While most JS programmers consider polluting a prototype to be bad practice, you could also add this to RegExp.prototype
.
if (RegExp.prototype.hasOwnProperty('execAll')) {
console.error('RegExp prototype already includes a value for execAll. Not overwriting it.');
} else {
RegExp.prototype.execAll = function* execAll(str) {
if (!this.global) {
// Without the global flag, exec() never advances, so the loop below would never end.
throw new TypeError('RegExp must have the global flag to retrieve multiple results.');
}
let match;
while (match = this.exec(str)) {
yield match;
}
};
}
const matches = /\b(a)(b+)?\b/g.execAll('a abbbbb no match ab');
console.log(Array.from(matches));
Upvotes: 1
Reputation: 17654
regexp.exec(str) returns only the first match (or, when the pattern has capture groups as in re = /(a)/g;, the entire match followed by the captures), as mentioned in other answers:
const str = 'a a a a a a a a a a a a a';
const re = /a/g;
const result = re.exec(str);
console.log(result);
But with the g flag it also remembers the position just after the match in the regexp.lastIndex property.
The next call starts to search from regexp.lastIndex
and returns the next match.
If there are no more matches then regexp.exec
returns null and regexp.lastIndex
is set to 0.
const str = 'a a a';
const re = /a/g;
const a = re.exec(str);
console.log('match : ', a, ' lastIndex : ', re.lastIndex);
const b = re.exec(str);
console.log('match : ', b, ' lastIndex : ', re.lastIndex);
const c = re.exec(str);
console.log('match : ', c, ' lastIndex : ', re.lastIndex);
const d = re.exec(str);
console.log('match : ', d, ' lastIndex : ', re.lastIndex);
const e = re.exec(str);
console.log('match : ', e, ' lastIndex : ', re.lastIndex);
That's why you can use a while loop that stops when the match is null:
const str = 'a a a';
const re = /a/g;
let match;
while ((match = re.exec(str))) {
console.log(match, ' found at : ', match.index);
}
Upvotes: 2
Reputation: 12266
You are only matching the first a. The reason the length is two is that it is finding the first match and the parenthesized group part of the first match. In your case they are the same.
Consider this example.
var a = /b(a)/g.exec('ba ba ba ');
alert(a);
It outputs ba,a. The array length is still 2, but it is more obvious what is going on: "ba" is the full match, and a is the first parenthesized group.
The MDN documentation supports this - that only the first match and contained groups are returned. To find all matches, you'd use match() as stated by mVChr.
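To put the three options side by side, here is a small sketch using the same pattern and input as above. The variable names are only for illustration, and matchAll assumes a reasonably recent runtime (Node 12 or later).

```javascript
const sample = 'ba ba ba ';

// exec: one match per call, i.e. the full match plus its groups.
const firstOnly = /b(a)/g.exec(sample);

// match with the g flag: every full match, but no groups.
const fullMatches = sample.match(/b(a)/g);

// matchAll: every match, each with its groups and index.
const detailed = [...sample.matchAll(/b(a)/g)];

console.log(firstOnly.length);            // 2
console.log(fullMatches);                 // [ 'ba', 'ba', 'ba' ]
console.log(detailed.map((m) => m[1]));   // [ 'a', 'a', 'a' ]
console.log(detailed.map((m) => m.index)); // [ 0, 3, 6 ]
```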
Upvotes: 8
Reputation: 30293
exec()
is returning only the set of captures for the first match, not the set of matches as you expect. So what you're really seeing is $0
(the entire match, "a") and $1
(the first capture)--i.e. an array of length 2. exec()
meanwhile is designed so that you can call it again to get the captures for the next match. From MDN:
If your regular expression uses the "g" flag, you can use the exec method multiple times to find successive matches in the same string. When you do so, the search starts at the substring of str specified by the regular expression's lastIndex property (test will also advance the lastIndex property).
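The parenthetical about test is easy to overlook, so here is a short sketch of what it means in practice, assuming one regex object with the g flag is shared between calls (the variable names are mine):

```javascript
const shared = /a/g;
const haystack = 'a a a';

const seen = shared.test(haystack);      // true, and lastIndex moves past the first "a"
const afterTest = shared.lastIndex;

const next = shared.exec(haystack);      // continues from lastIndex: the second "a"

shared.lastIndex = 0;                    // rewind manually
const restarted = shared.exec(haystack); // the first "a" again

console.log(seen, afterTest);            // true 1
console.log(next.index, restarted.index); // 2 0
```

So if you call test before looping with exec on the same regex object, the loop silently skips the first match unless you reset lastIndex.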
Upvotes: 44
Reputation: 50205
You could use match
instead:
'a a a'.match(/(a)/g).length // outputs: 3
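One caveat: match returns null rather than an empty array when nothing matches, so calling .length directly can throw. A minimal sketch of a counting helper follows; countMatches is a name I made up, and it insists on the g flag because without it match returns the first match plus its groups instead of all matches.

```javascript
// Hypothetical helper: count occurrences without tripping over null.
function countMatches(str, regex) {
  if (!regex.global) {
    throw new TypeError('countMatches expects a regex with the g flag');
  }
  const found = str.match(regex);
  return found === null ? 0 : found.length;
}

console.log(countMatches('a a a', /(a)/g)); // 3
console.log(countMatches('b b b', /(a)/g)); // 0
```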
Upvotes: 35