Reputation: 1056
I been struggling to find a Regex that help me match 3 different strings only if they aren't inside parentheses, but so far I have only managed to match it if it's right next to the parentheses, and in this specific situation it doesn't suit me.
To clarify I need to match the Strings "HAVING", "ORDER BY" and "GROUP BY" that aren't contained in any parentheses, no matter if the parentheses contains more than just the string.
In that case:
Select *
from some_table
group by something;
Should match, but:
Select *
from(
Select *
from some_other_table
group by something_else
)
or
Select this, and_this
from(
Select *
from some_other_table
having some_condition
)
shouldn't.
I'm not an expert in Javascript Regex, so any help you could give me would be greatly appreciated.
Upvotes: 0
Views: 287
Reputation: 85767
I assume you want to check whether a given SQL query contains HAVING
, ORDER BY
or GROUP BY
at the top level (not within a subquery).
This is complicated by the fact that both parens and words can be contained inside of string literals ('...'
), quoted identifiers ("..."
), and comments (-- ...
).
In the following code I assume that that's all that can "go wrong" (i.e. there are no other quoting constructs) and that no quoted characters are special (in particular, \
isn't treated any differently).
Idea:
And by "remove" I mean "replace by a space" because otherwise there is the possibility of new tokens being created where there were none before (e.g. hav(...)IN"asdf"g
would turn into havINg
if parenthesized/quoted parts were just replaced by nothing).
Implementation:
function contains_groupy_bits(sql) {
sql = sql.replace(/'[^']*'|"[^"]*"|--[^\n]*/g, ' ');
let tmp;
while ((tmp = sql.replace(/\([^()]*\)/g, ' ')) !== sql) {
sql = tmp;
}
return /\b(?:having|order\s+by|group\s+by)\b/i.test(sql);
}
const examples = [
`Select *
from some_table
group by something;`,
`Select *
from(
Select *
from some_other_table
group by something_else
)`,
`Select this, and_this
from(
Select *
from some_other_table
having some_condition
)`,
`select name, count(*) from things
where mark = '('
group by name -- )`,
];
for (const ex of examples) {
console.log("'" + ex + "': " + contains_groupy_bits(ex));
}
Upvotes: 1