Reputation: 2066
My question should be simple for many of you Supouse I have the following SQL and I want to get the table name using regexp:
SELECT name, age FROM table1
Using this expression I can get that ok
Pattern p = Pattern.compile(".*FROM\\s+(.*?)($|\\s+[WHERE,JOIN,START\\s+WITH,ORDER\\s+BY,GROUP\\s+BY])", Pattern.CASE_INSENSITIVE);
Matcher result = p.matcher(pSql);
if (result.find()) {
lRetorno = result.group(1);
}
But, in case the table name contains the schema name (xyz.table1) my expression brings everything. My question is ... what do I need to modify on this query to only return me the table name without schema/owner?
Any help would be extremely apreciated Regards
Raphael Moita
Upvotes: 2
Views: 9660
Reputation: 24493
Rather than parsing with a regular expression, you might have a better time using an actual parser. See this related question Need java API to parse SQL statements which recommends the Zql library.
It's got examples of the queries it can parse that seem at least as robust as any regex you're going to write.
InputStream is = new ByteArrayInputStream(sqlQueryString.getBytes("UTF-8"));
ZqlParser parser = new ZqlParser(is);
ZQuery query = (ZQuery)parser.readStatement();
Vector tables = query.getFrom();
should do it for you. tables
is a Vector
of ZFromItem
s, and you can do getTable()
on one to get its name without the schema component.
Upvotes: 2
Reputation: 124235
Maybe try it this way
String data1="SELECT name, age FROM table1 whatever";
String data2="SELECT name, age FROM schema.table1 whatever";
Pattern p=Pattern.compile("from\\s+(?:\\w+\\.)*(\\w+)($|\\s+[WHERE,JOIN,START\\s+WITH,ORDER\\s+BY,GROUP\\s+BY])",Pattern.CASE_INSENSITIVE);
//test
Matcher m=p.matcher(data1);
while(m.find())
System.out.println(m.group(1));
m=p.matcher(data2);
while(m.find())
System.out.println(m.group(1));
output:
table1
table1
I just realized that part ($|\\s+[WHERE,JOIN,START\\s+WITH,ORDER\\s+BY,GROUP\\s+BY])
doesn't work as it should because in my input i placed "whatever" after table name and it was found anyway.
It doesn't work like you because you are using [WHERE,JOIN,START\\s+WITH,ORDER\\s+BY,GROUP\\s+BY]
instead of (WHERE|JOIN|START\\s+WITH|ORDER\\s+BY|GROUP\\s+BY)
. For example [abc]
is equal to (a|b|c) so it says regular expression engine to accept any character from that set, not a word abc
. Improve your pattern to something like
Pattern p=Pattern.compile("from\\s+(?:\\w+\\.)*(\\w+)(\\s*$|\\s+(WHERE|JOIN|START\\s+WITH|ORDER\\s+BY|GROUP\\s+BY))",Pattern.CASE_INSENSITIVE);
Upvotes: 5