ankitpandey
ankitpandey

Reputation: 359

Regex to get values in SQL

I am trying to read values inside key values of SQL i.e. UPDATE, SET, FROM and WHERE. Using regex, i can get values until first line after where clause, but unable to get complete where clause. Please suggest.

Sample SQL from which value needs to be fetched.

 UPDATE dbname.tablename  
    SET
    alias1=T1.col1,
    alias2=T1.col2
    FROM datafabric_cfd_dea.T1 T1 
    WHERE
    tablename.keycol =T1.keycol
    AND tablename.col3='ABC'
    AND T1.col3='ABC'
    AND tablename.col3=T1.col3
    AND T1.col2='XYZ';

Regex:

UPDATE\s*(.*)\s*SET\s*(.*)\s*FROM\s*(.*)\s*WHERE\s*(.*)

Please suggest.

Upvotes: 0

Views: 212

Answers (1)

alecxe
alecxe

Reputation: 474221

I don't think regular expressions are the most appropriate tool for the job. As we are operating in the Python ecosystem we are lucky to have the luxury of using the power of the huge set of third-party packages on PyPI. There is a popular package called sqlparse which implements an SQL parser.

You may parse your SQL statement and analyze the parsed tokens:

In [1]: import sqlparse

In [2]: sql = """ UPDATE dbname.tablename  
   ...:     SET
   ...:     alias1=T1.col1,
   ...:     alias2=T1.col2
   ...:     FROM datafabric_cfd_dea.T1 T1 
   ...:     WHERE
   ...:     tablename.keycol =T1.keycol
   ...:     AND tablename.col3='ABC'
   ...:     AND T1.col3='ABC'
   ...:     AND tablename.col3=T1.col3
   ...:     AND T1.col2='XYZ';"""

In [3]: parsed_sql = sqlparse.parse(sql)

# get the where clause
In [4]: where_clause = next(token for token in parsed_sql[0].tokens 
                            if isinstance(token, sqlparse.sql.Where))

In [5]: where_clause.tokens
Out[5]: 
[<Keyword 'WHERE' at 0x10FCD1A78>,
 <Newline ' ' at 0x10FCD1668>,
 <Whitespace ' ' at 0x10FCD1B48>,
 <Comparison 'tablen...' at 0x10FCCDB50>,
 <Newline ' ' at 0x10FCD1AE0>,
 ...
 <Comparison 'T1.col...' at 0x10FCCDD50>,
 <Punctuation ';' at 0x10FCF5EF0>]

Upvotes: 1

Related Questions