Running Dynamic Query From Python with input from CSV

Question

I've a CSV file with table names and primary keys for those tables in below format:

|  Table Name  |  Primary Key  | 
|    Table 1   |     Col1      |  
|    Table 1   |     Col2      |
|    Table 1   |     Col3      | 
|    Table 2   |     Col11     | 
|    Table 2   |     Col12     |

I want to run a sql query to validate PK constraint for every table. The query to do it would look like this:

select Col1, Col2, Col3 from Table1
group by Col1, Col2, Col3
having count(*)>1

But I've thousands of table in this file. How would I write and execute this query dynamically and write results into a flat file? I want to execute this using Python 3.

Attempt:

CSV:

My PKTest.py

def getColumns(filename):
    tables = {}

    with open(filename) as f:
        for line in f:
            line = line.strip()
            if 'Primary Key' in line:
                continue

            cols = line.split('|')
            table = cols[1].strip()
            col = cols[2].strip()

            if table in tables:
                tables[table].append(col)
            else:
                tables[table] = [col]
    return tables

def runSQL(table, columns):
    statement = 'select {0} from {1} group by {0} having count(*) > 1'.format(', '.join(columns), table.replace(' ',''))
    return statement

if __name__ == '__main__':
    tables = getColumuns('PKTest.csv')
    try:
        #cursor to connect

        for table in tables:
            sql = runSQL(table,tables[table])
            print(sql)
            cursor.execute(sql)
            for result in cursor:
                print(result)

    finally:
        cursor.close()
    ctx.close()

zedfoxus · Accepted Answer

You will have to improvise on this answer a bit since I do not have access to Oracle.

Let's assume there's a file called so.csv that contains the data as shown in your question.

Create a file called so.py like so. I'll add bits of code and some explanation. You can piece the file together or copy/paste it from here: https://rextester.com/JLQ73751.

At the top of the file, import your Oracle dependency:

# import cx_Oracle
# https://www.oracle.com/technetwork/articles/dsl/python-091105.html

Then, create a function that parses your so.csv and puts table and columns in a dictionary like this: {'Table 1': ['Col1', 'Col2', 'Col3'], 'Table 2': ['Col11', 'Col12']}

def get_tables_columns(filename):

    tables = {}

    with open(filename) as f:
        for line in f:
            line = line.strip()
            if 'Primary Key' in line:
                continue

            cols = line.split('|')

            table = cols[1].strip()
            col = cols[2].strip()

            if table in tables:
                tables[table].append(col)
            else:
                tables[table] = [col]

    return tables

Then, create a function that generates sql if it knows the table and list of columns:

def get_sql(table, columns):

    statement = 'select {0} from {1} group by {0} having count(*) > 1'.format(
            ', '.join(columns),
            table.replace(' ', '')
        )

    return statement

It's time to execute the functions:

if __name__ == '__main__':
    tables = get_tables_columns('so.csv')

    # here goes your code to connect with Oracle
    # con = cx_Oracle.connect('pythonhol/welcome@127.0.0.1/orcl')
    # cur = con.cursor()

    for table in tables:
        sql = get_sql(table, tables[table])
        print(sql)

        # here goes your sql statement execution            
        # cur.execute(sql)
        # for result in cur:
        #    print result

    # close your Oracle connection
    # con.close()

You can include your Oracle-related statements and run the python file.

Running Dynamic Query From Python with input from CSV

Answers (1)

Related Questions