Reputation: 844
I have a CSV file with table names and the primary key columns for those tables, in the format below:

| Table Name | Primary Key |
| Table 1 | Col1 |
| Table 1 | Col2 |
| Table 1 | Col3 |
| Table 2 | Col11 |
| Table 2 | Col12 |
I want to run a SQL query to validate the PK constraint for every table. The query for one table would look like this:
select Col1, Col2, Col3 from Table1
group by Col1, Col2, Col3
having count(*)>1
But I have thousands of tables in this file. How would I build and execute this query dynamically for each table, and write the results to a flat file? I want to do this using Python 3.
My attempt so far (PKTest.py):
def getColumns(filename):
    tables = {}
    with open(filename) as f:
        for line in f:
            line = line.strip()
            if 'Primary Key' in line:
                continue
            cols = line.split('|')
            table = cols[1].strip()
            col = cols[2].strip()
            if table in tables:
                tables[table].append(col)
            else:
                tables[table] = [col]
    return tables
def runSQL(table, columns):
    statement = 'select {0} from {1} group by {0} having count(*) > 1'.format(', '.join(columns), table.replace(' ', ''))
    return statement
if __name__ == '__main__':
    tables = getColumns('PKTest.csv')
    try:
        # cursor to connect
        for table in tables:
            sql = runSQL(table, tables[table])
            print(sql)
            cursor.execute(sql)
            for result in cursor:
                print(result)
    finally:
        cursor.close()
        ctx.close()
Upvotes: 3
Views: 1250
Reputation: 37089
You will have to improvise on this answer a bit since I do not have access to Oracle.
Let's assume there's a file called so.csv that contains the data as shown in your question.
Create a file called so.py like so. I'll add bits of code and some explanation. You can piece the file together or copy/paste it from here: https://rextester.com/JLQ73751.
At the top of the file, import your Oracle dependency:
# import cx_Oracle
# https://www.oracle.com/technetwork/articles/dsl/python-091105.html
Then, create a function that parses your so.csv and puts tables and columns in a dictionary like this: {'Table 1': ['Col1', 'Col2', 'Col3'], 'Table 2': ['Col11', 'Col12']}
def get_tables_columns(filename):
    tables = {}
    with open(filename) as f:
        for line in f:
            line = line.strip()
            if 'Primary Key' in line:
                continue
            cols = line.split('|')
            table = cols[1].strip()
            col = cols[2].strip()
            if table in tables:
                tables[table].append(col)
            else:
                tables[table] = [col]
    return tables
Then, create a function that generates the SQL given a table name and its list of key columns:
def get_sql(table, columns):
    statement = 'select {0} from {1} group by {0} having count(*) > 1'.format(
        ', '.join(columns),
        table.replace(' ', '')
    )
    return statement
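For example, calling get_sql for Table 1 with its three key columns yields exactly the query from the question (the function is repeated here so the snippet runs on its own):

```python
def get_sql(table, columns):
    # Same generator as above: select/group by the PK columns,
    # keep only groups that occur more than once (PK violations).
    statement = 'select {0} from {1} group by {0} having count(*) > 1'.format(
        ', '.join(columns),
        table.replace(' ', '')
    )
    return statement

print(get_sql('Table 1', ['Col1', 'Col2', 'Col3']))
# select Col1, Col2, Col3 from Table1 group by Col1, Col2, Col3 having count(*) > 1
```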
It's time to execute the functions:
if __name__ == '__main__':
    tables = get_tables_columns('so.csv')

    # here goes your code to connect with Oracle
    # con = cx_Oracle.connect('pythonhol/[email protected]/orcl')
    # cur = con.cursor()

    for table in tables:
        sql = get_sql(table, tables[table])
        print(sql)

        # here goes your sql statement execution
        # cur.execute(sql)
        # for result in cur:
        #     print(result)

    # close your Oracle connection
    # con.close()
You can uncomment the Oracle-related statements and run the Python file.
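The question also asks to write the results to a flat file, which the loop above only prints. A minimal sketch using the standard csv module (write_violations and the fake result rows are hypothetical stand-ins; with Oracle you would pass the rows fetched from cur after each execute):

```python
import csv
import io

def write_violations(rows_by_table, out):
    # Write one CSV line per duplicate-key row, prefixed with the table name,
    # so violations from all tables land in a single flat file.
    writer = csv.writer(out)
    for table, rows in rows_by_table.items():
        for row in rows:
            writer.writerow([table] + list(row))

# Stand-in for cursor results; in practice collect cur.fetchall() per table.
fake_results = {'Table1': [('a', 'b', 'c')], 'Table2': []}
buf = io.StringIO()  # use open('pk_violations.csv', 'w', newline='') for a real file
write_violations(fake_results, buf)
print(buf.getvalue())
# Table1,a,b,c
```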
Upvotes: 2