Reputation: 693
I have a pd df that looks like this:
tab_name | col_name |
---|---|
CV_TAB1 | TAB1_COL1 |
CV_TAB1 | TAB1_COL2 |
CV_TAB1 | TAB1_COL3 |
CV_TAB2 | TAB2_COL1 |
CV_TAB2 | TAB2_COL2 |
what I need is a python script that generates as many SQL scripts as are the different tab_names WITHOUT the CV_ prefix (in the example, 2 scripts) that create a view for every table with associated columns WITHOUT the TABX_ prefix. So:
'CREATE WIEV TAB1 AS (
SELECT
TAB1_COL1 AS COL1,
TAB1_COL2 AS COL2,
TAB1_COL3 AS COL3
FROM CV_TAB1);'
And another one for TAB2.
Note that in the FROM clause the table name is WITH the CV_.
I've written the code that creates the script:
script = ""
i = 0
list_of_scripts = list()
for t in list_of_table_names:
if t['name'][:3] == 'CV_':
table_name = t['name'][3:] # Removing CV_ from table name
script += "DROP VIEW IF EXISTS " + str(table_name) + ";\n\n" + "CREATE VIEW "+ str(table_name) + " AS ( " + "\n" + "SELECT\n" # First part of SQL script
for c in list_of_columns_out_of_pd_df:
if c['columnName'][:len(table_name)] == str(table_name): # Check for the TABX_ prefix
column_name = c['columnName'][len(table_name)+1:]
if i == 0:
script += str(c['columnName']) + " AS " + str(column_name) + "\n"
i+=1
else:
script += "," + str(c['columnName']) + " AS " + str(column_name) + "\n"
else:
if i == 0:
script += str(c['columnName']) + " AS " + str(c['columnName']) + "\n"
i+=1
else:
script += "," + str(c['columnName']) + " AS " + str(c['columnName']) + "\n"
i = 0
script += "FROM " + str(t['name']) + ");"
list_of_scripts.append(script)
script = ""
what I am missing are the list_of_table_names and the list_of_columns_out_of_pd_df variables
Thanks!
Upvotes: 1
Views: 60
Reputation: 29635
Here is a way using groupby
.
l = []
for tb, cols in df.groupby('tab_name')['col_name']:
# aggregate the cols with the wanted format
_s = ',\n'.join([f'{col} as {col.split("_")[-1]}' for col in cols])
# create the full query
s = f'''CREATE VIEW {tb.split("_")[-1]} AS (\nSELECT\n{_s}\nFROM {tb});'''
l.append(s)
print(l[0])
# CREATE VIEW TAB1 AS (
# SELECT
# TAB1_COL1 as COL1,
# TAB1_COL2 as COL2,
# TAB1_COL3 as COL3
# FROM CV_TAB1);
print(l[1])
# CREATE VIEW TAB2 AS (
# SELECT
# TAB2_COL1 as COL1,
# TAB2_COL2 as COL2
# FROM CV_TAB2);
Upvotes: 3