Assign several DataFrame columns to match SQL table

Question

I have several DataFrames that need to be written to SQL tables. The SQL tables all share one schema, but the DataFrames do not. I need an easy way to rename the DataFrame columns to match the SQL table. Everything I have seen on here manipulates one or two fields using df.to_sql. I need to be able to manipulate at least 10 fields as easily as I do with lists. Example tables are below.

list1
+-------+-------+-------+-------+  
| name  |hobby1 |hobby2 |hobby3 |  
+-------+-------+-------+-------+   
| kris  | ball  | swim  | dance |  
| james | eat   | sing  | sleep |  
| amy   | swim  | eat   | watch |  
+-------+-------+-------+-------+ 

df2
+----------+-----------+-----------+-----------+
| df2name  | df2hobby1 | df2hobby2 | df2hobby3 |
+----------+-----------+-----------+-----------+
| kris     | ball      | swim      | dance     |
| james    | eat       | sing      | sleep     |
| amy      | swim      | eat       | watch     |
+----------+-----------+-----------+-----------+

sql1
+-----------+------------+------------+------------+
| sql_name  | sql_hobby1 | sql_hobby2 | sql_hobby3 |
+-----------+------------+------------+------------+
| kris      | ball       | swim       | dance      |
| james     | eat        | sing       | sleep      |
| amy       | swim       | eat        | watch      |
+-----------+------------+------------+------------+

Sometimes I receive the data in a Python dict, which I can easily transfer using a kwargs function that works great. My function is below:

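def transfer_dict(**kwargs):
    # Start from blank defaults for every SQL column, then overwrite
    # with whatever values the caller supplies.
    transfer = {
        'sqlname': ' ',
        'sqlhobby1': ' ',
        'sqlhobby2': ' ',
        'sqlhobby3': ' ',
    }
    transfer.update(kwargs)
    return transfer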

I transfer easily by doing:

new_list.append(transfer_dict(sqlname=name, sqlhobby1=hobby1, sqlhobby2=hobby2, sqlhobby3=hobby3))

Can I use the same kwargs transfer function for DataFrame transfers to SQL? Or is there a better way?

Matt L. · Accepted Answer

The pandas.DataFrame.rename() method accepts a dict-like mapping of existing column names to new names. In many cases, the quickest solution to the problem you are describing (if I'm understanding you correctly) is to use a combination of rename() and drop() to change the source DataFrame so that it matches the SQL target, and then call to_sql() as you already do (but now, critically, all the column names match their intended targets). Note that rename() silently ignores mapping keys that are not present in the DataFrame, so the keys must match the source column names exactly. For example:

# Keys are the df2 column names exactly as they appear in the DataFrame.
sql_mappings = {'df2name': 'sql_name', 'df2hobby1': 'sql_hobby1', 'df2hobby2': 'sql_hobby2', 'df2hobby3': 'sql_hobby3'}

sql_columns = list(sql_mappings.values())

df2 = df2.rename(columns=sql_mappings)
df2 = df2.drop(columns=[col for col in df2 if col not in sql_columns ])

If you want to set things like the sql table name and execute to_sql dynamically, I can imagine a fairly straightforward wrapper function that does both tasks using this approach.
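A minimal sketch of what such a wrapper might look like, under a few assumptions: the function name frame_to_sql, the extra column in the sample data, and the in-memory SQLite connection are all hypothetical and only there for illustration. It raises on missing source columns because rename() would otherwise skip them silently.

```python
import sqlite3

import pandas as pd


def frame_to_sql(df, column_map, table_name, conn, if_exists="append"):
    """Rename df columns per column_map, keep only mapped columns, write to table_name.

    Hypothetical helper, not part of pandas.
    """
    # rename() ignores unknown keys, so check for them explicitly.
    missing = [src for src in column_map if src not in df.columns]
    if missing:
        raise KeyError(f"source columns not found: {missing}")
    # Rename, then select only the target columns in mapping order.
    out = df.rename(columns=column_map)[list(column_map.values())]
    out.to_sql(table_name, conn, if_exists=if_exists, index=False)
    return out


# Sample data mirroring df2 from the question, plus one unmapped column.
df2 = pd.DataFrame({
    "df2name": ["kris", "james", "amy"],
    "df2hobby1": ["ball", "eat", "swim"],
    "df2hobby2": ["swim", "sing", "eat"],
    "df2hobby3": ["dance", "sleep", "watch"],
    "extra": [1, 2, 3],  # hypothetical column with no SQL counterpart
})
sql_mappings = {
    "df2name": "sql_name",
    "df2hobby1": "sql_hobby1",
    "df2hobby2": "sql_hobby2",
    "df2hobby3": "sql_hobby3",
}

conn = sqlite3.connect(":memory:")  # stand-in for the real database
written = frame_to_sql(df2, sql_mappings, "sql1", conn)
rows = conn.execute(
    "SELECT sql_name, sql_hobby1 FROM sql1 ORDER BY rowid"
).fetchall()
```

With one mapping dict per source DataFrame, the same call can be reused for each of them; only the mapping and the table name change.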
