Reputation: 5307
I am learning more and more about Python, moving away from less efficient looping over lists of lists, and have been learning numpy/pandas.
Let's say I have a pandas DataFrame df, for example something that looks like:
| id | Year | Value |
|----|------|-------|
| 1  | 2015 | 1.00  |
| 1  | 2014 | 0.80  |
| 2  | 2015 | 1.10  |
| 3  | 2015 | 0.60  |
etc
Now how would I efficiently, in Python, query my DB based on the values in a specific column and then append the queried values as a new column? Right now my code looks something like this:
uIDs = set(df['id'])
uIDs = ",".join(str(x) for x in uIDs)
with dbconnection as con:
    data = con.execute("SELECT id, value2 FROM table WHERE id IN (" + uIDs + ")")
    for row_id, value2 in data.fetchall():
        # Use .loc -- chained indexing like df[df['id'] == i[0]]['newcol'] = ...
        # assigns to a temporary copy and leaves df unchanged
        df.loc[df['id'] == row_id, 'newcol'] = value2
Sorry for any errors in the code; I was typing it from memory to give an example.
As you can see, I am trying to pull all unique IDs from the pandas DataFrame, select the new values from the database, and then loop through them to append a new column to the DataFrame.
Is there a more effective/efficient/pythonic way to do this? If you are looking at not just 3 rows but 3,000 or 30,000, is this the best way to approach it?
Upvotes: 2
Views: 155
Reputation: 37938
Since you want an SQL-style left join, wrap the query result in a DataFrame first (merge cannot consume the raw list of tuples that fetchall() returns) and use
pd.merge(df, pd.DataFrame(data.fetchall(), columns=['id', 'value2']), on='id', how='left')
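A minimal end-to-end sketch of this approach, using a hard-coded list of tuples to stand in for data.fetchall() (the column name value2 is assumed from the question's query):

```python
import pandas as pd

# The frame from the question
df = pd.DataFrame({'id': [1, 1, 2, 3],
                   'Year': [2015, 2014, 2015, 2015],
                   'Value': [1.00, 0.80, 1.10, 0.60]})

# A DB-API cursor's fetchall() returns a list of tuples; wrap it
# in a DataFrame so merge can join on the 'id' column
fetched = [(1, 5.0), (2, 6.0)]  # stand-in for data.fetchall()
lookup = pd.DataFrame(fetched, columns=['id', 'value2'])

# how='left' keeps every row of df; ids missing from the query get NaN
merged = df.merge(lookup, on='id', how='left')
print(merged)
```

This replaces the row-by-row loop with a single vectorized join, so the cost no longer grows with one boolean-mask scan per fetched row.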
Upvotes: 2