techhighlands
techhighlands

Reputation: 972

Put retrieved data from MySQL query into DataFrame pandas by a for loop

I have one database with two tables, both have a column called barcode, the aim is to retrieve barcode from one table and search for the entries in the other where extra information of that certain barcode is stored. I would like to have bothe retrieved data to be saved in a DataFrame. The problem is when I want to insert the retrieved data into DataFrame from the second query, it stores only the last entry:

 import mysql.connector
 import pandas as pd
 cnx = mysql.connector(user,password,host,database)
 query_barcode = ("SELECT barcode FROM barcode_store")
 cursor = cnx.cursor()
 cursor.execute(query_barcode)
 data_barcode = cursor.fetchall()

Up to this point everything works smoothly, and here is the part with problem:

 query_info = ("SELECT product_code FROM product_info WHERE barcode=%s")
 for each_barcode in data_barcode:
      cursor.execute(query_info % each_barcode)
      pro_info = pd.DataFrame(cursor.fetchall())

pro_info contains only the last matching barcode information! While I want to retrieve all the information for each data_barcode match.

Upvotes: 1

Views: 3118

Answers (1)

Nhor
Nhor

Reputation: 3940

That's because you are consistently overriding existing pro_info with new data in each loop iteration. You should rather do something like:

query_info = ("SELECT product_code FROM product_info")
cursor.execute(query_info)
pro_info = pd.DataFrame(cursor.fetchall())

Making so many SELECTs is redundant since you can get all records in one SELECT and instantly insert them to your DataFrame.

@edit: However if you need to use the WHERE statement to fetch only specific products, you need to store records in a list until you insert them to DataFrame. So your code will eventually look like:

pro_list = []
query_info = ("SELECT product_code FROM product_info WHERE barcode=%s")
for each_barcode in data_barcode:
     cursor.execute(query_info % each_barcode)
     pro_list.append(cursor.fetchone())
pro_info = pd.DataFrame(pro_list)

Cheers!

Upvotes: 1

Related Questions