Py.rookie89
Py.rookie89

Reputation: 129

Using python to change and run SQL queries

I have the following code that creates a dataframe based on user input:

import pandas as pd
  from pandas import DataFrame

 publications = 
 pd.read_csv("C:/Users/nkambhal/data/pubmed_search_results_180730.csv", sep= 
 "|")

 publications['title'] = publications['title'].fillna('')

 search_term = input('Enter the term you are looking for: ')

 publications[['title','publication_id']] 
 [publications['title'].str.contains(search_term)]
 title_mask = 
 publications.title.str.lower().str.contains(search_term.lower())
 new = publications.loc[title_mask, ['title', 'publication_ID']]

Now I want to use the publication ID's in the new dataframe to run this SQL query:

SELECT
   author_profile
   pub_lst.* 
FROM
   pub_lst
JOIN
    author_profile
        ON pub_lst.author_id = author_profile.author_id
WHERE
    pub_lst.publication_id IN (67855,65559);

In the where statement, I want the IDs in the new dataframe to be there. So in the data frame there are the publication_ids ( 5, 6, 4) then I want them to be added to the query.

How can I add the appropriate publication_ids to the SQL query and run it through python and save it to a csv file?

Upvotes: 3

Views: 1760

Answers (1)

BoreBoar
BoreBoar

Reputation: 2739

To put data into a string, you can use the python's str.format function. You can read about it a little more here

For your query string, it should work out like so:

query_string = """
SELECT
   author_profile
   pub_lst.* 
FROM
   pub_lst
JOIN
    author_profile
        ON pub_lst.author_id = author_profile.author_id
WHERE
    pub_lst.publication_id IN {};
"""
print(query_string.format(str(tuple(new.publication_ID.values))))

As for the running the query, you will need to use a python module for whichever database you want to connect it. Such as PyMySQL for connecting to a MySQL database. https://pypi.org/project/PyMySQL/

Although, you could use an ORM such as peewee or SqlAlchemy to make your life a little easier while dealing with SQL databases. Pandas and SqlAlchemy mix really well. But Peewee is easier to get started with.

For creating a csv, you could use the inbuild python csv module, pandas or Peewee or SqlAlchemy in ascending order of difficulty.

Upvotes: 2

Related Questions