Miles Abarr
Miles Abarr

Reputation: 25

Find parent with certain combination of child rows - SQLite with Python

There are several parts to this question. I am working with sqlite3 in Python 2.7, but I am less concerned with the exact syntax, and more with the methods I need to use. I think the best way to ask this question is to describe my current database design, and what I am trying to accomplish. I am new to databases in general, so I apologize if I don't always use correct nomenclature.

I am modeling refrigeration systems (using Modelica--not really important to know), and I am using the database to manage input data, results data, and models used for that data.

My top parent table is Model, which contains the columns:

id, name, version, date_created

My child table under Model is called Design. It is used to create a unique id for each combination of design input parameters and the model used. the columns it contains are:

id, model_id, date_created

I then have two child tables under Design, one called Input, and the other called Result. We can just look at Input for now, since one example should be enough. The columns for input are:

id, value, design_id, parameter_id, component_id

parameter_id and component_id are foreign keys to their own tables.The Parameter table has the following columns:

id, name, units 

Some example rows for Parameter under name are: length, width, speed, temperature, pressure (there are many dozens more). The Component table has the following columns:

id, name

Some example rows for Component under name are: compressor, heat_exchanger, valve.

Ultimately, in my program I want to search the database for a specific design. I want to be able to search a specific design to be able to grab specific results for that design, or to know whether or not a model simulation with that design has already been run previously, to avoid re-running the same data point.

I also want to be able to grab all the parameters for a given design, and insert it into a class I have created in Python, which is then used to provide inputs to my models. In case it helps for solving the problem, the classes I have created are based on the components. So, for example, I have a compressor class, with attributes like compressor.speed, compressor.stroke, compressor.piston_size. Each of these attributes should have their own row in the Parameter table.

So, how would I query this database efficiently to find if there is a design that matches a long list (let's assume 100+) of parameters with specific values? Just as a side note, my friend helped me design this database. He knows databases, but not my application super well. It is possible that I designed it poorly for what I want to accomplish.

Here is a simple picture trying to map a certain combination of parameters with certain values to a design_id, where I have taken out component_id for simplicity:

Picture of simplified tables

Upvotes: 1

Views: 1180

Answers (1)

Parfait
Parfait

Reputation: 107737

Simply join the necessary tables. Your schema properly reflects normalization (separating tables into logical groupings) and can scale for one-to-many relationships. Specifically, to answer your question --So, how would I query this database efficiently to find if there is a design that matches a long list (let's assume 100+) of parameters with specific values?-- consider below approaches:

Inner Join with Where Clause

For handful of parameters, use an inner join with a WHERE...IN() clause. Below returns design fields joined by input and parameters tables, filtered for specific parameter names where you can have Python pass as parameterized values even iteratively in a loop:

SELECT d.id, d.model_id, d.date_created
FROM design d
INNER JOIN input i ON d.id = i.design_id
INNER JOIN parameters p ON p.id = i.parameter_id
WHERE p.name IN ('param1', 'param2', 'param3', 'param4', 'param5', ...)

Inner Join with Temp Table

Should values be over 100+ in a long list, consider a temp table that filters parameters table to specific parameter values:

# CREATE EMPTY TABLE (SAME STRUCTURE AS parameters)
sql = "CREATE TABLE tempparams AS SELECT id, name, units FROM parameters WHERE 0;"
cur.execute(sql)
db.commit()

# ITERATIVELY APPEND TO TEMP
for i in paramslist:                                  # LIST OF 100+ ITEMS
    sql = "INSERT INTO tempparams (id, name, units) \
           SELECT p.id, p.name, p.units \
           FROM parameters p \
           WHERE p.name = ?;"
    cur.execute(sql, i)                               # CURSOR OBJECT COMMAND PASSING PARAM
    db.commit()                                       # DB OBJECT COMMIT ACTION 

Then, join main design and input tables with new temp table holding specific parameters:

SELECT d.id, d.model_id, d.date_created
FROM design d
INNER JOIN input i ON d.id = i.design_id
INNER JOIN tempparams t ON t.id = i.parameter_id 

Same process can work with components table as well.

*Moved picture to question section

Upvotes: 2

Related Questions