Reputation: 25
There are several parts to this question. I am working with sqlite3 in Python 2.7, but I am less concerned with the exact syntax, and more with the methods I need to use. I think the best way to ask this question is to describe my current database design, and what I am trying to accomplish. I am new to databases in general, so I apologize if I don't always use correct nomenclature.
I am modeling refrigeration systems (using Modelica--not really important to know), and I am using the database to manage input data, results data, and models used for that data.
My top parent table is Model
, which contains the columns:
id, name, version, date_created
My child table under Model
is called Design
. It is used to create a unique id for each combination of design input parameters and the model used. the columns it contains are:
id, model_id, date_created
I then have two child tables under Design
, one called Input
, and the other called Result
. We can just look at Input for now, since one example should be enough. The columns for input are:
id, value, design_id, parameter_id, component_id
parameter_id
and component_id
are foreign keys to their own tables.The Parameter
table has the following columns:
id, name, units
Some example rows for Parameter under
name are: length, width, speed, temperature, pressure (there are many dozens more). The Component table has the following columns:
id, name
Some example rows for Component
under name are: compressor, heat_exchanger, valve.
Ultimately, in my program I want to search the database for a specific design. I want to be able to search a specific design to be able to grab specific results for that design, or to know whether or not a model simulation with that design has already been run previously, to avoid re-running the same data point.
I also want to be able to grab all the parameters for a given design, and insert it into a class I have created in Python, which is then used to provide inputs to my models. In case it helps for solving the problem, the classes I have created are based on the components. So, for example, I have a compressor class, with attributes like compressor.speed, compressor.stroke, compressor.piston_size. Each of these attributes should have their own row in the Parameter table.
So, how would I query this database efficiently to find if there is a design that matches a long list (let's assume 100+) of parameters with specific values? Just as a side note, my friend helped me design this database. He knows databases, but not my application super well. It is possible that I designed it poorly for what I want to accomplish.
Here is a simple picture trying to map a certain combination of parameters with certain values to a design_id, where I have taken out component_id for simplicity:
Upvotes: 1
Views: 1180
Reputation: 107737
Simply join the necessary tables. Your schema properly reflects normalization (separating tables into logical groupings) and can scale for one-to-many relationships. Specifically, to answer your question --So, how would I query this database efficiently to find if there is a design that matches a long list (let's assume 100+) of parameters with specific values?-- consider below approaches:
Inner Join with Where Clause
For handful of parameters, use an inner join with a WHERE...IN()
clause. Below returns design fields joined by input and parameters tables, filtered for specific parameter names where you can have Python pass as parameterized values even iteratively in a loop:
SELECT d.id, d.model_id, d.date_created
FROM design d
INNER JOIN input i ON d.id = i.design_id
INNER JOIN parameters p ON p.id = i.parameter_id
WHERE p.name IN ('param1', 'param2', 'param3', 'param4', 'param5', ...)
Inner Join with Temp Table
Should values be over 100+ in a long list, consider a temp table that filters parameters table to specific parameter values:
# CREATE EMPTY TABLE (SAME STRUCTURE AS parameters)
sql = "CREATE TABLE tempparams AS SELECT id, name, units FROM parameters WHERE 0;"
cur.execute(sql)
db.commit()
# ITERATIVELY APPEND TO TEMP
for i in paramslist: # LIST OF 100+ ITEMS
sql = "INSERT INTO tempparams (id, name, units) \
SELECT p.id, p.name, p.units \
FROM parameters p \
WHERE p.name = ?;"
cur.execute(sql, i) # CURSOR OBJECT COMMAND PASSING PARAM
db.commit() # DB OBJECT COMMIT ACTION
Then, join main design and input tables with new temp table holding specific parameters:
SELECT d.id, d.model_id, d.date_created
FROM design d
INNER JOIN input i ON d.id = i.design_id
INNER JOIN tempparams t ON t.id = i.parameter_id
Same process can work with components table as well.
*Moved picture to question section
Upvotes: 2