dotNes
dotNes

Reputation: 85

SQL efficiency rows or tables

I'm creating a database that holds yield values of electric engines. The yield values are stored in an Excel file which I have to transfer to the database. Each test for an engine has 42 rows (torque) and 42 columns (power in kw) with the values stored in these cells.

(kw)         1,0        1,2   ...(42x)
           --------   -------
(rpm)2000     76,2      77,0
     2100     76,7      77,6
      ...
     (42x)

Well I thought of creating a column for engine_id, test_id (each engine can have more than one test), and 42 columns for the corresponding yield values. For each test I have to add 42 rows for one single engine with the yield values. This doesn't seem efficient nor easy to implement to me.

If there are 42 records (rows) for 1 single engine, in a matter of time the database will hold up several thousands of rows and searching for a specific engine with the corresponding values will be an exhausting task.

If I make for each test for a specific engine a separate table, again after some time I would I have probably thousands of tables. Now what should I go for, a table with thousands of records or a table with 42 columns and 42 rows? Either way, I still have redundant records.

Upvotes: 1

Views: 122

Answers (2)

Mike Hanrahan
Mike Hanrahan

Reputation: 1192

A database is definitely the answer (searching through many millions, or hundred of millions of rows is pretty easy once you get the hang of SQL (the language for interacting with databases). I would recommend a table structure of

EngineId, TestId, TourqueId, PowerId, YieldValue

Which would have values...

Engine1, Test1, 2000, 1.0, 73.2

So only 5 columns. This will give you the flexibility to add more yield results in future should it be required (or even if its not, its just an easier schema anyway). You will need to learn SQL, however, to realise the power of the database over a spreadsheet. Also, there are many techniques for importing Excel data to SQL, so you should investigate that (Google it). If you find you are transferring all that data by hand then you are doing something wrong (not wrong really, but inefficient!).

Further to your comments, here is the exact schema with index (in MS SQL Server)

CREATE TABLE [dbo].[EngineTestResults](
    [EngineId] [varchar](50) NOT NULL,
    [TestId] [varchar](50) NOT NULL,
    [Tourque] [int] NOT NULL,
    [Power] [decimal](18, 4) NOT NULL,
    [Yield] [decimal](18, 4) NOT NULL,
 CONSTRAINT [PK_EngineTestResults] PRIMARY KEY CLUSTERED 
(
    [EngineId] ASC,
    [TestId] ASC,
    [Tourque] ASC,
    [Power] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

GO

SET ANSI_PADDING OFF
GO


/****** Object:  Index [IX_EngineTestResults]    Script Date: 01/14/2012 14:26:21 ******/
CREATE NONCLUSTERED INDEX [IX_EngineTestResults] ON [dbo].[EngineTestResults] 
(
    [EngineId] ASC,
    [TestId] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

So note that there is no incrementing primary key...the key is (EngineId, TestId, Torque, Power). To get the results for a particular engine you would run a query like the following:

     Select * from EngineTestResults where engineId = 'EngineABC' and TestId = 'TestA'

Note that I have added an index for that set of criteria.

Upvotes: 4

Gary Chambers
Gary Chambers

Reputation: 25838

The strength of a relational database is the ability to normalize data across multiple tables, so you could have one table for engines, one for tests and one for results. Something like the following:

CREATE TABLE tbl__engines (
  `engine_id` SMALLINT UNSIGNED NOT NULL,
  `name` VARCHAR(255) NOT NULL,
  PRIMARY KEY(engine_id)
);

CREATE TABLE tbl__tests (
  `test_id` INT UNSIGNED NOT NULL,
  `engine_id` SMALLINT UNSIGNED NOT NULL,
  PRIMARY KEY(test_id),
  FOREIGN KEY(engine_id) REFERENCES tbl__engines(engine_id)
);

CREATE TABLE tbl__test_result (
  `result_id` INT UNSIGNED NOT NULL,
  `test_id` INT UNSIGNED NOT NULL,
  `torque` INT NOT NULL,
  `power` DECIMAL(6,2) NOT NULL,
  `yield` DECIMAL(6,2) NOT NULL,
  FOREIGN KEY(test_id) REFERENCES tbl__tests(test_id)
);

Then you can simply perform a join across these three tables to return the required results. Something like:

SELECT
  *
FROM `tbl__engines` e
INNER JOIN `tbl__tests` t ON e.engine_id = t.engine_id
INNER JOIN `tbl__results` r ON r.test_id = t.test_id;

Upvotes: 1

Related Questions