Petter Magnusson
Petter Magnusson

Reputation: 309

How much of this task can/should be done in SQL?

I have a database where I would like to construct a result that match my indata.

Explanation of examples: "Indata" is a row with criteria for the filter in step #1, see below. "Results after filter #1" is the rows left after step #1 "Results after filter #2" is the rows left after step #2 "Print for indata 1" is the end result, presenting the contect of field OutData after the second filtering, along with the combined content of field "Warn"

Step #1. Filter the Outdata table to only keep records matching Indata. Star * is matching anything. Each row must match the indata row. See examples.

Step #1b. Indata Depl is not matched directly but checked to be in the range OutData Depl-DeplOffMin>Indata Depl

Step #2. Filter Outdata table again, to only keep unique rows for each Pos, select the row with lowest Mismatch when duplicates exist. See examples.

Step #3. Create all possible combinations of the Outdata Field, with unique Pos. Like LOOP: OutData for first Pos=10 + OutData for first Pos=20 + OutData for first Pos=30 + OutData for first Pos=30, see "Print" lines in the examples.

Step #4. Sort the output combinations ie the "Print" lines, to have lowest mismatch on the top. See examples.

Would you try to do this as one big SQL query, or build temporary tables etc mixing it with PHP doing more of the work?

(The real database is perhaps 1000rows and 30 fields in MySQL. The Indata is sent to the server via Ajax and the server code in PHP generates the Print data and sends it back.)

"Outdata" table, example:

Pos   OutData     Warn  Mismatch    Producer    Depl        DeplOffMax        DeplOffMin  Axis        Connection
10    S                 0           S           *           *                 *                       *           *
20    24                0           S           24          *                 *           *           *
20    24          +-5   5           S           24          5                 -5          *           *
20    24          +-10  10          S           24          10                -10         *           *
20    48                0           S           48          *                 *           *           *
30    AA                0           S           *           *                 *           A           *
30    AB                0           S           *           *                 *           B           *
30    AC                0           S           *           *                 *           C           *
30    AA          B-AA  5           S           *           *                 *           B
40    C1                0           S           *           *                 *           *           C1
40    C1                1           S           *           *                 *           B           *
40    C1                2           S           24          10                -10         *           *
40    C2                0           S           *           *                 *           *           C2





Example 1:
Indat 1:                           S           24                                        A           C2

Result for indata 1 after filter #1:
10    S                 0           S           *           *                 *                       *           *
20    24                0           S           24          *                 *           *           *
20    24          +-5   5           S           24          5                 -5          *           *
20    24          +-10  10          S           24          10                -10         *           *
30    AA                0           S           *           *                 *           A           *
40    C1                2           S           24          10                -10         *           *
40    C2                0           S           *           *                 *           *           C2

Result for indata 1 after filter #2:
10    S                 0           S           *           *                 *                       *           *
20    24                0           S           24          *                 *           *           *
30    AA                0           S           *           *                 *           A           *
40    C1                2           S           24          10                -10         *           *
40    C2                0           S           *           *                 *           *           C2

Print for indata 1:
Mismatch 0:       S 24 AA C2  Warning -
Mismatch 2:       S 24 AA C1  Warning -



Example 2:
Indata 2:                           S           33                                        B           C2

Result for indata 2 after filter #1:
10    S                 0           S           *           *                 *                       *           *
20    24          +-10  10          S           24          10                -10         *           *
30    AB                0           S           *           *                 *           B           *
30    AA          B-AA  5           S           *           *                 *           B
40    C1                1           S           *           *                 *           B           *
40    C1                2           S           24          10                -10         *           *
40    C2                0           S           *           *                 *           *           C2

Result for indata 2 after filter #2:
10    S                 0           S           *           *                 *                       *           *
20    24          +-10  10          S           24          10                -10         *           *
30    AB                0           S           *           *                 *           B           *
30    AA          B-AA  5           S           *           *                 *           B
40    C1                1           S           *           *                 *           B           *
40    C2                0           S           *           *                 *           *           C2

Print for indata 2:
Mismatch 10:      S 24 AB C2  Warning: +-10
Mismatch 11:      S 24 AB C1  Warning: +-10
Mismatch 15:      S 24 AA C2  Warning: +-10 B-AA
Mismatch 16:      S 24 AA C1  Warning: +-10 B-AA





Example 3:
Indata 3:                           S           28                                        B           C1

Result for indata 3 after filter #1:
10    S                 0           S           *           *                 *                 *           *
20    24          +-5   5           S           24          5                 -5          *           *
20    24          +-10  10          S           24          10                -10         *           *
30    AB                0           S           *           *                 *           B           *
30    AA          B-AA  5           S           *           *                 *           B
40    C1                0           S           *           *                 *           *           C1
40    C1                1           S           *           *                 *           B           *
40    C1                2           S           24          10                -10         *           *

Result for indata 3 after filter #2:
10    S                 0           S           *           *                 *                 *           *
20    24          +-5   5           S           24          5                 -5          *           *
30    AB                0           S           *           *                 *           B           *
30    AA          B-AA  5           S           *           *                 *           B
40    C1                0           S           *           *                 *           *           C1

Print for indata 3:
Mismatch 5:       S 24 AB C1  Warning: +-5
Mismatch 10:      S 24 AA C1  Warning: +-5 B-AA

Upvotes: 2

Views: 206

Answers (4)

D'Arcy Rittich
D'Arcy Rittich

Reputation: 171491

I don't fully understand the business logic, but for such a tiny amount of data I would not bother going through the pain of trying to implement in SQL. It is likely to be faster just pulling all the data over and using PHP to apply your filters.

Upvotes: 1

Elitmiar
Elitmiar

Reputation: 36879

One thing I've learned. Anything that can be calculated should not be saved in the database

Upvotes: 0

S.Lott
S.Lott

Reputation: 391952

First, the question is too long.

Second, the rules (#1, #1b, #2, #3, and #4) are confusing and need clarification.

However, in spite of that, this is my advice.

If it's not obvious how to do it in SQL, don't.

For something like this, with "rules" for matching ("star is matching anything", "select the row with lowest Mismatch when duplicates exist", etc.) you have two choices.

  1. Simplify the processing steps so that they can be implemented in SQL. This is generally a good thing. Your current statement is confused and hard to follow. Time spent simplifying would be time well spent.

  2. Use an ordinary programming language. When you reach things which are hard to do in SQL, don't "force" them. Get the data from SQL and process it in PHP.

SQL -- generally -- is slow. It's best used when you have vast volumes of data that won't fit in memory and you have transactional updates going on. 1000 rows and 30 fields is a trivial amount of data that easily fits in memory. If you're given a batch of data to analyze (with no updates going on) SQL's not helping much.

In a very busy website -- using PHP -- you may have problems with lots of concurrent users. In which case, get it out of PHP so it's not running under Apache. Until such time as Apache runs out of memory, just do it in PHP.

Upvotes: 2

symcbean
symcbean

Reputation: 48367

I have no idea what step 3 means, but all the other steps can be done in SQL, also, they probably should be done in SQL where its a lot more efficient.

I assume from your tags that you're using a MySQL database (it would have been helpful to know).

My initial thots would be somethnig along the lines of:

 SELECT outdata.Pos,
     outdata.OutData,
     outdata.Warn,
     MIN(outdata.Mismatch),    
     outdata.Producer,
     outdata.Depl,
     outdata.DeplOffMax,        
     outdata.DeplOffMin,
     outdata.Axis
     outdata.Connection
 FROM outdata, indata
 WHERE
   (//some nasty type juggling here
     outdata.depl='*'
     OR outdata.deploffmin='*'
     OR indata.depl='*'
     OR (
        outdata.depl<>'*'
        AND outdata.deploffmin<>'*'
        AND indata.depl<>'*'
        AND outdata.depl-outdata.deploffmin>indata.depl
     )
   )  
 AND ( outdata.outdata=indata.outdata
   OR outdata.outdata='*'
   OR indata.outdata='*' )
 AND ( outdata.connection=indata.connection
   OR outdata.connection = '*'
   OR indata.connection='*' )
 AND (outdata.axis=indata.axis
   OR outdata.axis='*' 
   OR indata.axis='*' )
 ....repeat for all the fields you want to match       
 GROUP BY outdata.Pos,
     outdata.OutData,
     outdata.Warn,    
     outdata.Producer,
     outdata.Depl,
     outdata.DeplOffMax,        
     outdata.DeplOffMin,
     outdata.Axis
     outdata.Connection
 ORDER BY 4 ASC;

Simples.

C.

Upvotes: 1

Related Questions