Reputation: 309
I have a database where I would like to construct a result that match my indata.
Explanation of examples: "Indata" is a row with criteria for the filter in step #1, see below. "Results after filter #1" is the rows left after step #1 "Results after filter #2" is the rows left after step #2 "Print for indata 1" is the end result, presenting the contect of field OutData after the second filtering, along with the combined content of field "Warn"
Step #1. Filter the Outdata table to only keep records matching Indata. Star * is matching anything. Each row must match the indata row. See examples.
Step #1b. Indata Depl is not matched directly but checked to be in the range OutData Depl-DeplOffMin>Indata Depl
Step #2. Filter Outdata table again, to only keep unique rows for each Pos, select the row with lowest Mismatch when duplicates exist. See examples.
Step #3. Create all possible combinations of the Outdata Field, with unique Pos. Like LOOP: OutData for first Pos=10 + OutData for first Pos=20 + OutData for first Pos=30 + OutData for first Pos=30, see "Print" lines in the examples.
Step #4. Sort the output combinations ie the "Print" lines, to have lowest mismatch on the top. See examples.
Would you try to do this as one big SQL query, or build temporary tables etc mixing it with PHP doing more of the work?
(The real database is perhaps 1000rows and 30 fields in MySQL. The Indata is sent to the server via Ajax and the server code in PHP generates the Print data and sends it back.)
"Outdata" table, example:
Pos OutData Warn Mismatch Producer Depl DeplOffMax DeplOffMin Axis Connection
10 S 0 S * * * * *
20 24 0 S 24 * * * *
20 24 +-5 5 S 24 5 -5 * *
20 24 +-10 10 S 24 10 -10 * *
20 48 0 S 48 * * * *
30 AA 0 S * * * A *
30 AB 0 S * * * B *
30 AC 0 S * * * C *
30 AA B-AA 5 S * * * B
40 C1 0 S * * * * C1
40 C1 1 S * * * B *
40 C1 2 S 24 10 -10 * *
40 C2 0 S * * * * C2
Example 1:
Indat 1: S 24 A C2
Result for indata 1 after filter #1:
10 S 0 S * * * * *
20 24 0 S 24 * * * *
20 24 +-5 5 S 24 5 -5 * *
20 24 +-10 10 S 24 10 -10 * *
30 AA 0 S * * * A *
40 C1 2 S 24 10 -10 * *
40 C2 0 S * * * * C2
Result for indata 1 after filter #2:
10 S 0 S * * * * *
20 24 0 S 24 * * * *
30 AA 0 S * * * A *
40 C1 2 S 24 10 -10 * *
40 C2 0 S * * * * C2
Print for indata 1:
Mismatch 0: S 24 AA C2 Warning -
Mismatch 2: S 24 AA C1 Warning -
Example 2:
Indata 2: S 33 B C2
Result for indata 2 after filter #1:
10 S 0 S * * * * *
20 24 +-10 10 S 24 10 -10 * *
30 AB 0 S * * * B *
30 AA B-AA 5 S * * * B
40 C1 1 S * * * B *
40 C1 2 S 24 10 -10 * *
40 C2 0 S * * * * C2
Result for indata 2 after filter #2:
10 S 0 S * * * * *
20 24 +-10 10 S 24 10 -10 * *
30 AB 0 S * * * B *
30 AA B-AA 5 S * * * B
40 C1 1 S * * * B *
40 C2 0 S * * * * C2
Print for indata 2:
Mismatch 10: S 24 AB C2 Warning: +-10
Mismatch 11: S 24 AB C1 Warning: +-10
Mismatch 15: S 24 AA C2 Warning: +-10 B-AA
Mismatch 16: S 24 AA C1 Warning: +-10 B-AA
Example 3:
Indata 3: S 28 B C1
Result for indata 3 after filter #1:
10 S 0 S * * * * *
20 24 +-5 5 S 24 5 -5 * *
20 24 +-10 10 S 24 10 -10 * *
30 AB 0 S * * * B *
30 AA B-AA 5 S * * * B
40 C1 0 S * * * * C1
40 C1 1 S * * * B *
40 C1 2 S 24 10 -10 * *
Result for indata 3 after filter #2:
10 S 0 S * * * * *
20 24 +-5 5 S 24 5 -5 * *
30 AB 0 S * * * B *
30 AA B-AA 5 S * * * B
40 C1 0 S * * * * C1
Print for indata 3:
Mismatch 5: S 24 AB C1 Warning: +-5
Mismatch 10: S 24 AA C1 Warning: +-5 B-AA
Upvotes: 2
Views: 206
Reputation: 171491
I don't fully understand the business logic, but for such a tiny amount of data I would not bother going through the pain of trying to implement in SQL. It is likely to be faster just pulling all the data over and using PHP to apply your filters.
Upvotes: 1
Reputation: 36879
One thing I've learned. Anything that can be calculated should not be saved in the database
Upvotes: 0
Reputation: 391952
First, the question is too long.
Second, the rules (#1, #1b, #2, #3, and #4) are confusing and need clarification.
However, in spite of that, this is my advice.
If it's not obvious how to do it in SQL, don't.
For something like this, with "rules" for matching ("star is matching anything", "select the row with lowest Mismatch when duplicates exist", etc.) you have two choices.
Simplify the processing steps so that they can be implemented in SQL. This is generally a good thing. Your current statement is confused and hard to follow. Time spent simplifying would be time well spent.
Use an ordinary programming language. When you reach things which are hard to do in SQL, don't "force" them. Get the data from SQL and process it in PHP.
SQL -- generally -- is slow. It's best used when you have vast volumes of data that won't fit in memory and you have transactional updates going on. 1000 rows and 30 fields is a trivial amount of data that easily fits in memory. If you're given a batch of data to analyze (with no updates going on) SQL's not helping much.
In a very busy website -- using PHP -- you may have problems with lots of concurrent users. In which case, get it out of PHP so it's not running under Apache. Until such time as Apache runs out of memory, just do it in PHP.
Upvotes: 2
Reputation: 48367
I have no idea what step 3 means, but all the other steps can be done in SQL, also, they probably should be done in SQL where its a lot more efficient.
I assume from your tags that you're using a MySQL database (it would have been helpful to know).
My initial thots would be somethnig along the lines of:
SELECT outdata.Pos,
outdata.OutData,
outdata.Warn,
MIN(outdata.Mismatch),
outdata.Producer,
outdata.Depl,
outdata.DeplOffMax,
outdata.DeplOffMin,
outdata.Axis
outdata.Connection
FROM outdata, indata
WHERE
(//some nasty type juggling here
outdata.depl='*'
OR outdata.deploffmin='*'
OR indata.depl='*'
OR (
outdata.depl<>'*'
AND outdata.deploffmin<>'*'
AND indata.depl<>'*'
AND outdata.depl-outdata.deploffmin>indata.depl
)
)
AND ( outdata.outdata=indata.outdata
OR outdata.outdata='*'
OR indata.outdata='*' )
AND ( outdata.connection=indata.connection
OR outdata.connection = '*'
OR indata.connection='*' )
AND (outdata.axis=indata.axis
OR outdata.axis='*'
OR indata.axis='*' )
....repeat for all the fields you want to match
GROUP BY outdata.Pos,
outdata.OutData,
outdata.Warn,
outdata.Producer,
outdata.Depl,
outdata.DeplOffMax,
outdata.DeplOffMin,
outdata.Axis
outdata.Connection
ORDER BY 4 ASC;
Simples.
C.
Upvotes: 1