Reputation: 1412
I am writing a little code to optimize an NFL fantasy draft. Because the database is fairly large, my first task is to cut it down to size by removing entries which I know for a fast will not form part of the optimal draft.
Each player has a salary which I have to pay, and a projected fantasy point total. For example, here is part of the list of possible defenders:
Id Position FPPG Salary
0 12542 D 3.8 4000
1 12534 D 7.5 4100
2 12548 D 4.6 4200
3 12532 D 8.2 4300
4 12547 D 8.4 4400
5 12545 D 9.2 4500
6 12525 D 9.0 4600
7 12553 D 12.3 4700
8 12544 D 10.8 4800
9 12531 D 17.0 5000
I have already written SQL queries that will eliminate players with injuries, and will eliminate all but the highest projected scorer at each salary price point:
d_db = sqldf('SELECT Id, Position, FPPG, Salary from d_db\
WHERE FPPG = (SELECT max(FPPG) from d_db as db where db.Salary = d_db.Salary) ORDER By Salary',locals())
My task now is to eliminate all the players for which there exists a cheaper player who is projected to get more points. For example, in the list above, row #2 is useless and can be eliminated since row #1 is both cheaper and has a higher FPPG.
Put another way, I want to delete all rows for which the FPPG column is not sorted in ascending order when the db is sorted by Salary. The final result would ideally be:
Id Position FPPG Salary
0 12542 D 3.8 4000
1 12534 D 7.5 4100
2 12532 D 8.2 4300
3 12547 D 8.4 4400
4 12545 D 9.2 4500
5 12553 D 12.3 4700
6 12531 D 17.0 5000
Any suggestions? I can't seem to figure out a WHERE condition for a DELETE statement that covers this nicely, so maybe this is more complicated than a single query?
Upvotes: 1
Views: 256
Reputation: 48197
Using LEFT JOIN
you try to match each player with someone better and cheaper.
This for debug
SELECT *
FROM Table1 t1
LEFT JOIN Table1 t2
ON t1."FPPG" < t2."FPPG"
AND t1."Salary" > t2."Salary"
OUTPUT
| row | Id | Position | FPPG | Salary | row | Id | Position | FPPG | Salary |
|-----|-------|----------|------|--------|--------|--------|----------|--------|--------|
| 0 | 12542 | D | 4 | 4000 | (null) | (null) | (null) | (null) | (null) |
| 1 | 12534 | D | 8 | 4100 | (null) | (null) | (null) | (null) | (null) |
| 2 | 12548 | D | 5 | 4200 | 1 | 12534 | D | 8 | 4100 |
| 3 | 12532 | D | 8 | 4300 | (null) | (null) | (null) | (null) | (null) |
| 4 | 12547 | D | 8 | 4400 | (null) | (null) | (null) | (null) | (null) |
| 5 | 12545 | D | 9 | 4500 | (null) | (null) | (null) | (null) | (null) |
| 6 | 12525 | D | 9 | 4600 | (null) | (null) | (null) | (null) | (null) |
| 7 | 12553 | D | 12 | 4700 | (null) | (null) | (null) | (null) | (null) |
| 8 | 12544 | D | 11 | 4800 | 7 | 12553 | D | 12 | 4700 |
| 9 | 12531 | D | 17 | 5000 | (null) | (null) | (null) | (null) | (null) |
Your final query, return rows without a match t2."Id" IS NULL
.
SELECT t1.*
FROM Table1 t1
LEFT JOIN Table1 t2
ON t1."FPPG" < t2."FPPG"
AND t1."Salary" > t2."Salary"
WHERE
t2."Id" IS NULL
NOTE:
AND t1."Id" <> t2."Id"
because previous comparasion already validate have different values.FPPG
and Salary
Upvotes: 1