Reputation: 396
Lets say I have a table that keeps a history of prices, for example:
_______________________________________________________
|Make | Model | Year | price | date_of_record |
|--------------------------------------------------------
|Mazda | 6 | 2008 | 10,000 | 2013-05-13 00:00:00 |
|Mazda | 6 | 2008 | 11,000 | 2012-05-13 00:00:00 |
|Mazda | 6 | 2008 | 12,000 | 2011-05-13 00:00:00 |
|Mazda | 6 | 2008 | 11,750 | 2010-05-13 00:00:00 |
|Honda | Civic | 2000 | 3,000 | 2011-05-13 00:00:00 |
|Honda | Civic | 2000 | 4,000 | 2010-05-13 00:00:00 |
|Honda | Civic | 2000 | 5,000 | 2009-05-13 00:00:00 |
|Honda | Civic | 2000 | 5,050 | 2008-05-13 00:00:00 |
|Acura | TL | 2009 | 21,000 | 2010-05-13 00:00:00 |
|--------------------------------------------------------
I want to delete entries that are more than 2 years old unless the latest entry is more than 2 years old. It is hard to put into words, basically i want to end up with this:
_______________________________________________________
|Make | Model | Year | price | date_of_record |
|--------------------------------------------------------
|Mazda | 6 | 2008 | 10,000 | 2013-05-13 00:00:00 |
|Mazda | 6 | 2008 | 11,000 | 2012-05-13 00:00:00 |
|Honda | Civic | 2000 | 3,000 | 2011-05-13 00:00:00 |
|Acura | TL | 2009 | 21,000 | 2010-05-13 00:00:00 |
|--------------------------------------------------------
so the query would delete the entries that are more than two years old for the mazda, and the entries that are more than 2 years old for the Honda and Acura, with the exception of the latest.
Entries are inserted to the table only when the price changes, so if the price hasn't changed in 5 years the entry will be 5 years old, but that does not mean that the data is not accurate.
Upvotes: 1
Views: 134
Reputation: 48169
First, lets get a list of all car, make, model, year and the respective latest "date of record". So, you could have a 2007 vehicle AND a 2008 of the same make/model each with a latest date entry of 2011-01-01. You would want to keep the vehicle entry for BOTH 2007 and 2008 models however, your example only showed the old dates would only want to keep the retained data for the Honda Civic and Acural TL respectively.
Run this, just to confirm those results
SELECT
c.`make`,
c.`model`,
c.`year`,
MAX( c.date_of_record ) as LatestRecordDate
from
YourCarPriceTable c
group by
c.`make`,
c.`model`,
c.`year`
having
MAX( c.date_of_record ) < DATE_ADD(CURDATE(),INTERVAL -2 YEAR)
Now, use this as a basis to delete from the current table via a Left Join on the same make, model and year. Two conditions to allow the delete.
1) There is NO Match found in the "KeepThese" subquery and the latest date of record is more than two years old...
2) There IS a match found in the "KeepThese" subquery and the date is older than the most recent that qualified the (too old) result set of KeepThese.
Per your data example, the "KeepThese" result set should include
Make Model Year LatestRecordDate
Honda Civic 2000 2011-05-13 00:00:00
Acura TL 2009 2010-05-13 00:00:00
So, your Mazda 6 would delete the two entries for the 2010 and 2011 date entries since that make/model/year is NOT included in the list (via test IS NULL).
Your Honda Civic will delete all entries that are older than it's latest "KeepThese" value of 2011-05-13, and thus delete the 2008, 2009 and 2010 entries.
Your Acura TL does not have anything older than it's single entry, so it would be left alone.
DELETE YPT.* FROM
YourCarPriceTable YPT
LEFT JOIN ( SELECT
c.`make`,
c.`model`,
c.`year`,
MAX( c.date_of_record ) as LatestRecordDate
from
YourCarPriceTable c
group by
c.`make`,
c.`model`,
c.`year`
having
MAX( c.date_of_record ) < DATE_ADD(CURDATE(),INTERVAL -2 YEAR) ) KeepThese
ON YPT.`make` = KeepThese.`make`
AND YPT.`model` = KeepThese.`model`
AND YPT.`year` = KeepThese.`year`
where
( KeepThese.`make` IS NULL
AND YPT.date_of_record < DATE_ADD(CURDATE(),INTERVAL -2 YEAR))
OR
( YPT.`make` = KeepThese.`make`
AND YPT.`model` = KeepThese.`model`
AND YPT.`year` = KeepThese.`year`
AND YPT.date_of_record < KeepThese.LatestRecordDate )
The subquery is processed once all up front, THEN left-joined to the delete from version of your car/price table.
Upvotes: 2