Reputation: 432
I am working with a dataset that looks like this:
ZeroCurveID MarkRunID MarkAsOfDate
90-6589 6656 2/28/2012
90-6589 6656 2/28/2012
90-6589 6656 2/28/2012
14-8898 8898 8/12/2014
14-8898 8898 8/12/2014
14-8898 8898 8/12/2014
14-8898 8898 8/6/2014
14-8898 8898 8/6/2014
14-8898 8898 8/12/2014
14-8898 8898 8/12/2014
14-8898 8898 8/12/2014
14-8898 8898 8/12/2014
14-8898 8898 8/12/2014
For a given ZeroCurveID and MarkRunID, there should only be one MarkAsofDate. So for ZeroCurveID = 14-8898 and MarkRunID = 8898, MarkAsofDate must equal 8/12/2014 and any other values are not correct.
In some cases, I have two MarkAsofDate values for the same MarkRunID and ZeroCurveID. I want to create a query to identify when these instances occur, and to select the larger of the MarkAsofDate value as the correct value. How can I design a Query or VBA subroutine to do this?
Upvotes: 0
Views: 141
Reputation: 1975
The following SQL statement returns all combinations of ZeroCurveID and MarkRunID for which multiple MarkAsOfDates exist, along with the maximum MarkAsOfDate:
SELECT ZeroCurveID, MarkRunID, MAX(MarkAsOfDate)
FROM t
GROUP BY ZeroCurveID, MarkRunID
HAVING COUNT(*) > 1
And this selects all combinations that actually have different MarkAsOfDates, but it does not tell the maximum date:
SELECT ZeroCurveID, MarkRunID, COUNT(DISTINCT MarkAsOfDate) AS nDups
FROM t
GROUP BY ZeroCurveID, MarkRunID
HAVING nDups > 1
Performance-wise, it may be better to first use one of the above SQLs to get a list of entries that actually need to be corrected, and then update them individually (using parameters @zcid
and @mrid
):
UPDATE t
SET MarkAsOfDate = ( SELECT MAX(t2.MarkAsOfDate)
FROM t AS t2
WHERE t2.ZeroCurveID = t.ZeroCurveID AND
t2.MarkRunID = t.MarkRunID )
WHERE ZeroCurveID = @zcid AND MarkRunID = @mrid
Otherwise, you could do it all in one go with this big SQL statement:
UPDATE t
SET MarkAsOfDate = ( SELECT MAX(t2.MarkAsOfDate)
FROM t AS t2
WHERE t2.ZeroCurveID = t.ZeroCurveID AND
t2.MarkRunID = t.MarkRunID )
WHERE 1 < ( SELECT COUNT(DISTINCT t3.MarkAsOfDate)
FROM t AS t3
WHERE t3.ZeroCurveID = t.ZeroCurveID AND
t3.MarkRunID = t.MarkRunID )
Upvotes: 1