Reputation: 403
I am working on a stored procedure that is combining history with projections. I have a bit column (PHP) that specifies if the projection has precedence over the history or not. I also have a column that specifies if the data came from the history or projection table. The output of my stored procedure looks like this:
CaseId Year Projection PHP Gas Oil
1 2004 0 1
1 2005 0 1
1 2005 1 1
1 2006 1 1
1 2007 1 1
1 2008 1 1
1 2009 1 1
2 2003 0 0
2 2004 0 0
2 2005 0 0
2 2005 1 0
2 2006 1 0
2 2007 1 0
2 2008 1 0
2 2006 1 0
In this example, I would need to eliminate the second row because for CaseId 1 projection has precedence so the overlapping history date should be removed. Also, the fourth row for CaseId 2 should be deleted because history has precedence.
CaseId Year Projection PHP Gas Oil
1 2004 0 1
1 2005 1 1
1 2006 1 1
1 2007 1 1
1 2008 1 1
1 2009 1 1
2 2003 0 0
2 2004 0 0
2 2005 0 0
2 2006 1 0
2 2007 1 0
2 2008 1 0
2 2006 1 0
I need to flag duplicate years within a CaseId and then compare the Projection and PHP columns and delete the row where they don't match.
Here is the query I am working with:
SELECT rcl.ReportRunCaseId AS CaseId,
year(rce.EcoDate) as Year,
1 as Projection,
cpq.ProjectionHasPrecedence as PHP,
rce.GrossOil as Oil,
rce.GrossGas as Gas
from phdreports.PhdRpt.ReportCaseList_28 rcl
inner join phdreports.PhdRpt.RptCaseEco_28 rce on
rce.ReportRunCaseId = rcl.ReportRunCaseId
inner join dbo.caseQualifier cq on
cq.CorpScenarioId = 1 and
cq.CaseCaseId = rcl.ReportRunCaseId and
cq.CorpQualifierTypeId = 1
inner join dbo.caseProjectionQualifier cpq on
cpq.CaseCaseId = rcl.ReportRunCaseId and
cpq.CorpQualifierId = cq.QualifierHasData
where rcl.ReportRunCaseId <=2
group by year(rce.EcoDate), rcl.ReportRunCaseId, cpq.ProjectionHasPrecedence, rce.GrossGas, rce.GrossOil
union all
select rmp.ReportRunCaseId AS CaseId,
year(rmp.EcoDate) as Year,
0 as Projection,
cpq.ProjectionHasPrecedence as PHP,
rmp.GrossOil as Oil,
rmp.GrossGas as Gas
from PhdReports.PhdRpt.RptMonthlyProduction_50 rmp
inner join dbo.caseQualifier cq on
cq.CorpScenarioId = 1 and
cq.CaseCaseId = rmp.ReportRunCaseId and
cq.CorpQualifierTypeId = 1
inner join dbo.caseProjectionQualifier cpq on
cpq.CaseCaseId = rmp.ReportRunCaseId and
cpq.CorpQualifierId = cq.QualifierHasData
where rmp.ReportRunCaseId <= 2
group by year(rmp.EcoDate), rmp.ReportRunCaseId, cpq.ProjectionHasPrecedence, rmp.GrossGas, rmp.GrossOil
How can I eliminate the duplicate years where Projection and PHP do not match?
Upvotes: 2
Views: 180
Reputation: 69759
The ROW_NUMBER() Function should help you here:
WITH Data AS
( SELECT rcl.ReportRunCaseId AS CaseId,
year(rce.EcoDate) as Year,
1 as Projection,
cpq.ProjectionHasPrecedence as PHP,
rce.GrossOil as Oil,
rce.GrossGas as Gas
from phdreports.PhdRpt.ReportCaseList_28 rcl
inner join phdreports.PhdRpt.RptCaseEco_28 rce on
rce.ReportRunCaseId = rcl.ReportRunCaseId
inner join dbo.caseQualifier cq on
cq.CorpScenarioId = 1 and
cq.CaseCaseId = rcl.ReportRunCaseId and
cq.CorpQualifierTypeId = 1
inner join dbo.caseProjectionQualifier cpq on
cpq.CaseCaseId = rcl.ReportRunCaseId and
cpq.CorpQualifierId = cq.QualifierHasData
where rcl.ReportRunCaseId <=2
group by year(rce.EcoDate), rcl.ReportRunCaseId, cpq.ProjectionHasPrecedence, rce.GrossGas, rce.GrossOil
union all
select rmp.ReportRunCaseId AS CaseId,
year(rmp.EcoDate) as Year,
0 as Projection,
cpq.ProjectionHasPrecedence as PHP,
rmp.GrossOil as Oil,
rmp.GrossGas as Gas
from PhdReports.PhdRpt.RptMonthlyProduction_50 rmp
inner join dbo.caseQualifier cq on
cq.CorpScenarioId = 1 and
cq.CaseCaseId = rmp.ReportRunCaseId and
cq.CorpQualifierTypeId = 1
inner join dbo.caseProjectionQualifier cpq on
cpq.CaseCaseId = rmp.ReportRunCaseId and
cpq.CorpQualifierId = cq.QualifierHasData
where rmp.ReportRunCaseId <= 2
group by year(rmp.EcoDate), rmp.ReportRunCaseId, cpq.ProjectionHasPrecedence, rmp.GrossGas, rmp.GrossOil
), Data2 AS
( SELECT *,
RowNum = ROW_NUMBER() OVER(PARTITION BY CaseId, Year
ORDER BY CASE WHEN PHP = Projection THEN 0 ELSE 1 END DESC, PHP DESC, Projection DESC)
FROM Data
)
SELECT CaseId, Year, Projection, PHP, Oil, Gas
FROM Data2
WHERE RowNum - 1;
Considering only the last bit, as the first is just your query inside a common table expression:
RowNum = ROW_NUMBER() OVER(PARTITION BY CaseId, Year
ORDER BY CASE WHEN PHP = Projection THEN 0 ELSE 1 END DESC, PHP DESC, Projection DESC)
Here, we give each caseId
, year
tuple a rank, ordering by whether or not PHP is equal to projection. Then the final part just limits the result to the first row for each tuple, so if a row exists where they are equal it will take that, if there are no rows where they are equal a row where they are not equal will be used.
You may need to add further criteria to the order by to ensure the results are deterministic, i.e. if you have two rows for the same caseId/Year where both PHP and projection are 1 in both, ensure that the same row is picked every time.
Upvotes: 2
Reputation: 1269663
I don't know what your query has to do with the question. So, let me assume that you have a query that does:
select CaseId, Year, Projection, PHP, Gas, Oil
from t
With this, you can do what you want using row_number()
:
select CaseId, Year, Projection, PHP, Gas, Oil
from (select CaseId, Year, Projection, PHP, Gas, Oil,
row_number() over (partition by CaseId, Year
order by Projection + PHP desc
) as seqnum
from t
) t
where seqnum = 1;
This will prioritize rows based on the number of flags that are set. In your example for CaseId
= 2, the two rows contain the same values. This will return one of those rows. If you want to choose between them, you need another column so specify precedence.
Upvotes: 1