Reputation: 1
I am trying to run a rolling tally as time goes on, so the second (duplicate) row is adding the same value again and is mucking up the count. How do I stop this from happening? I only want 1 row per date to then add the next date. I have tried distinct but the second row count still appears so I think I need a different command.
See on the output code at the bottom, the last column is total, the second last is the new amount to add to the total
Select
dea.continent, dea.location, dea.date, dea.population,
vac.new_vaccinations,
sum(cast(vac.new_vaccinations as bigint)) over (partition by dea.location order by dea.location, dea.date)
from
PortfolioProject..CovidDeaths dea
join
PortfolioProject..CovidVaccinations vac on dea.location = vac.location
and dea.date = vac.date
where
dea.continent is not null
order by
2,3
Europe Austria 2020-12-28 00:00:00.000 8922082 1345 2690
Europe Austria 2020-12-28 00:00:00.000 8922082 1345 2690
Europe Austria 2020-12-29 00:00:00.000 8922082 1694 6078
Europe Austria 2020-12-29 00:00:00.000 8922082 1694 6078
Europe Austria 2020-12-30 00:00:00.000 8922082 1429 8936
Europe Austria 2020-12-30 00:00:00.000 8922082 1429 8936
Europe Austria 2020-12-31 00:00:00.000 8922082 25 8986
Europe Austria 2020-12-31 00:00:00.000 8922082 25 8986
Europe Austria 2021-01-01 00:00:00.000 8922082 19 9024
Europe Austria 2021-01-01 00:00:00.000 8922082 19 9024
Upvotes: 0
Views: 84
Reputation: 168
Partition by clause don't reduce returned rows. Just add aggregate data to rows. Because of aggregate on CovidVaccinations table, you assumes there are many records for vac.location + vac.date.
So you should first prepare data from CovidVaccinations with;
SELECT vac.location, vac.date, sum( vac.new_vaccinations )
FROM portfolioproject..CovidVaccinations vac
GROUP BY vac.location, vac.date
then join with CovidDeaths without aggregate again, ike that;
WITH vac
( SELECT vac.location, vac.date, sum( vac.new_vaccinations ) as total
FROM portfolioproject..CovidVaccinations vac
GROUP BY vac.location, vac.date
)
SELECT vac.total, ... FROM CovidDeaths dea, vac
WHERE dea.location = vac.location
AND dea.date = vac.date
If CovidDeaths table also has many record per dea.continent, dea.location, dea.date, dea.population, than group its data as well, then join with other result
Upvotes: 1