Reputation: 1159
What I need is to count the total number of cases on: parent cities
, districts
, and regions
So let me please tell you what the situation is and what I have done so far
I have two tables [cities]
& [covid19cities]
The [cities] table: reference table
Structure is:
------------------------------
| id | parent_id | city_name |
------------------------------
Cities levels is:
- Region //[its parent_id = 0]
-- District //[its parent_id = the region id]
---- Parent-city //[its parent_id = the district id]
------ Child-city //[its parent_id = the parent-city id]
The [covid19cities] table:
Structure is:
-----------------------------------------------------
| id | city_id | date | n_cases | r_cases | d_cases |
-----------------------------------------------------
So each day we fill [covid19cities]
with the different cases in different cities:
n_cases
= new covid-19 cases; r_cases
= recovered cases; d_cases
= deceased cases
Up to this point:
I am able to get the sum of cases (e.g. new cases) in each city using query like this:
SELECT sum(`n_cases`) AS city_n_cases, cities.name AS city_name,
cities.id AS city_id,
FROM covid19cities
INNER JOIN cities ON cities.id = covid19cities.city_id
WHERE covid19cities.city_id = '#'
SELECT sum(`n_cases`) AS total_n_cases, FROM covid19cities
Now, what I need is to count the total number of cases on:
So, how can I accomplish that? What I thought about is
BUT I believe this is not how it should be done. However, I do not know how to keep tracking parent-children cities in such case.
I appreciate your advice and help.
Thanks
p.s. sorry for my English :/
Upvotes: 1
Views: 165
Reputation: 142383
Since there are only 3 levels, not an arbitrary number of levels, I suggest that you are working too hard. Have a single table with 3 columns for the region, district, and city spelled out in the table. That's only a few thousand rows per day, so the lack of normalization will not lead to a huge disk expense.
If, on the other hand, this is a learning exercise, then get MySQL 8 or MariaDB 10.2 and learn about "Recursive CTEs".
Upvotes: 1
Reputation: 222582
Consider the following base query, that gives you the sum of each category of cases per city_id
. We can get that information by looking at covid19cities
only:
select
cvc.city_id,
sum(cvc.n_cases) sum_n_cases,
sum(cvc.r_cases) sum_r_cases,
sum(cvc.d_cases) sum_d_cases
from covid19cities cvc
group by cvc.city_id
Now let's bring the cities
table. This gives the same results, and you can display the city name too:
select
c.id city_id,
c.name city_name,
sum(cvc.n_cases) sum_n_cases,
sum(cvc.r_cases) sum_r_cases,
sum(cvc.d_cases) sum_d_cases
from covid19cities cvc
inner join cities c on c.id = cvc.city_id
group by c.id, c.name
Starting from there, we can follow the hierarchy upwards, level by level, by adding more joins on the cities
table and changing the columns in the select
and group by
clauses.
Let's get the number of cases per parent city: we join the cities
table a second time, with alias pc
(for parent city):
select
pc.id parent_city_id,
pc.name parent_city_name,
sum(cvc.n_cases) sum_n_cases,
sum(cvc.r_cases) sum_r_cases,
sum(cvc.d_cases) sum_d_cases
from covid19cities cvc
inner join cities c on c.id = cvc.city_id
inner join cities pc on pc.id = c.parent_id
group by pc.id, pc.name
The next level is the district:
select
d.id distict_id,
d.name district_name,
sum(cvc.n_cases) sum_n_cases,
sum(cvc.r_cases) sum_r_cases,
sum(cvc.d_cases) sum_d_cases
from covid19cities cvc
inner join cities c on c.id = cvc.city_id
inner join cities pc on pc.id = c.parent_id
inner join cities d on d.id = pc.parent_id
group by d.id, d.name
Finally, here is the query that gives the information at the upper level, that is the region:
select
r.id region_id,
r.name region_name,
sum(cvc.n_cases) sum_n_cases,
sum(cvc.r_cases) sum_r_cases,
sum(cvc.d_cases) sum_d_cases
from covid19cities cvc
inner join cities c on c.id = cvc.city_id
inner join cities pc on pc.id = c.parent_id
inner join cities d on d.id = pc.parent_id
inner join cities r on r.id = d.parent_id
group by r.id, r.name
As a more general thought: note that the complexity here comes from the fact that you are storing hierachical structure in the cities
table. It would be much simpler to have separated tables to store each entity, with foreign keys to represent the relationships, like:
regions: region_id, region_name
districts: district_id, district_name, region_id
parent_cities: parent_city_id, parent_city_name, district_id
cities: city_id, city_name, parent_city_id
With this design, your last query would look like:
select
r.region_id,
r.region_name,
sum(cvc.n_cases) sum_n_cases,
sum(cvc.r_cases) sum_r_cases,
sum(cvc.d_cases) sum_d_cases
from covid19cities cvc
inner join cities c on c.id = cvc.city_id
inner join parent_cities pc on pc.parent_city_id = c.parent_city_id
inner join districts d on d.district_id = pc.district_id
inner join regions r on r.region_id = d.region_id
group by r.region_id, r.region_name
That's the same number of joins, but things are stored in different tables so the query is much easier to write and read.
Upvotes: 2