SULTAN
SULTAN

Reputation: 1159

sum of cases in grand-parent, parents, children,, and etc

What I need is to count the total number of cases on: parent cities, districts, and regions

So let me please tell you what the situation is and what I have done so far


I have two tables [cities] & [covid19cities]


The [cities] table: reference table

Structure is:

------------------------------
| id | parent_id | city_name |
------------------------------

Cities levels is:

 - Region          //[its parent_id = 0]
 -- District       //[its parent_id = the region id]
 ---- Parent-city  //[its parent_id = the district id]
 ------ Child-city //[its parent_id = the parent-city id]

The [covid19cities] table:

Structure is:

-----------------------------------------------------
| id | city_id | date | n_cases | r_cases | d_cases |
-----------------------------------------------------

So each day we fill [covid19cities] with the different cases in different cities:

n_cases = new covid-19 cases; r_cases = recovered cases; d_cases = deceased cases


Up to this point:

  1. I am able to get the sum of cases (e.g. new cases) in each city using query like this:

    SELECT  sum(`n_cases`) AS city_n_cases, cities.name AS city_name,
            cities.id AS city_id,
        FROM  covid19cities
        INNER JOIN  cities  ON cities.id = covid19cities.city_id
        WHERE  covid19cities.city_id = '#'
    
    1. I am able to get sum of all cases (e.g. new cases) in all cities:
SELECT 
sum(`n_cases`) AS total_n_cases, 
FROM 
covid19cities 

Now, what I need is to count the total number of cases on:

So, how can I accomplish that? What I thought about is

  1. to find all regions
  2. within the fetch assoc while-loop I search for the districts of the this region
  3. within the fetch assoc while-loop of the districts I search for the parent-cities
  4. within the fetch assoc while-loop of parent cities I search for the children cities
  5. count the sum and then added back-wards to the parent cities and from there to districts and then to regions!

BUT I believe this is not how it should be done. However, I do not know how to keep tracking parent-children cities in such case.

I appreciate your advice and help.

Thanks

p.s. sorry for my English :/

Upvotes: 1

Views: 165

Answers (2)

Rick James
Rick James

Reputation: 142383

Since there are only 3 levels, not an arbitrary number of levels, I suggest that you are working too hard. Have a single table with 3 columns for the region, district, and city spelled out in the table. That's only a few thousand rows per day, so the lack of normalization will not lead to a huge disk expense.

If, on the other hand, this is a learning exercise, then get MySQL 8 or MariaDB 10.2 and learn about "Recursive CTEs".

Upvotes: 1

GMB
GMB

Reputation: 222582

Consider the following base query, that gives you the sum of each category of cases per city_id. We can get that information by looking at covid19cities only:

select 
    cvc.city_id,
    sum(cvc.n_cases) sum_n_cases,
    sum(cvc.r_cases) sum_r_cases,
    sum(cvc.d_cases) sum_d_cases
from covid19cities cvc
group by cvc.city_id

Now let's bring the cities table. This gives the same results, and you can display the city name too:

select 
    c.id   city_id,
    c.name city_name, 
    sum(cvc.n_cases) sum_n_cases,
    sum(cvc.r_cases) sum_r_cases,
    sum(cvc.d_cases) sum_d_cases
from covid19cities cvc
inner join cities c on c.id = cvc.city_id
group by c.id, c.name

Starting from there, we can follow the hierarchy upwards, level by level, by adding more joins on the cities table and changing the columns in the select and group by clauses.

Let's get the number of cases per parent city: we join the cities table a second time, with alias pc (for parent city):

select 
    pc.id   parent_city_id,
    pc.name parent_city_name, 
    sum(cvc.n_cases) sum_n_cases,
    sum(cvc.r_cases) sum_r_cases,
    sum(cvc.d_cases) sum_d_cases
from covid19cities cvc
inner join cities c  on c.id  = cvc.city_id
inner join cities pc on pc.id = c.parent_id
group by pc.id, pc.name

The next level is the district:

select 
    d.id   distict_id,
    d.name district_name, 
    sum(cvc.n_cases) sum_n_cases,
    sum(cvc.r_cases) sum_r_cases,
    sum(cvc.d_cases) sum_d_cases
from covid19cities cvc
inner join cities c  on c.id  = cvc.city_id
inner join cities pc on pc.id =  c.parent_id
inner join cities d  on d.id  = pc.parent_id
group by d.id, d.name

Finally, here is the query that gives the information at the upper level, that is the region:

select 
    r.id   region_id,
    r.name region_name, 
    sum(cvc.n_cases) sum_n_cases,
    sum(cvc.r_cases) sum_r_cases,
    sum(cvc.d_cases) sum_d_cases
from covid19cities cvc
inner join cities c  on c.id  = cvc.city_id
inner join cities pc on pc.id =  c.parent_id
inner join cities d  on d.id  = pc.parent_id
inner join cities r  on r.id  =  d.parent_id
group by r.id, r.name

As a more general thought: note that the complexity here comes from the fact that you are storing hierachical structure in the cities table. It would be much simpler to have separated tables to store each entity, with foreign keys to represent the relationships, like:

regions:        region_id, region_name
districts:      district_id, district_name, region_id
parent_cities:  parent_city_id, parent_city_name, district_id
cities:         city_id, city_name, parent_city_id

With this design, your last query would look like:

select 
    r.region_id,
    r.region_name, 
    sum(cvc.n_cases) sum_n_cases,
    sum(cvc.r_cases) sum_r_cases,
    sum(cvc.d_cases) sum_d_cases
from covid19cities cvc
inner join cities c         on c.id              = cvc.city_id
inner join parent_cities pc on pc.parent_city_id =  c.parent_city_id
inner join districts     d  on d.district_id     = pc.district_id
inner join regions       r  on r.region_id       =  d.region_id
group by r.region_id, r.region_name

That's the same number of joins, but things are stored in different tables so the query is much easier to write and read.

Upvotes: 2

Related Questions