Ammar
Ammar

Reputation: 29

MySQL Multiplication of Hierarchical data in a single parent-children table

I am working on a project whose MySQL database contains two tables; people and percentages.

people table:

+----+------+--------+
| ID | Name | Parent |
+----+------+--------+
|  1 | A    |      0 |
|  2 |  B   |      1 |
|  3 |   C  |      2 |
|  4 |    D |      3 |
|  5 |  E   |      1 |
|  6 | F    |      0 |
+----+------+--------+

Percentages table:

+----+------------+
| ID | Percentage |
+----+------------+
|  1 | 70%        |
|  2 | 60%        |
|  3 | 10%        |
|  4 | 5%         |
|  5 | 40%        |
|  6 | 30%        |
+----+------------+

The query result I am seeking should be as the following:

+----+------------+----------------+--------+
| ID | Percentage |  Calculation   | Actual |
+----+------------+----------------+--------+
|  1 |         70 | 70%            | 70.00% |
|  2 |         60 | 70%*60%        | 42.00% |
|  3 |         10 | 70%*60%*10%    | 4.20%  |
|  4 |          5 | 70%*60%*10%*5% | 0.21%  |
|  5 |         40 | 70%*40%        | 28.00% |
|  6 |         30 | 30%            | 30.00% |
+----+------------+----------------+--------+

The Calculation column is only for elaboration. Is there any MySQL technique that i could use to achieve this hierarchical query? Even if the percentages table might contain multiple entries (percentages) for the same person ?

Upvotes: 2

Views: 787

Answers (5)

murison
murison

Reputation: 3983

Consider switching to Postgres9, which supports recursive queries:

WITH RECURSIVE recp AS (
SELECT p.id, p.name, p.parent
, array[p.id] AS anti_loop
, array[pr.percentage ] AS percentages
, pr.percentage AS final_pr
FROM people p
JOIN percentages pr ON pr.id = p.id
WHERE parent = 0
UNION ALL
SELECT ptree.id, ptree.name, ptree.parent
, recp.anti_loop || ptree.id
, recp.percentages || pr.percentage
, recp.final_pr * pr.percentage
FROM people ptree
JOIN percentages pr ON pr.id = ptree.id
JOIN recp ON recp.id = ptree.parent AND ptree.id != ALL(recp.anti_loop)
)
SELECT id, name
, array_to_string(anti_loop, ' <- ') AS path
, array_to_string(percentages::numeric(10,2)[], ' * ') AS percentages_str
, final_pr
FROM recp
ORDER BY anti_loop

Check out sqlFiddle demo

Upvotes: -1

gwc
gwc

Reputation: 1293

Step 1 - Create a MySQL function to return the family tree as a comma delimited TEXT column:

DELIMITER //

CREATE FUNCTION fnFamilyTree ( id INT ) RETURNS TEXT
BEGIN
   SET @tree = id;
   SET @qid = id;
   WHILE (@qid > 0) DO
      SELECT IFNULL(p.parent,-1)
        INTO @qid
        FROM people p
       WHERE p.id = @qid LIMIT 1;
      IF ( @qid > 0 ) THEN
        SET @tree = CONCAT(@tree,',',@qid);
      END IF;
   END WHILE;
   RETURN @tree;
END
//

DELIMITER ;

Then use the following SQL to retrieve your results:

SELECT ppl.id
      ,ppl.percentage
      ,GROUP_CONCAT(pct.percentage SEPARATOR '*') as Calculations
      ,EXP(SUM(LOG(pct.percentage)))              as Actual
  FROM (SELECT p1.id
              ,p2.percentage
              ,fnFamilyTree( p1.id ) as FamilyTree
          FROM people      p1
          JOIN percentages p2
            ON p2.id = p1.id
       ) ppl
  JOIN percentages pct
    ON FIND_IN_SET( pct.id, ppl.FamilyTree ) > 0
 GROUP BY ppl.id
         ,ppl.percentage
;

SQLFiddle at http://sqlfiddle.com/#!2/9da5b/12

Results:

+------+----------------+-----------------+----------------+
| ID   | Percentage     | Calculations    | Actual         |
+------+----------------+-----------------+----------------+
| 1    | 0.699999988079 | 0.7             | 0.699999988079 |
| 2    | 0.600000023842 | 0.7*0.6         | 0.420000009537 |
| 3    | 0.10000000149  | 0.7*0.6*0.1     | 0.04200000158  |
| 4    | 0.5            | 0.1*0.5*0.7*0.6 | 0.02100000079  |
| 5    | 0.40000000596  | 0.4*0.7         | 0.279999999404 |
| 6    | 0.300000011921 | 0.3             | 0.300000011921 |
+------+----------------+-----------------+----------------+

Upvotes: 1

Menelaos
Menelaos

Reputation: 26279

A solution is to utilize the function described at the following link for heirarchical queries:

Instead of making a PATH though, you will want to calculate the multiplication.

SOLUTION SCRIPT

Copy and paste this directly in a mysql console. I have not had much luck in workbench. Additionally, this can be further optimized by combining hierarchy_sys_connect_by_path_percentage and hierarchy_sys_connect_by_path_percentage_result into one stored procedure. Unfortunately this may be quite slow for giant data sets.

Setup Table and Data

drop table people;
drop table percentages;

create table people
(
  id  int,
  name varchar(10),
  parent int
);

create table percentages
(
   id int,
   percentage float
);

insert into people values(1,' A    ',0);
insert into people values(2,'  B   ',1);
insert into people values(3,'   C  ',2);
insert into people values(4,'    D ',3);
insert into people values(5,'  E   ',1);
insert into people values(6,' F    ',0);

insert into percentages values(1,0.70);
insert into percentages values(2,0.60);
insert into percentages values(3,0.10);
insert into percentages values(4,0.5);
insert into percentages values(5,0.40);
insert into percentages values(6,0.30);

DELIMITER $$

DROP  FUNCTION  IF  EXISTS  `hierarchy_sys_connect_by_path_percentage`$$

CREATE FUNCTION hierarchy_sys_connect_by_path_percentage(
                                       delimiter TEXT, 
                                       node INT) 
                                       RETURNS TEXT
    NOT DETERMINISTIC
    READS SQL DATA
BEGIN
     DECLARE _path TEXT;
     DECLARE _id INT;
     DECLARE _percentage FLOAT;
     DECLARE EXIT HANDLER FOR NOT FOUND RETURN _path;
     SET _id = COALESCE(node, @id);

        SELECT  Percentage
              INTO    _path
         FROM    percentages
         WHERE   id = _id;

     LOOP
              SELECT  parent
              INTO    _id
         FROM    people
         WHERE   id = _id
                    AND COALESCE(id <> @start_with, TRUE);

        SELECT  Percentage
              INTO    _percentage
         FROM    percentages
         WHERE   id = _id;

        SET _path = CONCAT( _percentage , delimiter, _path);
    END LOOP;
END $$


DROP  FUNCTION  IF  EXISTS  `hierarchy_sys_connect_by_path_percentage_result`$$

CREATE FUNCTION hierarchy_sys_connect_by_path_percentage_result(
                                       node INT) 
                                       RETURNS FLOAT
    NOT DETERMINISTIC
    READS SQL DATA
BEGIN
     DECLARE _path TEXT;
     DECLARE _id INT;
     DECLARE _percentage FLOAT;
     DECLARE EXIT HANDLER FOR NOT FOUND RETURN _path;
     SET _id = COALESCE(node, @id);

        SELECT  Percentage
              INTO    _path
         FROM    percentages
         WHERE   id = _id;

     LOOP
              SELECT  parent
              INTO    _id
         FROM    people
         WHERE   id = _id
                    AND COALESCE(id <> @start_with, TRUE);

        SELECT  Percentage
              INTO    _percentage
         FROM    percentages
         WHERE   id = _id;

        SET _path = _percentage *  _path;
    END LOOP;
END $$

DELIMITER ;

Query

SELECT  hi.id AS ID,
        p.Percentage,
        hierarchy_sys_connect_by_path_percentage('*', hi.id) AS Calculation,
        hierarchy_sys_connect_by_path_percentage_result(hi.id) AS Actual        
FROM    people hi
JOIN    percentages p
ON      hi.id = p.id;

Result

+------+------------+-----------------+--------------------+
| ID   | Percentage | Calculation     | Actual             |
+------+------------+-----------------+--------------------+
|    1 |        0.7 | 0.7             |  0.699999988079071 |
|    2 |        0.6 | 0.7*0.6         |  0.419999986886978 |
|    3 |        0.1 | 0.7*0.6*0.1     | 0.0419999994337559 |
|    4 |        0.5 | 0.7*0.6*0.1*0.5 | 0.0210000015795231 |
|    5 |        0.4 | 0.7*0.4         |  0.280000001192093 |
|    6 |        0.3 | 0.3             |  0.300000011920929 |
+------+------------+-----------------+--------------------+

Formatting the numbers is trivial so I leave it to you... More important are optimizations to make less calls on the database.

Upvotes: 2

Strawberry
Strawberry

Reputation: 33945

SELECT a.id
     , ROUND(pa.percentage/100
     * COALESCE(pb.percentage/100,1)
     * COALESCE(pc.percentage/100,1)
     * COALESCE(pd.percentage/100,1) 
     * 100,2) x
  FROM people a 
  LEFT 
  JOIN people b 
    ON b.id = a.parent 
  LEFT 
  JOIN people c 
    ON c.id = b.parent 
  LEFT 
  JOIN people d 
    ON d.id = c.parent
  LEFT
  JOIN percentages pa
    ON pa.id = a.id
  LEFT
  JOIN percentages pb
    ON pb.id = b.id
  LEFT
  JOIN percentages pc
    ON pc.id = c.id
  LEFT
  JOIN percentages pd
    ON pd.id = d.id
;

Upvotes: 0

Markus Malkusch
Markus Malkusch

Reputation: 7878

MySQL is a Relational DBS. Your requirements needs a Graph database.

However if you stay at MySQL there exists a few methods to add a few graph features. One of them is the concept of Nested Sets. But I don't suggest that, as it adds a lot of complexity.

Upvotes: 0

Related Questions