user7304434
user7304434

Reputation: 1

Split the count per product from one table

I have a table like this:

Count Product
100 apple, orange, mango
50 apple, grape, avocado
20 orange, apple, avocado

How can I select to get the count per product like this?

Count Product
170 apple
120 orange
100 mango
70 avocado
50 grape

Upvotes: 0

Views: 60

Answers (2)

michelek
michelek

Reputation: 2536

You really-really need to fix the table. Maybe that's what you are trying to achieve here.

I personally would:

  • take all the product names from table to text file
  • replace commas with newlines
  • trim extra whitespace
  • make all lowercase
  • remove duplicates (maybe even order alphabetically)
  • import into database in a new table (product)

.

CREATE TABLE product 
  ( 
     id      INT(11) UNSIGNED NOT NULL auto_increment, 
     product VARCHAR(50) NOT NULL DEFAULT '', 
     PRIMARY KEY (id), 
     KEY product (product) 
  ) 
engine=innodb 
DEFAULT charset=utf8; 

Now I have all the possible product names (apple apples avocado grapes mango orange) and can start rebuilding your source table:

SELECT p.product, 
       Sum(src.count) 
FROM   product p 
       LEFT JOIN src 
              ON src.product REGEXP p.product 
GROUP  BY p.product
;
--
product Sum(src.count)
apple   170
apples  150
avocado 70
grapes  50
mango   100
orange  120

...umm what's with apples?

One possible solution would be to replace all 'apples' with 'apple'

SELECT Concat('UPDATE src SET product = Replace(product, \'', p2.product, '\', \'', p1.product, '\');') AS q 
FROM   product p1 
       LEFT JOIN product p2 
              ON p1.product != p2.product 
                 AND p2.product REGEXP p1.product 
WHERE  p2.product IS NOT NULL
;
--
q
UPDATE src SET product = Replace(product, 'apples', 'apple');

Mysql's replace is case sensitive so we start by

UPDATE src 
SET    product = Lower(product); 

and now we can run the result from previous query:

UPDATE src SET product = Replace(product, 'apples', 'apple');
--
2 rows affected

Our modified source table:

SELECT * FROM src
;
-
Count   Product
100 apple, orange, mango
50  apple, grapes, avocado
20  orange, apple, avocado

Let's start over with

  • take all the product names from table to text file
  • replace commas with newlines
  • trim extra whitespace
  • (this time case and sorting really doesn't apply)
  • remove duplicates
  • import into database in a truncated product table

And next query would make me happy:

CREATE TABLE inventory AS 
  SELECT p.product, 
         Sum(src.count) AS count 
  FROM   product p 
         LEFT JOIN src 
                ON src.product REGEXP p.product 
  GROUP  BY p.product
; 
SELECT * FROM inventory
;
--
product count
apple   170
avocado 70
grapes  50
mango   100
orange  120

voila

Upvotes: 0

spencer7593
spencer7593

Reputation: 108510

Assuming that Product is a character column, and a "comma separated list" of values is being stored in it, the SQL to achieve the specified result is cumbersome.

SQL isn't designed for splitting out strings in comma separated lists into separate rows. This table design flies in the face of best practice relational database design principles.

I strongly recommend Bill Karwin's excellent book "SQL Antipatterns: Avoiding the Pitfalls of Database Programming". Chapter 2 "Jaywalking" is currently available in the "look inside" feature of Amazon...

https://www.amazon.com/SQL-Antipatterns-Programming-Pragmatic-Programmers/dp/1934356557


However, to answer the question you asked. It is possible to achieve the specified result. This will work for the example case, but not necessarily other more general cases:

SELECT REPLACE(
       REPLACE(
       REPLACE(
       REPLACE(
       REPLACE(
       REPLACE( c.Product
        ,'Apples','Apple'
       ),'apple','Apple'
       ),'orange','Orange'
       ),'mango','Mango'
       ),'grapes','Grapes'
       ),'avocado','Avocado'
       ) AS `Product`
     , SUM(c.Count) AS `Count`
  FROM ( SELECT TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(d.Product,',',n.i),',',-1)) AS `Product`
              , d.Count
           FROM ( SELECT 1 AS i UNION ALL SELECT 2 UNION ALL SELECT 3 ) n
          CROSS
           JOIN ( -- table of example data
                  SELECT 100 AS `Count`, 'Apples, orange, mango' AS `Product` 
                  UNION ALL SELECT 50, 'Apples, grapes, avocado'
                  UNION ALL SELECT 20, 'Orange, apple, avocado'

                ) d 
       ) c
 GROUP
    BY REPLACE(
       REPLACE(
       REPLACE(
       REPLACE(
       REPLACE(
       REPLACE( c.Product
        ,'Apples','Apple'
       ),'apple','Apple'
       ),'orange','Orange'
       ),'mango','Mango'
       ),'grapes','Grapes'
       ),'avocado','Avocado'
       )
 ORDER BY 2 DESC, 1 ASC

Returns:

Product  Count   
-------  --------
Apple    170     
Orange   120     
Mango    100     
Avocado  70      
Grapes   50      

This works for the example data, but won't for other possible data. (For example, if the comma separated list of products contains four items, or only two items.)


If you had a separate table of just individual Product to be returned... we could probably use a JOIN between that and the table shown in the question, and use a FIND_IN_SET type operation to do that matching. That would make the query a little bit simpler.

Upvotes: 2

Related Questions