PHPLover
PHPLover

Reputation: 12957

How to print the store name having maximum no. of sales in a particular month and year?

I've two tables in my database(MySQL database) as follows :

Table No.1 : stores
store_id(pk)
store_name

Table No.2 : sales
sale_id(pk)
store_id(fk)
sale_date
sale_amt

In Sales table, same store can have multiple entries of sale for a particular month.

Now I want to print the name of a store which is having most no. of entries for the month and year entered by user.

How should I achieve this in an efficient and reliable way?

Upvotes: 2

Views: 1354

Answers (7)

Jason Heo
Jason Heo

Reputation: 10246

To get it efficiently, get top 1 store_id before INNER JOIN. This prevents from joining one-to-many relations.

SELECT store_name
FROM stores
    INNER JOIN
    (
        SELECT store_id, COUNT(*) AS cnt
        FROM sales
        WHERE sale_date BETWEEN '2015-09-01' AND '2015-09-30'
        GROUP BY store_id
        ORDER BY cnt DESC
        LIMIT 1
    ) max_sales ON stores.store_id = max_sales.store_id;

Assuming, at least following index exists:

CREATE INDEX idx1 ON sales (sale_date, store_id);

Upvotes: 0

Rick James
Rick James

Reputation: 142356

Doesn't the question require SUM(sale_amt)?

SELECT  
      ( SELECT  store_name
            FROM  Stores
            WHERE  store_id = s.store_id 
      ) AS StoreName
    FROM  Sales s
    WHERE  sale_date >= '2015-09-01'
      AND  sale_date  < '2015-09-01' + INTERVAL 1 MONTH
    GROUP BY  store_id
    ORDER BY  SUM(sale_amt) DESC
    LIMIT  1;

(does not deliver ties)

Upvotes: 0

an3sarmiento
an3sarmiento

Reputation: 343

Try to Group by the stores and count how many transactions did each store made (Table A). Then get the maximum number of transactions from table A, you will obtain a number (e.g. value B). Next you can copy the SQL that you used to obtain value B and put it in a Having clause from table A, something like:

This is table A

select count(store_id) as sid
from sales
group by store_id

This is value B

select max(sid)
from( select count(store_id) as sid
    from sales
    group by store_id ) t1

This is the id of the store with the maximum number of transactions

select store_name, count(store_name)
from sales
group by store_name
having count(store_name) =  (select max(sid)
                                from( select count(store_id) as sid
                                        from sales
                                        group by store_id ) t1)

Yo can add the year and month constraint in the where clause and join it with the stores table to get the store_name

select store_name, count(store_name)
from sales, stores
where sales.store_id = stores.store_id
and year(sale_date) = '2015'
and month(sale_date) = '09'
group by store_id, store_name
having count(store_name) =  (select max(sid)
                                from( select count(store_id) as sid
                                        from sales
                                        where year(sale_date) = '2015'
                                        and month(sale_date) = '09'
                                        group by store_id ) t1)

Please notice that, other solutions that rearrange the stores by the number of sales and limit the solution to the first register may be wrong as maybe there are more than one store with the same number of transactions. This Query will return all the stores that have that number of sales.

Pd: you can try this solution here.

Regards,

Upvotes: 3

userlond
userlond

Reputation: 3828

Online demo

See my fiddle.

Result query

SELECT s.*, COUNT(sl.store_id) AS number_of_sales        
FROM stores s
  JOIN sales sl ON 
    s.store_id = sl.store_id 
    AND sl.sale_date BETWEEN '2015-09-01' AND '2015-09-30'
GROUP BY s.store_id
ORDER BY number_of_sales DESC
LIMIT 1

Total listing with some notes

CREATE TABLE stores (
  store_id INT(11) NOT NULL AUTO_INCREMENT PRIMARY KEY,
  store_name VARCHAR(255) NOT NULL
);

CREATE TABLE sales (
  sale_id INT(11) NOT NULL AUTO_INCREMENT PRIMARY KEY,
  store_id INT(11) NOT NULL REFERENCES stores(store_id),
  sale_date DATE NOT NULL,
  sale_amt INT(11) NOT NULL
);

/* add index on sale_date for faster search by dates */
/* if you use myisam, add index on store_id too */
ALTER TABLE sales ADD INDEX (sale_date);

/* test data */
INSERT INTO stores VALUES(1, 'shop1');
INSERT INTO stores VALUES(2, 'shop2');
INSERT INTO stores VALUES(3, 'shop3');

INSERT INTO sales (sale_id, store_id, sale_date, sale_amt) VALUES (40, 2, '2015-09-22', 31);
INSERT INTO sales (sale_id, store_id, sale_date, sale_amt) VALUES (41, 2, '2015-09-30', 74);
INSERT INTO sales (sale_id, store_id, sale_date, sale_amt) VALUES (42, 1, '2015-01-16', 212);
INSERT INTO sales (sale_id, store_id, sale_date, sale_amt) VALUES (43, 3, '2015-09-15', 113);
INSERT INTO sales (sale_id, store_id, sale_date, sale_amt) VALUES (44, 1, '2015-09-11', 61);
INSERT INTO sales (sale_id, store_id, sale_date, sale_amt) VALUES (45, 1, '2015-09-15', 49);
INSERT INTO sales (sale_id, store_id, sale_date, sale_amt) VALUES (46, 1, '2015-05-14', 28);
INSERT INTO sales (sale_id, store_id, sale_date, sale_amt) VALUES (47, 3, '2014-12-23', 102);
INSERT INTO sales (sale_id, store_id, sale_date, sale_amt) VALUES (48, 1, '2015-09-19', 101);
INSERT INTO sales (sale_id, store_id, sale_date, sale_amt) VALUES (49, 2, '2015-09-24', 131);
INSERT INTO sales (sale_id, store_id, sale_date, sale_amt) VALUES (50, 1, '2015-09-13', 155);

/* 
select shop with max number of entries in 2015/April 

Shop1 has 4 sales in 2015/April
Shop2 has 3 sales in 2015/April
Shop3 has 1 sale in 2015/April
*/
SELECT s.*, COUNT(sl.store_id) AS number_of_sales        
FROM stores s
  JOIN sales sl ON 
    s.store_id = sl.store_id 
    AND sl.sale_date BETWEEN '2015-09-01' AND '2015-09-30'
GROUP BY s.store_id
ORDER BY number_of_sales DESC
LIMIT 1

Explanation

At first, query selects sales for each shop, which really have sales via JOIN.

At second, we count number of sales foreach shop via GROUP BY.

At third, we sort (via ORDER BY) grouped results by number of entries desc AND select one (via LIMIT) with highest value.

P.S. You may compare aforementioned query with queries from other answers and tell us result :)

Benchmarks

I'v generated 1000 stores and 10 million sales. Tested on localhost Windows NT, 5.5.25a-log - MySQL Community Server (GPL), default my-large.ini config. Table type is INNODB.

Avg exe time from 3 launches with SQL_NO_CACHE directive.

an3sarmiento's solution raises syntax error.

select SQL_NO_CACHE store_id, count(*)
from sales
group by store_id
having count(*) = (select max(count(*))
                    from sales
                    group by store_id)

#1111 - Invalid use of group function

zedfoxus's solution executes 8.8 seconds

select SQL_NO_CACHE store_name
from stores st
inner join sales sa on st.store_id = sa.store_id
where year(sale_date) = '2015'
  and month(sale_date) = '09'
group by store_name
order by count(*) desc

limit 1;

Special note for this solution. If different shops have equal names, their results grouped as results of one shop (because of group by store_name statement).

Innos Heo's solution raises syntax error

SELECT SQL_NO_CACHE store_name
FROM stores
    INNER JOIN
    (
        SELECT store_id, COUNT(*) AS cnt
        FROM sales
        GROUP BY store_id
        WHERE sale_date BETWEEN '2015-09-01' AND '2015-09-30'
        ORDER BY cnt
        LIMIT 1
    ) max_sales ON stores.store_id = max_sales.store_id;

#1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'WHERE sale_date BETWEEN '2015-09-01' AND '2015-09-30' ORDER BY cnt ' at line 8

My aforementioned solution executed 143 seconds.

But it distiguishes shops with the same name, but different ids.

Epilog

If you work with big data and want max performance then precalc. Structure you db levels. First will store all data, second will answer to users' queries.

For example, you may create table, which stores amount of sales for each shop by year and month. To keep data up-to-date you may use triggers.

If you data isn't really big, don't play in optimisation. Everything is for usecase.

Upvotes: 2

Tarun Jain
Tarun Jain

Reputation: 53

You can use this query

SELECT store_name
FROM stores
WHERE store_id = (
    SELECT store_id
    FROM sales
    WHERE YEAR(sale_date) = '<year value>'
    AND MONTH(sale_date) = '<month value>'
    GROUP BY store_id
    ORDER BY COUNT(*) DESC
    LIMIT 1
)

This will give the required result and it will be faster than joins if data is more in database tables.

Upvotes: 0

Amit Ramoliya
Amit Ramoliya

Reputation: 392

Please try this query. I have checked it by creating your demo database.

 SELECT COUNT(*) AS total, 
           st.store_name, sl.store_id, sl.sale_date, sl.sale_amt, 
           YEAR(sl.sale_date) as year, 
           MONTH(sl.sale_date) as month 
    FROM stores AS st 
    JOIN sales AS sl ON st.store_id = sl.store_id 
    WHERE YEAR(sl.sale_date)='2015' and MONTH(sl.sale_date)='12' 
    GROUP BY st.store_id
    ORDER BY total
    DESC LIMIT 1

Upvotes: 1

zedfoxus
zedfoxus

Reputation: 37099

You could give this a shot:

select store_name
from stores st
inner join sales sa on st.store_id = sa.store_id
where year(sale_date) = <year entered by user>
  and month(sale_date) = <month entered by user>
group by store_name
order by count(*) desc
limit 1;

You might benefit from creating a couple covering indexes like so:

create index idx_stores_id_name on stores(store_id, store_name);
create index idx_sales_storeid_saledate on sales (store_id, sale_date);

Upvotes: 0

Related Questions