Reputation: 21945
I have stock quantity information in my database.
1 table, "stock", holds the productid (sku) along with the quantity and the filename from where it came.
The other table, "stockfile", contains all the processed filenames along with dates.
Now I need to get all the products with their latest stock quantity values.
This gives me ALL the products multiple times with all their stock quantity (resulting in 300.000 records)
SELECT stock.stockid, stock.sku, stock.quantity, stockfile.filename, stockfile.date
FROM stock
INNER JOIN stockfile ON stock.stockfileid = stockfile.stockfileid
ORDER BYstock
.sku
ASC
I already tried this:
SELECT * FROM stock
INNER JOIN stockfile ON stock.stockfileid = stockfile.stockfileid
GROUP BY sku
HAVING stockfile.date = MAX( stockfile.date )
ORDER BYstock
.sku
ASC
But it did not work
SHOW CREATE TABLE stock:
CREATE TABLE
stock
(
stockid
bigint(20) NOT NULL AUTO_INCREMENT,
sku
char(25) NOT NULL,
quantity
int(5) NOT NULL,
creationdate
datetime NOT NULL,
stockfileid
smallint(5) unsigned NOT NULL,
touchdate
datetime NOT NULL,
PRIMARY KEY (stockid
)
) ENGINE=MyISAM AUTO_INCREMENT=315169 DEFAULT CHARSET=latin1
SHOW CREATE TABLE stockfile:
CREATE TABLE
stockfile
(
stockfileid
smallint(5) unsigned NOT NULL AUTO_INCREMENT,
filename
varchar(25) NOT NULL,
creationdate
datetime DEFAULT NULL,
touchdate
datetime DEFAULT NULL,
date
datetime DEFAULT NULL,
begindate
datetime DEFAULT NULL,
enddate
datetime DEFAULT NULL,
PRIMARY KEY (stockfileid
)
) ENGINE=MyISAM AUTO_INCREMENT=265 DEFAULT CHARSET=latin1
Upvotes: 1
Views: 3082
Reputation: 562328
This is an example of the frequently-asked "greatest-n-per-group" question that we see every week on StackOverflow. Follow that tag to see other similar solutions.
SELECT s.*, f1.*
FROM stock s
INNER JOIN stockfile f1
ON (s.stockfileid = f1.stockfileid)
LEFT OUTER JOIN stockfile f2
ON (s.stockfileid = f2.stockfileid AND f1.date < f2.date)
WHERE f2.stockfileid IS NULL;
If there are multiple rows in stockfile
that have the max date, you'll get them both in the result set. To resolve this, you'd have to add some tie-breaker conditions into the join on f2
.
Thanks for adding the CREATE TABLE
info. That's very helpful when you're asking SQL questions.
I see from the AUTO_INCREMENT
table options that you have 315k rows in stock
and only 265 rows in stockfile
. Your stockfile
table is the parent in the relationship, and the stock
table is the child, with a column stockfileid
that references the primary key of stockfile
.
So your original question was misleading. You want the latest row from stock
, not the latest row from stockfile
.
SELECT f.*, s1.*
FROM stockfile f
INNER JOIN stock s1
ON (f.stockfileid = s1.stockfileid)
LEFT OUTER JOIN stock s2
ON (f.stockfileid = s2.stockfileid AND (s1.touchdate < s2.touchdate
OR s1.touchdate = s2.touchdate AND s1.stockid < s2.stockid))
WHERE s2.stockid IS NULL;
I'm assuming you want "latest" to be relative to touchdate
, so if you want to use creationdate
instead, you can do the edit.
I've added a term to the join so that it resolves ties. I know you said the dates are "practically unique" but as the saying goes, "one in a million is next Tuesday."
Okay, I think I understand what you're trying to do now. You want the most recent row per sku
, but the date
by which to compare them is in the referenced table stockfile
.
SELECT s1.*, f1.*
FROM stock s1
JOIN stockfile f1 ON (s1.stockfileid = f1.stockfileid)
LEFT OUTER JOIN (stock s2 JOIN stockfile f2 ON (s2.stockfileid = f2.stockfileid))
ON (s1.sku = s2.sku AND (f1.date < f2.date OR f1.date = f2.date AND f1.stockfileid < f2.stockfileid))
WHERE s2.sku IS NULL;
This does a self-join of stock
to itself, looking for a row with the same sku
and a more recent date
. When none is found, then s1
contains the most recent row for its sku
. And each instance of stock
has to join to its stockfile
to get the date
.
Re comment about optimization: It's hard for me to test because I don't have tables populated with data matching yours, but I'd guess you should have the following indexes:
CREATE INDEX stock_sku ON stock(sku);
CREATE INDEX stock_stockfileid ON stock(stockfileid);
CREATE INDEX stockfile_date ON stockfile(date);
I'd suggest using EXPLAIN
to analyze the query without the indexes, and then create one index at a time and re-analyze with EXPLAIN
to see which one gives the most direct benefit.
Upvotes: 6
Reputation: 1167
There are two common ways to accomplish this: a sub query or a self-join.
See this example of selecting the group-wise maximum at the MySQL site.
Edit, an example using a subquery:
SELECT stock.stockid, stock.sku, stock.quantity,
stockfile.filename, stockfile.date
FROM stock
INNER JOIN stockfile ON stock.stockfileid = stockfile.stockfileid
WHERE stockfile.date = (SELECT MAX(date) FROM stockfile);
Upvotes: 0
Reputation: 332571
Use:
SELECT DISTINCT s.stockid,
s.sku,
s.quantity,
sf.filename,
sf.date
FROM STOCK s
JOIN STOCKFILE sf ON sf.stockfileid = s.stockfileid
JOIN (SELECT t.stockfileid,
MAX(t.date) 'max_date'
FROM STOCKFILE t
GROUP BY t.stockfileid) x ON x.stockfileid = sf.stockfileid
AND x.max_date = sf.date
Upvotes: 2
Reputation: 3123
select *
from stock
where stockfileid in (
select top 1 stockfileid
from stockfile
order by date desc
)
Upvotes: 0