Reputation: 978
I am trying to fetch the first and the last record of each group in a grouped query.
More precisely, I am running a query like this:
SELECT MIN(low_price), MAX(high_price), open, close
FROM symbols
WHERE date BETWEEN(.. ..)
GROUP BY YEARWEEK(date)
but I'd like to get the first and the last record of each group. It could be done by running many separate queries, but the table is quite large.
Is there a way to do this in MySQL, ideally with low processing time?
Upvotes: 41
Views: 103161
Reputation: 71
I usually achieve this by joining back onto the table, as this gives me access to all the data for the two rows.
This example uses ORDER BY and LIMIT, but you can also use MIN and MAX on the primary key returned in the subqueries of the joins.
This assumes that the table has a primary key column called ID.
SELECT MIN(symbols.low_price), MAX(symbols.high_price), symbols.open, symbols.close,
symbols.id,
symbols.date,
symbols_prev.id symbols_prev_id,
symbols_prev.date symbols_prev_date,
symbols_prev.low_price symbols_prev_low_price,
symbols_prev.high_price symbols_prev_high_price,
symbols_next.id symbols_next_id,
symbols_next.date symbols_next_date,
symbols_next.low_price symbols_next_low_price,
symbols_next.high_price symbols_next_high_price
FROM symbols
JOIN symbols symbols_prev ON
symbols_prev.ID =
(
SELECT symbols_prev_inner.ID
FROM symbols symbols_prev_inner
WHERE YEARWEEK(symbols_prev_inner.date)=YEARWEEK(symbols.date)
AND symbols_prev_inner.ID<symbols.ID
ORDER BY
symbols_prev_inner.ID DESC
LIMIT 1
)
JOIN symbols symbols_next ON
symbols_next.ID =
(
SELECT symbols_next_inner.ID
FROM symbols symbols_next_inner
WHERE YEARWEEK(symbols_next_inner.date)=YEARWEEK(symbols.date)
AND symbols_next_inner.ID>symbols.ID
ORDER BY
symbols_next_inner.ID
LIMIT 1
)
WHERE symbols.date BETWEEN(.. ..)
GROUP BY YEARWEEK(symbols.date)
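The MIN/MAX-on-primary-key variant mentioned above could look roughly like the sketch below. It is only an illustration, not the answer's own query: it assumes that id grows in the same order as date (so the smallest id in a week is that week's first row), and the date range is a made-up placeholder.

```sql
-- Sketch only: assumes symbols.id increases with symbols.date.
-- The date range is a hypothetical placeholder.
SELECT
    g.year_week,
    g.low_price,
    g.high_price,
    first_row.open  AS open,   -- open of the first row in the week
    last_row.close  AS close   -- close of the last row in the week
FROM (
    -- One row per week, carrying the ids of its first and last rows.
    SELECT
        YEARWEEK(date)  AS year_week,
        MIN(low_price)  AS low_price,
        MAX(high_price) AS high_price,
        MIN(id)         AS first_id,
        MAX(id)         AS last_id
    FROM symbols
    WHERE date BETWEEN '2010-01-01' AND '2010-12-31'
    GROUP BY YEARWEEK(date)
) AS g
JOIN symbols AS first_row ON first_row.id = g.first_id
JOIN symbols AS last_row  ON last_row.id  = g.last_id;
```

Because the joins are on the primary key, each one is a single index lookup per week. If id does not follow date order, this will pick the wrong rows.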
Upvotes: 0
Reputation: 2833
You want to use GROUP_CONCAT and SUBSTRING_INDEX:
SUBSTRING_INDEX( GROUP_CONCAT(CAST(open AS CHAR) ORDER BY datetime), ',', 1 ) AS open,
SUBSTRING_INDEX( GROUP_CONCAT(CAST(close AS CHAR) ORDER BY datetime DESC), ',', 1 ) AS close
This avoids expensive subqueries, and I find it generally more efficient for this particular problem.
Check out the manual pages for both functions to understand their arguments, or visit this article which includes an example of how to do timeframe conversion in MySQL for more explanations.
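Put together with the query from the question, the two expressions might be used as in the sketch below. The column name date is taken from the question (the fragments above use datetime), and the date range is a hypothetical placeholder.

```sql
-- Sketch only: combines the GROUP_CONCAT / SUBSTRING_INDEX trick with the
-- question's weekly grouping. The date range is a hypothetical placeholder.
SELECT
    YEARWEEK(date) AS year_week,
    -- First value of open when the group is ordered by date ascending.
    SUBSTRING_INDEX(GROUP_CONCAT(CAST(open AS CHAR) ORDER BY date), ',', 1) AS open,
    -- First value of close when the group is ordered by date descending,
    -- i.e. the close of the last row in the week.
    SUBSTRING_INDEX(GROUP_CONCAT(CAST(close AS CHAR) ORDER BY date DESC), ',', 1) AS close,
    MIN(low_price)  AS low_price,
    MAX(high_price) AS high_price
FROM symbols
WHERE date BETWEEN '2010-01-01' AND '2010-12-31'
GROUP BY YEARWEEK(date);
```

Two things to keep in mind: open and close come back as strings, so cast them back if you need numbers, and GROUP_CONCAT truncates its result at group_concat_max_len (1024 bytes by default). Truncation at the end does not affect the first element, but very large groups still build a long intermediate string.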
Upvotes: 67
Reputation: 1
Here is a great solution to this specific problem: http://topwebguy.com/first-and-last-in-mysql-a-working-solution/ It's almost as simple as using FIRST and LAST in MySQL.
I will include the code that actually provides the solution, but you can look up the whole text:
SELECT
word ,
(SELECT a.ip_addr FROM article a
WHERE a.word = article.word
ORDER BY a.updated LIMIT 1) AS first_ip,
(SELECT a.ip_addr FROM article a
WHERE a.word = article.word
ORDER BY a.updated DESC LIMIT 1) AS last_ip
FROM article GROUP BY word;
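Adapted to the symbols table from the question, the same correlated-subquery pattern might look like the sketch below. This is an illustration rather than code from the linked article; the derived table w and the date range are assumptions made for the example.

```sql
-- Sketch only: per-week aggregates from a derived table, with one correlated
-- subquery each for the first open and the last close.
-- The date range is a hypothetical placeholder.
SELECT
    w.year_week,
    w.low_price,
    w.high_price,
    (SELECT f.open FROM symbols f
     WHERE YEARWEEK(f.date) = w.year_week
       AND f.date BETWEEN '2010-01-01' AND '2010-12-31'
     ORDER BY f.date LIMIT 1) AS open,
    (SELECT l.close FROM symbols l
     WHERE YEARWEEK(l.date) = w.year_week
       AND l.date BETWEEN '2010-01-01' AND '2010-12-31'
     ORDER BY l.date DESC LIMIT 1) AS close
FROM (
    SELECT
        YEARWEEK(date)  AS year_week,
        MIN(low_price)  AS low_price,
        MAX(high_price) AS high_price
    FROM symbols
    WHERE date BETWEEN '2010-01-01' AND '2010-12-31'
    GROUP BY YEARWEEK(date)
) AS w;
```

Each subquery runs once per week, and YEARWEEK(f.date) cannot use a plain index on date, so this is likely to be slow on a large table.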
Upvotes: 0
Reputation: 475
Assuming that you want the ids of the records with the lowest low_price and the highest high_price, you could add these two columns to your query:
SELECT
(SELECT id ORDER BY low_price ASC LIMIT 1) low_price_id,
(SELECT id ORDER BY high_price DESC LIMIT 1) high_price_id,
MIN(low_price), MAX(high_price), open, close
FROM symbols
WHERE date BETWEEN(.. ..)
GROUP BY YEARWEEK(date)
If efficiency is an issue you should add a column for 'year_week', add some covering indexes, and split the query in two.
The 'year_week' column is just an INT set to the value of YEARWEEK(date) and updated whenever the 'date' column is updated. This way you don't have to recalculate it for each query and you can index it.
The new covering indexes should look like this. The ordering is important.
KEY yw_lp_id (year_week, low_price, id),
KEY yw_hp_id (year_week, high_price, id)
You should then use these two queries:
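One way to add the column and the two indexes is sketched below. The column and index names follow the text above; making the column nullable and backfilling it with a single UPDATE are assumptions made for the example.

```sql
-- Sketch only: add the precomputed week column and both covering indexes.
ALTER TABLE symbols
    ADD COLUMN year_week INT NULL,
    ADD KEY yw_lp_id (year_week, low_price, id),
    ADD KEY yw_hp_id (year_week, high_price, id);

-- Backfill existing rows. New and changed rows must keep year_week in sync
-- with date, either in application code or with triggers.
UPDATE symbols SET year_week = YEARWEEK(date);
```

On a large table both statements can take a while and may lock the table, so they are best run during a quiet period.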
SELECT
(SELECT id ORDER BY low_price ASC LIMIT 1) low_price_id,
MIN(low_price), open, close
FROM symbols
WHERE year_week BETWEEN(.. ..)
GROUP BY year_week
and
SELECT
(SELECT id ORDER BY high_price DESC LIMIT 1) high_price_id,
MAX(high_price), open, close
FROM symbols
WHERE year_week BETWEEN(.. ..)
GROUP BY year_week
Covering indexes are pretty useful. Check this out for more details.
Upvotes: -1
Reputation: 146607
Try this to start with:
Select YearWeek, Date, Min(Low_Price), Max(High_Price)
From
(Select YEARWEEK(date) YearWeek, Date, Low_Price, High_Price
From Symbols S
Where Date BETWEEN(.. ..)
GROUP BY YEARWEEK(date)) Z
Group By YearWeek, Date
Upvotes: 2