Reputation: 2777
My table stores revision data for my CMS entries. Each entry has an ID and a revision date, and there are multiple revisions:
Table: old_revisions
+----------+---------------+-----------------------------------------+
| entry_id | revision_date | entry_data |
+----------+---------------+-----------------------------------------+
| 1 | 1302150011 | I like pie. |
| 1 | 1302148411 | I like pie and cookies. |
| 1 | 1302149885 | I like pie and cookies and cake. |
| 2 | 1288917372 | Kittens are cute. |
| 2 | 1288918782 | Kittens are cute but puppies are cuter. |
| 3 | 1288056095 | Han shot first. |
+----------+---------------+-----------------------------------------+
I want to transfer some of this data to another table:
Table: new_revisions
+--------------+----------------+
| new_entry_id | new_entry_data |
+--------------+----------------+
| | |
+--------------+----------------+
I want to transfer entry_id
and entry_data
to new_entry_id
and new_entry_data
. But I only want to transfer the most recent version of each entry.
I got as far as this query:
INSERT INTO new_revisions (
new_entry_id,
new_entry_data
)
SELECT
entry_id,
entry_data,
MAX(revision_date)
FROM old_revisions
GROUP BY entry_id
But I think the problem is that I'm trying to insert 3 columns of data into 2 columns.
How do I transfer the data based on the revision date without transferring the revision date as well?
Upvotes: 0
Views: 84
Reputation: 247810
You can use the following query:
insert into new_revisions (new_entry_id, new_entry_data)
select o1.entry_id, o1.entry_data
from old_revisions o1
inner join
(
select max(revision_date) maxDate, entry_id
from old_revisions
group by entry_id
) o2
on o1.entry_id = o2.entry_id
and o1.revision_date = o2.maxDate
See SQL Fiddle with Demo. This query gets the max(revision_date)
for each entry_id
and then joins back to your table on both the entry_id
and the max date to get the rows to be inserted.
Please note that the subquery is only returning the entry_id
and date, this is because we want to apply the GROUP BY
to the items in the select list that are not in an aggregate function. MySQL uses an extension to the GROUP BY
clause that allows columns in the select list to be excluded in a group by and aggregate but this could causes unexpected results. By only including the columns needed by the aggregate and the group by will ensure that the result is the value you want. (see MySQL Extensions to GROUP BY)
From the MySQL Docs:
MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. ... You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause. Sorting of the result set occurs after values have been chosen, and ORDER BY does not affect which values the server chooses.
Upvotes: 3
Reputation: 21047
If you want to enter the last entry you need to filter it before:
select entry_id, max(revision_date) as maxDate
from old_revisions
group by entry_id;
Then use this as a subquery to filter the data you need:
insert into new_revisions (new_entry_id, new_entry_data)
select entry_id, entry_data
from old_revisions as o
inner join (
select entry_id, max(revision_date) as maxDate
from old_revisions
group by entry_id
) as a on o.entry_id = a.entry_id and o.revision_date = a.maxDate
Upvotes: 2