Reputation: 2305
I have a table with example data as shown below.
word | last_seen | first_seen | count
-----------|------------|------------|------
definition | 2014-09-08 | 2012-01-02 | 15
definition | 2014-10-11 | 2013-05-12 | 35
attribute | 2013-07-23 | 2010-06-29 | 22
I'm wanting to to an in-place aggregation of the data, hopefully just using SQL, where the data for repeated words is such that I end up with MAX(last_seen)
, MIN(first_seen)
, and SUM(count)
.
word | last_seen | first_seen | count
-----------|------------|------------|------
definition | 2014-10-11 | 2012-01-02 | 50
attribute | 2013-07-23 | 2010-06-29 | 22
I know I can see the results of the aggregation with the following:
SELECT
word,
MAX(last_seen) AS last_seen,
MIN(first_seen) AS first_seen,
SUM(count) AS count
FROM
words
GROUP BY word;
However, I don't just want to see the resulting aggregation... I want to actually update the words
table, replacing the rows that have duplicate word
column entries with the aggregated data.
Upvotes: 0
Views: 1464
Reputation: 657912
Actually you can do that in a single statement using data-modifying CTEs:
WITH del AS (
DELETE FROM words w
WHERE EXISTS (
SELECT 1
FROM words w1
WHERE w1.word = w.word
AND w1.ctid <> w.ctid
)
RETURNING *
)
INSERT INTO words(word, last_seen, first_seen, count)
SELECT word, MAX(last_seen), MIN(first_seen), SUM(count)
FROM del
GROUP BY word;
Should be rather efficient, too.
About ctid
:
About CTEs:
Upvotes: 0
Reputation: 50200
As far as I'm aware there is no "Edit in place" in Postgresql (or any other traditional RDBMS that I can think of). Instead:
CREATE TEMP TABLE <temptable> AS <Your Query> WITH DATA
word
table: TRUNCATE word;
<--This is the scary part so make sure you are cool with your query before truncating.word
table: INSERT INTO word SELECT * FROM <temptable>;
DROP TABLE <temptable>;
(being a temp table it will drop automagically when you end your session, but I'm a fan of being explicit)Upvotes: 1