Bryan
Bryan

Reputation: 2305

How can I aggregate and collapse rows in a database table using SQL?

I have a table with example data as shown below.

word       | last_seen  | first_seen | count
-----------|------------|------------|------
definition | 2014-09-08 | 2012-01-02 | 15
definition | 2014-10-11 | 2013-05-12 | 35
attribute  | 2013-07-23 | 2010-06-29 | 22

I'm wanting to to an in-place aggregation of the data, hopefully just using SQL, where the data for repeated words is such that I end up with MAX(last_seen), MIN(first_seen), and SUM(count).

word       | last_seen  | first_seen | count
-----------|------------|------------|------
definition | 2014-10-11 | 2012-01-02 | 50
attribute  | 2013-07-23 | 2010-06-29 | 22

I know I can see the results of the aggregation with the following:

SELECT 
  word, 
  MAX(last_seen) AS last_seen, 
  MIN(first_seen) AS first_seen, 
  SUM(count) AS count 
FROM 
  words 
GROUP BY word;

However, I don't just want to see the resulting aggregation... I want to actually update the words table, replacing the rows that have duplicate word column entries with the aggregated data.

Upvotes: 0

Views: 1464

Answers (2)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 657912

Actually you can do that in a single statement using data-modifying CTEs:

WITH del AS (
   DELETE FROM words w
   WHERE EXISTS (
      SELECT 1 
      FROM   words w1
      WHERE  w1.word = w.word
      AND    w1.ctid <> w.ctid
     )
   RETURNING *
   )
INSERT INTO words(word, last_seen, first_seen, count)
SELECT word, MAX(last_seen), MIN(first_seen), SUM(count)
FROM   del
GROUP  BY word;

Should be rather efficient, too.

SQL Fiddle.

About ctid:

About CTEs:

Upvotes: 0

JNevill
JNevill

Reputation: 50200

As far as I'm aware there is no "Edit in place" in Postgresql (or any other traditional RDBMS that I can think of). Instead:

  1. Take the results of your query and dump them into a temp table: CREATE TEMP TABLE <temptable> AS <Your Query> WITH DATA
  2. Delete out everything in your word table: TRUNCATE word; <--This is the scary part so make sure you are cool with your query before truncating.
  3. Insert the records in your temp table into the now empty word table: INSERT INTO word SELECT * FROM <temptable>;
  4. Optionally: Drop your temp table DROP TABLE <temptable>; (being a temp table it will drop automagically when you end your session, but I'm a fan of being explicit)

Upvotes: 1

Related Questions