Philipp Schmid
Philipp Schmid

Reputation: 5828

Version Control Algorithm

I have a database where I store objects. I have the following (simplified) schema

CREATE TABLE MyObjects
(
  UniqueIdentifier Id;
  BigInt           GenerationId;
  BigInt           Value;
  Bit              DeleteAction;
)

Each object has a unique identifier ("Id"), and a (set of) property ("Value"). Each time the value of the property for an object is changed, I enter a new row into this table with a new generation id ("GenerationId", which is monotonically increasing). If an object is deleted, then I record this fact by setting the "DeleteAction" bit to true.

At any point in time (generation), I would like to retrieve the state of all of my active objects!

Here's an example:

Id    GenerationId Value  DeleteAction
1        1          99       false
2        1          88       false
1        2          77       false
2        3          88       true

Objects in generations are:

  1: 1 {99}, 2 {88}
  2: 1 {77}, 2 {88}
  3: 1 {77}

The key is: how can I find out the row for each unique object who's generation id is closest (but not exceeding) to a given generation id? I can then do a post-filter step to remove all rows where the DeleteAction field is true.

Upvotes: 3

Views: 830

Answers (4)

Philipp Schmid
Philipp Schmid

Reputation: 5828

Here's the working version:

SELECT MyObjects.Id,Value
FROM Myobjects
INNER JOIN 
(      
  SELECT Id, max(GenerationId) as LastGen
  FROM MyObjects
  WHERE GenerationId <= @TargetGeneration
  Group by Id
) T1
ON MyObjects.Id = T1.Id AND MyObjects.GenerationId = LastGen
WHERE DeleteAction = 'False'

Upvotes: 1

bortzmeyer
bortzmeyer

Reputation: 35459

My version uses a joint of the table MyObjects against a subset of itself, created by a subquery, and containing only the last generation for each object:

SELECT O.id,generation,value FROM 
     MyObjects O, 
     (SELECT id,max(generation) AS max_generation FROM MyObjects 
     WHERE generation <= $GENERATION_ID GROUP BY id) AS TheMax WHERE 
            TheMax.max_generation = generation AND O.deleted is False
     ORDER BY generation DESC;

In the above query, the GENERATION_ID is hardwired. A way to parametrize it is to write a function:

CREATE OR REPLACE FUNCTION generation_objects(INTEGER) RETURNS SETOF MyObjects AS
  'SELECT O.id,generation,value,deleted FROM 
       MyObjects O, 
       (SELECT id,max(generation) AS max_generation FROM MyObjects 
       WHERE generation <= $1 GROUP BY id) AS TheMax WHERE 
              TheMax.max_generation = generation AND O.deleted is False;'
  LANGUAGE SQL;

Now, it works. With this table:

> SELECT * FROM MyObjects;          
 id | generation | value | deleted 
----+------------+-------+---------
  1 |          1 |    99 | f
  2 |          2 |    88 | f
  1 |          3 |    77 | f
  2 |          4 |    88 | t
  3 |          5 |    33 | f
  4 |          6 |    22 | f
  3 |          7 |    11 | f
  2 |          8 |    11 | f

I get:

> SELECT * FROM generation_objects(1) ORDER by generation DESC;
 id | generation | value | deleted 
----+------------+-------+---------
  1 |          1 |    99 | f

> SELECT * FROM generation_objects(2) ORDER by generation DESC;
 id | generation | value | deleted 
----+------------+-------+---------
  2 |          2 |    88 | f
  1 |          1 |    99 | f

> SELECT * FROM generation_objects(3) ORDER by generation DESC;
 id | generation | value | deleted 
----+------------+-------+---------
  1 |          3 |    77 | f
  2 |          2 |    88 | f

And then, at the following generation, object 2 is deleted:

> SELECT * FROM generation_objects(4) ORDER by generation DESC;
 id | generation | value | deleted 
----+------------+-------+---------
  1 |          3 |    77 | f

Upvotes: 2

Johan Buret
Johan Buret

Reputation: 2634

This works in MS SQL

SELECT id,value
FROM Myobjects
INNER JOIN ( 
     SELECT id, max(GenerationID) as LastGen 
     FROM MyObjects
     WHERE GenerationID <= @Wantedgeneration
     Group by ID)
    On GenerationID = LastGen
WHERE DelectedAction = false

Upvotes: 4

Martin v. L&#246;wis
Martin v. L&#246;wis

Reputation: 127457

Not sure whether that's standard SQL, but in Postgres, you can use the LIMIT flag:

 select GenerationId,Value,DeleteAction from MyObjects 
    where Id=1 and GenerationId < 3 
    order by GenerationId
    limit 1;

Upvotes: 0

Related Questions