cerhovice
cerhovice

Reputation: 686

Find the last item in a JSON array in Postgres 9.4

We had a legacy system that attempted to keep track of all versions of data saved of a particular document. We originally stored our JSON as a string in some old version of Postgres, but recently we upgraded to Postgres 9.3 and we started using the JSON column type.

We had a column called "versions", and it held an an array, and each saved version of a particular document was stored in the array, so a query like this:

SELECT _data_as_json FROM measurements WHERE id = 3307551

returned JSON like this:

 {"reports": {}, "versions": [
 {"timestamp": "2014-04-28T19:12:31.567415", "user": 11327, "legacy": {}, "vd_version": 1}, 
 {"timestamp": "2014-05-12T18:03:24.417029", "user": 11331, "legacy": {"lengthmoment": {"moment": {"size": 130}, "length": {"in": 64.0}}, "comments": "", "custom": null}, "vd_version": 1}, 
 {"timestamp": "2014-05-12T21:52:50.045758", "user": 10373, "legacy": {"lengthmoment": {"moment": {"size": 130}, "length": {"in": 64.0}}, "comments": "", "custom": null}, "vd_version": 1}, 
 {"timestamp": "2014-05-14T23:34:37.797822", "user": 10380, "legacy": {"lengthmoment": {"moment": {"size": 130}, "length": {"in": 64.0}}, "comments": "", "custom": null}, "vd_version": 1}, 
 {"timestamp": "2014-07-16T14:56:38.667363", "user": 10374, "legacy": {"lengthmoment": {"moment": {"size": 130}, "length": {"in": 64.0}}, "comments": "", "custom": null}, "vd_version": 1}, 
 {"timestamp": "2014-07-16T14:57:47.341541", "user": 10374, "legacy": {"lengthmoment": {"moment": {"size": 130}, "length": {"in": 64.0}}, "comments": "", "custom": null}, "vd_version": 1}, 
 {"timestamp": "2014-07-17T16:32:09.067026", "user": 11331, "legacy": {"lengthmoment": {"moment": {"size": 130}, "length": {"in": 64.0}}, "comments": "", "custom": null}, "vd_version": 1}, 
 {"timestamp": "2014-09-11T14:35:44.436886", "user": 11331, "legacy": {"lengthmoment": {"moment": {"size": 130}, "length": {"in": 64.0}}, "comments": "", "custom": null}, "vd_version": 1}, 
 {"timestamp": "2014-10-15T14:30:50.554932", "user": 10383, "legacy": {"lengthmoment": {"moment": {"size": 130}, "length": {"in": 64.0}}, "comments": "", "custom": null}, "vd_version": 1}, 
 {"timestamp": "2014-10-29T15:36:35.183787", "user": 11331, "legacy": {"lengthmoment": {"moment": {"size": 130}, "length": {"in": 64.0}}, "comments": "", "custom": null}, "vd_version": 1}, 
 {"timestamp": "2014-11-12T22:22:03.892484", "user": 10373, "legacy": {"lengthmoment": {"moment": {"size": 130}, "length": {"in": 64.0}}, "comments": "", "custom": null}, "vd_version": 1}
 ]}     

We (attempted to) stored the data in "versions" in chronological order, but 99% of the time, we only need the last document. In Postgres 9.3, we came up with this query to get the last item:

SELECT json_array_elements(_data_as_json->'versions')
FROM measurements
WHERE id = 3307551
LIMIT 1 OFFSET (SELECT json_array_length(_data_as_json->'versions') - 1 FROM measurements WHERE id = 3307551)

This basically works, but it is a little fragile. If we ever fail to order things correctly in the versions array, then we get back the wrong version of the document. I'm curious if there is a better way to do this? I have read that Postgres 9.4 offers more functions for dealing with JSON.

Ideally, we could do ORDER BY on the "timestamp". Is that possible?

Upvotes: 19

Views: 16598

Answers (1)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 659367

Postgres 9.5+

The job is simple now since, quoting the manual:

The field/element/path extraction operators that accept integer JSON array subscripts all support negative subscripting from the end of arrays.

Bold emphasis mine. So for either json or jsonb:

SELECT data->'versions'->>-1
FROM   measurements m
WHERE  id = 3307551;

Postgres 9.4

You may want to use jsonb instead of json. Use jsonb_array_elements() or jsonb_array_length() accordingly.

There is a general approach to get the last element according to original sort order using WITH ORDINALITY (slightly slower without):

SELECT v.ver
FROM   measurements m
     , jsonb_array_elements(m.data->'versions') WITH ORDINALITY v(ver, ord)
WHERE  m.id = 3307551
ORDER  BY v.ord DESC
LIMIT  1;

Details for WITH ORDINALITY (and the implicit JOIN LATERAL in both versions):

Postgres 9.3

"last" according to the timestamp value:

SELECT v.ver
FROM   measurements m
     , json_array_elements(m.data->'versions') v(ver)
WHERE  m.id = 3307551
ORDER  BY  (v.ver->>'timestamp')::timestamp DESC
LIMIT  1;

"last" according to ordinal position in the json array (faster):

SELECT data->'versions'->(json_array_length(data->'versions') - 1)
FROM   measurements
WHERE  id = 3307551;

We need - 1 because JSON arrays start at offset 0.

db<>fiddle here
Old sqlfiddle

Upvotes: 44

Related Questions