Reputation: 651
I have a schema which is similar to
"id": "uuid",
"deviceId": "uuid",
"message": {
"content": "string",
"ts": 1
},
"data": {
"temperature": 21
}
I'd like to get the latest "data" (using message.ts as the timestamp) for each "deviceId".
So far, I've managed to get the data back, in order of timestamp using the query
SELECT c.deviceId, c.message.ts, c.data FROM c ORDER BY c.message.ts DESC
but I can't figure out how to remove the duplicate device records.
Is this possible to do within the CosmosDB SQL Engine?
Upvotes: 3
Views: 7447
Reputation: 231
I agree with Steve Johnson's answer, but I think, at the time the answer was given Cosmos might not be supporting Joins, as of today it does.
You need to run 2 queries to do the same.
First, Run a query to get the max of every c.message.ts per device Id just like Steve Johnson's answer.
SELECT c.deviceId, max(c.message.ts) as lastest FROM c group by c.deviceId
Then, use the response from the above query with JOIN in Cosmos DB.
SELECT s.id
FROM
<Container_name> AS s
JOIN m IN (
SELECT VALUE [
{deviceId: 'uuid1', lastest: '1'},
{deviceId: 'uuid2', lastest: '2'}
]
)
WHERE
s.deviceId = m.deviceId
and s.message.ts = m.lastest
Learn more here, https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/query/subquery
Upvotes: 0
Reputation: 11
Another route you could take is to use a trigger function in CosmosDb. The benefit of this would be that you don't need to deploy an Azure function and you can just use SQL to get the most recent item. For example, when you get a new item, you can use a pre-trigger to set a field as follows: latest = true
, and at the same time change the previous most recent item's latest field to false. Then your SQL query would simply need WHERE latest = true
to return the most recent record for every item.
Here is a trigger function with the general idea:
function setLatest() {
var context = getContext();
var request = context.getRequest();
// item to be created in the current operation
var itemToCreate = request.getBody();
// validate properties
if (!("latest" in itemToCreate)) {
itemToCreate["latest"] = true;
}
// update the old latest to false
removeCurrentLatest(itemToCreate["id"],..., );
// save the newer item that will be created
request.setBody(itemToCreate);
}
function removeCurrentLatest(id, ...) {
var collection = getContext().getCollection();
var collectionLink = collection.getSelfLink();
// Get the document. We keep it in the same collection.
var isAccepted = collection.queryDocuments
(collectionLink, `SELECT * FROM root r WHERE r.id = "${id}" AND .... AND r.latest = true`,
function (err, feed, options) {
if (err) throw err;
if (feed && feed[0] != null)
{
var oldDoc = feed[0];
oldDoc.latest = false;
var isAccepted = collection.replaceDocument(oldDoc._self, oldDoc, function (err) {
if (err) throw err;
});
if (!isAccepted) throw new Error("The call replaceDocument(oldDoc) returned false.");
}
});
if (!isAccepted) throw new Error("The call queryDocuments for oldDoc returned false.");
}
I've removed some of the other conditions you may want to include to ensure you select the correct previous version of the item - hopefully it should be obvious how to add those for your specific asset. This is an excellent post on using triggers in CosmosDB: http://www.johndowns.co.nz/blog/2018/1/30/cosmos-db-server-side-programming-with-typescript-part-4-triggers
Upvotes: 1
Reputation: 651
Thanks to Mark Brown's comment, I found the following which seems to be the correct solution to this problem. Not as elegant as just using some SQL for a one-off but is really what was needed.
In essence, you create a Serverless Function which is triggered by the Cosmos change feed and updates a materialized view, which is essentially just a document with (in this case) the most up to date data
per deviceId
.
Specifically for this case, it'll most likely update the corresponding device
document with it's most recent data.
Upvotes: 0
Reputation: 8660
It is impossible to achieve this with one SQL by now.
May be this can be an alternative.
First, run this SQL SELECT c.deviceId,max(c.message.ts) as lastest FROM c group by c.deviceId
.
Then, you can get data by this SQL, SELECT * FROM c WHERE c.deviceId = 'xxx' AND c.message.ts = xxxx
Upvotes: 0