elolozone

Reputation: 71

Find last record of each day

I store data about my power consumption. A new record is added every minute; here is an example:

{"date":1393156826114,"id":"5309d4cae4b0fbd904cc00e1","adco":"O","hchc":7267599,"hchp":10805900,"hhphc":"g","ptec":"c","iinst":13,"papp":3010,"imax":58,"optarif":"s","isousc":60,"motdetat":"Á"}

so I have around 1440 records a day.

How can I get the last record of each day?


Note: I use MongoDB with Spring in Java, so I need a query like this:

Example that gets all measures after a given date:

@Query("{ 'date' : { $gt : ?0 }}")
public List<Mesure> findByDateGreaterThan(Date date, Sort sort);

Upvotes: 7

Views: 3597

Answers (2)

Xavier Guihot

Reputation: 61686

It's also possible to format timestamps in the group key as %Y-%m-%d (e.g. 2021-12-05) with dateToString:

// { timestamp: 1638697946000, value: "a" } <= 2021-12-05 9:52:26
// { timestamp: 1638686311000, value: "b" } <= 2021-12-05 6:38:31
// { timestamp: 1638859111000, value: "c" } <= 2021-12-07 6:38:31
db.collection.aggregate([

  { $sort: { timestamp: 1 } },
  // { timestamp: 1638686311000, value: "b" }
  // { timestamp: 1638697946000, value: "a" }
  // { timestamp: 1638859111000, value: "c" }

  { $group: {
    _id: { $dateToString: { date: { $toDate: "$timestamp" }, format: "%Y-%m-%d" } },
    last: { $last: "$$ROOT" }
  }},
  // { _id: "2021-12-07", last: { timestamp: 1638859111000, value: "c" } }
  // { _id: "2021-12-05", last: { timestamp: 1638697946000, value: "a" } }

  { $replaceWith: "$last" }
])
// { timestamp: 1638697946000, value: "a" } <= 2021-12-05 9:52:26
// { timestamp: 1638859111000, value: "c" } <= 2021-12-07 6:38:31

This:

  • first $sorts documents in chronological order of their timestamps, so that we can later pick the newest document based on that order.

  • then $groups documents by their %Y-%m-%d-formatted timestamps:

    • by first converting the timestamp into a datetime: { $toDate: "$timestamp" }
    • and then converting the associated datetime into a string only representing the year, month and day: { $dateToString: { date: ..., format: "%Y-%m-%d" } }
    • such that for each group (i.e. date), we can pick the $last (i.e. newest since chronologically sorted) matching document
      • and the pick is the whole document as represented by $$ROOT
  • finally cleans up the group result with a $replaceWith stage (alias for $replaceRoot).
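
The steps above can also be traced without a database. The following is a minimal in-memory sketch in plain JavaScript, not MongoDB code: `lastPerDay` is a hypothetical helper name, and it assumes UTC days (which is what $dateToString uses when no timezone is given).

```javascript
// In-memory sketch of the $sort / $group / $last / $replaceWith steps.
// `lastPerDay` is a hypothetical helper, not a MongoDB or driver API.
function lastPerDay(docs) {
  // $sort: { timestamp: 1 }. Copy first so the input array is not mutated.
  const sorted = [...docs].sort((a, b) => a.timestamp - b.timestamp);

  // $group keyed by a "%Y-%m-%d" string. The first 10 characters of
  // toISOString() are the UTC year, month and day.
  const groups = new Map();
  for (const doc of sorted) {
    const day = new Date(doc.timestamp).toISOString().slice(0, 10);
    groups.set(day, doc); // a later document overwrites an earlier one, like $last
  }

  // $replaceWith: "$last". Return the documents themselves, not the groups.
  return [...groups.values()];
}

const docs = [
  { timestamp: 1638697946000, value: "a" },
  { timestamp: 1638686311000, value: "b" },
  { timestamp: 1638859111000, value: "c" },
];
const result = lastPerDay(docs);
console.log(result);
```

With the three sample documents from the answer, this keeps "a" for 2021-12-05 and "c" for 2021-12-07.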

Upvotes: 0

Neil Lunn

Reputation: 151132

A bit more modern than the original answer:

db.collection.aggregate([
  { "$sort": { "date": 1 } },
  { "$group": {
    "_id": {
      "$subtract": ["$date",{"$mod": ["$date",86400000]}]
    },
    "doc": { "$last": "$$ROOT" }
  }},
  { "$replaceRoot": { "newDocument": "$doc" } }
])

The same principle applies: you $sort the collection, then $group on the required grouping key and pick up the $last data from each grouping boundary.

What is clearer now than in the original writing is that you can use $$ROOT instead of specifying every document property, and the $replaceRoot stage lets you restore that data fully to the original document form.

But the general solution is still to $sort first, then $group on the required common key and keep the $last or $first value, depending on the sort order, from each grouping boundary for the properties that are required.
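
To illustrate the "$last or $first depending on sort order" point, here is a small in-memory sketch in plain JavaScript (not MongoDB code). `pickPerDay` is a hypothetical helper, and the second and third sample records are made up for illustration; only the first `date` and `papp` come from the question.

```javascript
// Hypothetical helper: sort by `date`, group by UTC day, keep one document per day.
const DAY_MS = 86400000;

function pickPerDay(docs, direction, pick) {
  // direction: 1 for ascending, -1 for descending (like { date: 1 } or { date: -1 })
  const sorted = [...docs].sort((a, b) => direction * (a.date - b.date));
  const groups = new Map();
  for (const doc of sorted) {
    const key = doc.date - (doc.date % DAY_MS); // same math as $subtract / $mod
    // "last": always overwrite; "first": keep only the first document seen
    if (pick === "last" || !groups.has(key)) groups.set(key, doc);
  }
  return groups;
}

const sample = [
  { date: 1393156826114, papp: 3010 }, // from the question
  { date: 1393156886114, papp: 2990 }, // assumed: one minute later, same day
  { date: 1393243226114, papp: 1200 }, // assumed: exactly one day later
];

const ascLast = pickPerDay(sample, 1, "last");
const descFirst = pickPerDay(sample, -1, "first");
console.log(ascLast.get(1393113600000), descFirst.get(1393113600000));
```

Both calls select the same document for each day; only the order in which the groups are produced differs.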

Also, for BSON Dates as opposed to the numeric timestamp value in the question, see "Group result by 15 minutes time interval in MongoDb" for different approaches to accumulating over different time intervals while actually using and returning BSON Date values.


Not quite sure what you are going for here but you could do this in aggregate if my understanding is right. So to get the last record for each day:

db.collection.aggregate([
    // Sort by date, ascending
    {"$sort": { "date": 1 } },

    // Date math converts to whole day
    {"$project": {
        "adco": 1,
        "hchc": 1,
        "hchp": 1,
        "hhphc": 1,
        "ptec": 1,
        "iinst": 1,
        "papp": 1,
        "imax": 1,
        "optarif": 1,
        "isousc": 1,
        "motdetat": 1,
        "date": 1,
        "wholeDay": {"$subtract": ["$date",{"$mod": ["$date",86400000]}]} 
    }},

    // Group on wholeDay ( _id insertion is monotonic )
    {"$group": {
        "_id": "$wholeDay",
        "docId": {"$last": "$_id" },
        "adco": {"$last": "$adco" },
        "hchc": {"$last": "$hchc" },
        "hchp": {"$last": "$hchp" },
        "hhphc": {"$last": "$hhphc" },
        "ptec": {"$last": "$ptec" },
        "iinst": {"$last": "$iinst" },
        "papp": {"$last": "$papp" },
        "imax": {"$last": "$imax" },
        "optarif": {"$last": "$optarif" },
        "isousc": {"$last": "$isousc" },
        "motdetat": {"$last": "$motdetat" },
        "date": {"$last": "$date" }
    }}
])

So the principle here is that, given the timestamp value, you do the date math to project it as the midnight time at the beginning of its day. Then, since the documents are sorted by date and the _id key on the document is already monotonic (always increasing), you simply group on the wholeDay value while pulling the $last document from each grouping boundary.
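
As a worked example of that date math, here is the same arithmetic in plain JavaScript applied to the "date" value from the sample record in the question. The variable names are only illustrative.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000; // 86400000, the constant used in the pipeline
const date = 1393156826114;             // "date" from the sample record in the question

const sinceMidnight = date % MS_PER_DAY; // the $mod part
const wholeDay = date - sinceMidnight;   // the $subtract part

console.log(sinceMidnight);                    // 43226114
console.log(wholeDay);                         // 1393113600000
console.log(new Date(wholeDay).toISOString()); // 2014-02-23T00:00:00.000Z
```

Note that the result is midnight UTC, so the day boundaries are UTC days rather than local ones.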

If you don't need all the fields then only project and group on the ones you want.

And yes, you can do this in the Spring Data framework. I'm sure there is a wrapped command in there. But otherwise, the incantation to get to the native command goes something like this:

mongoOps.getCollection("yourCollection").aggregate( ... )

For the record, if you actually had BSON date types rather than a timestamp as a number, then you can skip the date math:

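db.collection.aggregate([
    // $last only means "newest" if the documents arrive in date order
    { "$sort": { "date": 1 } },

    // Group on the calendar day taken directly from the BSON date
    { "$group": { 
        "_id": { 
            "year": { "$year": "$date" },
            "month": { "$month": "$date" },
            "day": { "$dayOfMonth": "$date" }
        },
        "hchp": { "$last": "$hchp" }
    }}
])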

Upvotes: 6
