JBaczuk

Reputation: 14639

How to prevent duplicates in mongodb time series collection

Problem

Sensors check in periodically, but network connectivity issues may cause them to check in with the same data more than once.

MongoDB does not allow the unique property on secondary indexes for time series collections (as of MongoDB 5.0); see the Timeseries Limitations documentation.

In addition, calculations need to be done on the data (preferably using aggregations) that involve counting the number of entries, and those counts will be inaccurate if there are duplicates. Not to mention that duplicates bloat the database and are just messy.

Question

Is there any way to prevent duplicate entries in a MongoDB Timeseries collection?

Upvotes: 2

Views: 1877

Answers (1)

Goums

Reputation: 126

I'm having the same issue.

According to an official answer on the MongoDB Community forums, there is no way to enforce unique values in a timeseries collection.

You can read the full explanation here: https://www.mongodb.com/community/forums/t/duplicate-data-issue/135023

They consider it a caveat of timeseries collections compared to normal collections. IMO, it's a crucial gap in the timeseries capability of MongoDB...

There are currently two available solutions:

  1. Use a "normal" collection with a compound unique index on your timestamp and sensor_id fields
  2. Keep using a timeseries collection, but query your data only through an aggregation pipeline with a $group stage to eliminate duplicate entries
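
Here is a rough sketch of both options in Python, in case it helps. It assumes the pymongo driver and documents that carry the `sensor_id` and `timestamp` fields mentioned above; the helper names (`ensure_unique_index`, `dedup_stages`, `count_per_sensor_pipeline`) are made up for illustration. The pipeline is built as plain dicts, so nothing here needs a running server until you actually call it on a collection.

```python
from typing import Any, Dict, List

# Field names taken from the answer above; adjust them to your schema.
SENSOR_FIELD = "sensor_id"
TIME_FIELD = "timestamp"


def ensure_unique_index(collection: Any) -> str:
    """Option 1: compound unique index on a regular (non-timeseries) collection.

    `collection` is assumed to be a pymongo Collection. Once the index exists,
    inserting a second document with the same sensor_id and timestamp is
    rejected with a duplicate key error.
    """
    return collection.create_index(
        [(SENSOR_FIELD, 1), (TIME_FIELD, 1)],
        unique=True,
    )


def dedup_stages() -> List[Dict[str, Any]]:
    """Option 2: stages that keep one document per (sensor_id, timestamp)."""
    return [
        {
            "$group": {
                # All documents sharing this key collapse into one group.
                "_id": {
                    SENSOR_FIELD: f"${SENSOR_FIELD}",
                    TIME_FIELD: f"${TIME_FIELD}",
                },
                # Keep one full document per group. Without a preceding $sort
                # the choice is arbitrary, which is fine if duplicates are
                # identical copies.
                "doc": {"$first": "$$ROOT"},
            }
        },
        # Restore the original document shape for later stages.
        {"$replaceRoot": {"newRoot": "$doc"}},
    ]


def count_per_sensor_pipeline() -> List[Dict[str, Any]]:
    """Count entries per sensor after duplicates have been removed."""
    return dedup_stages() + [
        {"$group": {"_id": f"${SENSOR_FIELD}", "count": {"$sum": 1}}},
    ]
```

With a timeseries collection called, say, `readings` (hypothetical name), the count would be run as `db.readings.aggregate(count_per_sensor_pipeline())`. Keep in mind that option 2 only hides duplicates at query time; they still take up space in the collection.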

Upvotes: 3
