Reputation: 840
I have multiple devices which generates data using their sensors and send to my sever. On the server side telemetry data saved in json format in the postgresql database. Sensors data saved in a partitioned table(named statuses
) which partitioned on time_stamp
monthly and device_id
. Also device info saved in a regular table (named devices
).
As an example of status data of two months we have:
statuses_p2023_06
time_stamp | device_id | status
2023-06-21 06:17:46.409+00 | 2 | {"humidity": "17", "wind":"14"}
2023-06-21 06:15:46.409+00 | 1 | {"temperature": "22", "wind":"14"}
2023-06-21 06:15:26.409+00 | 2 | {"humidity": "15", "wind": "12"}
2023-06-21 06:13:46.409+00 | 1 | {"temperature": "23", "humidity":"9"}
.
.
.
statuses_p2023_07
time_stamp | device_id | status
2023-07-11 08:17:46.409+00 | 1 | {"temperature": "23", "humidity":"9"}
2023-07-11 08:15:46.409+00 | 1 | {"temperature": "24"}
2023-07-11 08:15:26.409+00 | 2 | {"humidity": "10", "wind":"11"}
2023-07-11 08:13:46.409+00 | 1 | {"wind": "12"}
Since there is frequent queries like SELECT * FROM statuses WHERE time_stamp > '2023-07-01' and device_id=1 and status->>'temperature' > 22;
we must could search in json data efficiently. But as we know search on json data in postgresql is not very efficient(i.e I could not create index in on every property). So I used debezium as a CDC tool to transform data from postgresql to mongodb in order to search in json data. Debezium aggregates partitioned statuses
data into a single table and save on the mongodb. So now I could search more effiecienlty in the status data of devices but what about partitioning of data in the long time?
The problem is Debezium transformed data with upsert
write policy and mongodb timeseries does not support upsert
.
Is there any way to change debezium write policy or handle partitioning with upsert
in mongodb?
Upvotes: 2
Views: 132
Reputation: 141
There is no way to "change" Debezium "write policy". Even if there would be a switch for changing the "write policy", MongoDB still could not upsert time series data.
There is no way to handle partitioning with upsert
in MongoDB, as you pointed out.
Seems like searching with MongoDB has solved your search performance problem. If your partitioning of data in the "long time" means performance is not the focus anymore, you should upsert
your time series data in PostgreSQL with INSERT ON CONFLICT
Upvotes: -1