Reputation: 357
I'm using the Cloudant library to retrieve documents from a Cloudant database. Every time I run the Python script I get all the documents, but I would like to retrieve only the documents added since the last execution of the script; in other words, a get_changes function.
I have searched for an answer, but it does not seem easy to find.
Thanks for any help,
Filippo.
Upvotes: 1
Views: 493
Reputation: 3737
Use the changes() method. Keep track of the last sequence id, and restart from there to retrieve only the unseen changes.
# Iterate over a "normal" _changes feed
changes = db.changes()
for change in changes:
    print(change)

# ...time passes

# Resume from the last sequence id seen by the first feed
new_changes = db.changes(since=changes.last_seq)
for new_change in new_changes:
    print(new_change)
If you also want the doc body, you can pass include_docs=True.
See https://github.com/cloudant/python-cloudant/blob/master/src/cloudant/database.py#L458
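Since the question is about resuming between separate runs of a script, the last sequence id has to be stored somewhere outside the process. The following is a minimal sketch of one way to do that, not part of the library: the helper names (load_last_seq, save_last_seq, fetch_new_changes) and the state file last_seq.json are hypothetical, and db is assumed to be a python-cloudant database object whose changes() feed exposes last_seq once it has been fully iterated.

```python
import json
import os


def load_last_seq(path):
    """Return the sequence id saved by the previous run, or None on the first run."""
    if not os.path.exists(path):
        return None
    with open(path) as fh:
        return json.load(fh).get("last_seq")


def save_last_seq(path, seq):
    """Persist the sequence id so the next run can resume from it."""
    with open(path, "w") as fh:
        json.dump({"last_seq": seq}, fh)


def fetch_new_changes(db, path="last_seq.json"):
    """Return the changes since the previous run and record the new position.

    `db` is assumed to be a python-cloudant database object; `path` is a
    hypothetical location for the state file.
    """
    since = load_last_seq(path)
    # Only pass `since` when a value was saved; otherwise read from the start.
    kwargs = {"since": since} if since is not None else {}
    feed = db.changes(include_docs=True, **kwargs)
    # Consume the feed first: last_seq is assumed to be set only at the end.
    changes = list(feed)
    if feed.last_seq is not None:
        save_last_seq(path, feed.last_seq)
    return changes
```

Each run of the script would then call fetch_new_changes(db) and process only what it returns. A local JSON file is just one option; the sequence id could equally be kept in a document in the database itself.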
If you want to capture only new additions (as opposed to all changes), you can either create a filter function in a db design doc along the lines of:
function(doc, req) {
    // Skip deleted docs
    if (doc._deleted) {
        return false;
    }
    // Skip design docs
    if (doc._id.startsWith('_design')) {
        return false;
    }
    // Skip updates
    if (!doc._rev.startsWith('1-')) {
        return false;
    }
    return true;
}
and apply that to the changes feed:
new_changes = db.changes(since=changes.last_seq, filter='myddoc/myfilter')
for new_change in new_changes:
    # do stuff here
    print(new_change)
but it is probably just as easy to fetch all the changes and filter them in the Python code.
Filter functions: https://console.bluemix.net/docs/services/Cloudant/guides/replication_guide.html#filtered-replication
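The Python-side filtering mentioned above could look roughly like the sketch below. It mirrors the three checks of the JavaScript filter, applied to the rows of the changes feed. The function name is_new_addition is hypothetical, and each row is assumed to be a dict in the standard _changes format, with an "id", a "changes" list of {"rev": ...} entries, and an optional "deleted" flag.

```python
def is_new_addition(change):
    """Return True if a _changes row looks like a newly created document."""
    # Skip deleted docs
    if change.get("deleted"):
        return False
    # Skip design docs
    if change["id"].startswith("_design/"):
        return False
    # Skip updates: keep only rows whose revisions are all first generation
    revs = change.get("changes", [])
    return bool(revs) and all(r["rev"].startswith("1-") for r in revs)


def only_new_additions(changes):
    """Filter an iterable of _changes rows down to newly created documents."""
    return [change for change in changes if is_new_addition(change)]
```

For example, only_new_additions(db.changes(since=last_seq)) would keep just the created documents. As with the filter function, a document that was created and then updated between two runs appears in the feed only with its latest revision, so it would be skipped by this check.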
Upvotes: 2