Reputation: 11592
Background: I am storing some tweets in CouchDB, and I want to prevent duplicate tweets from being stored by using the Twitter id as the CouchDB document id.
I am using a Python script to extract tweets from Twitter with the python-twitter library. It returns a collection of objects, each of which exposes the unique tweet id as a property (twitter.Status.id). I would like to use this as the CouchDB document id when saving the tweet.
>>> import twitter
>>> api = twitter.Api()
>>> statuses = api.GetSearch('xyz')
>>> s = statuses[0] # save just the first one for now
>>> import couchdb
>>> couch = couchdb.Server()
>>> dbcouch = couch['tweets']
>>> dbcouch.save(s.AsDict())
(u'fd55e5944267266892f076891a3d9ac4', u'1-4a50a618afd4dc68373155b1ad3e96a1')
CouchDB has generated its own unique id, which is not what I want. The documentation has examples of setting a document id manually when the document is built from scratch, but in this case (where the object is handed to me) I can't seem to make it work.
Upvotes: 3
Views: 235
Reputation: 73752
You need to put an "_id" field into the status dict before saving it to CouchDB. For example (my Python is rusty):
>>> s = statuses[0] # save just the first one for now
>>> tweet = s.AsDict()
>>> tweet["_id"] = "%d" % tweet["id"]  # CouchDB document ids must be strings
>>> # (Couch stuff same as before)
>>> dbcouch.save(tweet)
Does that work for you?
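The same idea can be wrapped in a small helper, which may be handy when looping over all the statuses. This is only a sketch: `with_couch_id` is a hypothetical name, not part of python-twitter or couchdb-python, and the sample dict is a made-up stand-in for what `s.AsDict()` returns.

```python
def with_couch_id(tweet, id_field="id"):
    """Return a copy of the tweet dict with "_id" set from id_field.

    Hypothetical helper; assumes the dict carries the tweet id under "id".
    """
    doc = dict(tweet)  # shallow copy, so the caller's dict is left untouched
    doc["_id"] = str(doc[id_field])  # CouchDB document ids must be strings
    return doc


# Made-up stand-in for s.AsDict(); real tweets have many more fields.
sample = {"id": 1234567890, "text": "hello"}
doc = with_couch_id(sample)
```

Passing `doc` to `dbcouch.save(...)` should then store it under the tweet id. Saving a second document with the same "_id" and no matching "_rev" is rejected by CouchDB as a conflict (HTTP 409), which is what prevents the duplicates; as far as I know couchdb-python surfaces this as `couchdb.http.ResourceConflict`, so the script can catch that and skip the tweet.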
Upvotes: 2