Reputation: 9291
I'm new to Python (I've been working with it for only a few hours). I'm trying to read Twitter data and store it in a MongoDB database, but I get the following error:
Traceback (most recent call last):
  File "twit_test.py", line 8, in on_receive
    db.posts.insert(data)
  File "/Library/Python/2.6/site-packages/pymongo-2.0.1-py2.6-macosx-10.6-universal.egg/pymongo/collection.py", line 274, in insert
  File "/Library/Python/2.6/site-packages/pymongo-2.0.1-py2.6-macosx-10.6-universal.egg/pymongo/database.py", line 249, in _fix_incoming
  File "/Library/Python/2.6/site-packages/pymongo-2.0.1-py2.6-macosx-10.6-universal.egg/pymongo/son_manipulator.py", line 73, in transform_incoming
TypeError: 'str' object does not support item assignment
Traceback (most recent call last):
  File "twit_test.py", line 17, in <module>
    conn.perform()
My code is very simple:
import json

import pycurl
from pymongo import Connection

STREAM_URL = "https://stream.twitter.com/1/statuses/sample.json"
USER = "XXXXXXXX"
PASS = "XXXXXXXX"

# Connect to the local MongoDB server and select the "test" database
connection = Connection()
db = connection.test

def on_tweet(data):
    # Decode the raw JSON text into a dict before inserting it
    tweet = json.loads(data)
    db.posts.insert(tweet)

# Stream the sample feed and hand every received chunk to on_tweet
conn = pycurl.Curl()
conn.setopt(pycurl.USERPWD, "%s:%s" % (USER, PASS))
conn.setopt(pycurl.URL, STREAM_URL)
conn.setopt(pycurl.WRITEFUNCTION, on_tweet)
conn.perform()
I'm sure this is a VERY simple fix, and I hope you can help. Thanks!
Upvotes: 3
Views: 7677
Reputation: 9291
The edited code above works. I was querying the DB incorrectly and expected to see more traffic in the mongo console than I did.
Thanks very much to everyone who helped; you got me on the right track and to the right answer!
Upvotes: 0
Reputation: 26258
PyMongo's insert method takes a dictionary, not a string. The error you're seeing occurs where PyMongo attempts to assign an ObjectId to the new record (since it doesn't yet have one) before sending it to the database.
I think the error is in your on_receive function. Unless pycurl is converting the JSON for you automatically, it's very likely just giving you a raw string result from Twitter's API. You should use the json module to decode the string, then handle the resulting type appropriately. That is, if it's an array, iterate over each item, determine whether it needs to be saved (i.e. whether you already have it in your database), and if not, issue insert only for those elements that are new.
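To illustrate the point, here is a minimal sketch that needs neither pycurl nor a running MongoDB. It reproduces the same TypeError by assigning a key on a raw string, then shows that the assignment succeeds once the string is decoded. The helper name to_documents and the sample payload are made up for this example; they are not part of PyMongo or Twitter's API.

```python
import json


def to_documents(raw):
    """Decode raw JSON text into a list of dicts that could be inserted.

    Hypothetical helper for illustration only.
    """
    decoded = json.loads(raw)
    if isinstance(decoded, dict):
        return [decoded]
    if isinstance(decoded, list):
        # Keep only JSON objects; bare numbers or strings are not documents
        return [item for item in decoded if isinstance(item, dict)]
    raise TypeError("expected a JSON object or array, got %s"
                    % type(decoded).__name__)


raw = '{"id": 1, "text": "hello"}'  # made-up sample payload

# Assigning a key on the undecoded string fails like the traceback above
try:
    raw["_id"] = 42
except TypeError as exc:
    error_message = str(exc)

# After decoding, the value is a dict, so a key such as "_id" can be added
docs = to_documents(raw)
docs[0]["_id"] = 42
```

Each dict returned by to_documents could then be passed to db.posts.insert, after whatever duplicate check suits your data.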
EDIT: You should also add the safe=True keyword argument to insert. If an error is caught on the server side, you will then get an exception from PyMongo, which will help diagnose the problem.
Upvotes: 2
Reputation: 10278
In on_receive you have to buffer the content. Each time a "\r\n" arrives, you have a complete tweet, which can then be stored in MongoDB:
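import json

def on_tweet(data):
    tweet = json.loads(data)
    db.posts.insert(tweet)

buffer = ""

def on_receive(data):
    # Rebinding the module-level buffer from inside a function needs "global"
    global buffer
    buffer += data
    if data.endswith("\r\n"):
        # Strip only the completed payload, so whitespace at a chunk
        # boundary inside a tweet is not lost
        payload = buffer.strip()
        buffer = ""
        if payload:
            on_tweet(payload)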
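As a variation on the same idea, here is a small self-contained sketch that splits on every "\r\n" in the accumulated text rather than only checking the end of the latest chunk, so it also copes with several tweets arriving in one chunk. The class name TweetBuffer and the sample chunks are invented for this example, and it assumes the chunks are already text (under Python 3, pycurl passes bytes, which would need decoding first).

```python
import json


class TweetBuffer(object):
    """Collects stream chunks and returns one dict per CRLF-terminated line.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        self._pending = ""

    def feed(self, chunk):
        """Add a chunk and return the tweets completed by it."""
        self._pending += chunk
        tweets = []
        while "\r\n" in self._pending:
            # Cut off one complete line; keep the remainder for later
            line, self._pending = self._pending.split("\r\n", 1)
            if line.strip():  # skip blank lines between tweets
                tweets.append(json.loads(line))
        return tweets


buf = TweetBuffer()
# Made-up chunks: the first tweet is split across two chunks
first = buf.feed('{"id": 1, "te')
second = buf.feed('xt": "a b"}\r\n\r\n{"id": 2}\r\n')
```

With pycurl, feed would be called from the WRITEFUNCTION callback, and each returned dict passed to db.posts.insert.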
EDIT: I thought you were using the old streaming API. The on_tweet function alone should be enough.
Upvotes: 2