Reputation: 1
I am streaming tweets in real time using PySpark.
I want to retrieve the text, location, and username of each tweet. Currently I am receiving only the tweet text. Is there any way to get the location as well?
I'm using this line of code to get the tweets:
lines = ssc.socketTextStream("localhost", 5550)
Upvotes: 0
Views: 104
Reputation: 1
I just found the answer: the Twitter listener needs to be updated so that it also reads the location from msg['user']['location'].
# Requires `import json` at module level.
def on_data(self, data):
    try:
        msg = json.loads(data)
        if 'retweeted_status' in msg:
            # Retweet: use the full text of the original tweet.
            # Retweets without an 'extended_tweet' are skipped.
            if 'extended_tweet' in msg['retweeted_status']:
                print(msg['retweeted_status']['extended_tweet']['full_text'])
                print(" | The Location is " + str(msg['user']['location']))
                self.client_socket.send((str(msg['retweeted_status']['extended_tweet']['full_text']) + "\n").encode('utf-8'))
        elif 'extended_tweet' in msg:
            # Long tweet: the untruncated text is stored under 'extended_tweet'.
            print(msg['extended_tweet']['full_text'])
            print(" | The Location is " + str(msg['user']['location']))
            self.client_socket.send((str(msg['extended_tweet']['full_text']) + "\n").encode('utf-8'))
        else:
            # Short tweet: 'text' already holds the whole message.
            print(msg['text'])
            print(" | The Location is " + str(msg['user']['location']))
            self.client_socket.send((str(msg['text']) + "\n").encode('utf-8'))
    except BaseException as e:
        print("Error on_data: %s" % str(e))
    return True
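
Note that the listener prints the location but only sends the tweet text over the socket, so the Spark side still never sees the location or username. One possible way around that, as a rough sketch, is to send one JSON object per line and decode it again in Spark. The helper names `build_payload` and `parse_line` are made up for this example (they are not part of tweepy or PySpark), and taking the username from `msg['user']['screen_name']` is an assumption about which field is wanted.

```python
import json


def build_payload(msg):
    """Pick text, location and username out of a decoded tweet dict."""
    # Assumption: for a retweet we want the text of the original tweet.
    source = msg.get('retweeted_status', msg)
    # Long tweets keep their untruncated text under 'extended_tweet'.
    text = source.get('extended_tweet', {}).get('full_text') or source.get('text', '')
    user = msg.get('user') or {}
    record = {
        'text': text,
        'location': user.get('location'),    # None if the user set no location
        'username': user.get('screen_name'),
    }
    # json.dumps escapes newlines inside the text, so each record stays on one line.
    return json.dumps(record) + "\n"


def parse_line(line):
    """Turn one line read from the socket back into (text, location, username)."""
    record = json.loads(line)
    return record['text'], record['location'], record['username']
```

In the listener, the three send calls could then be replaced by `self.client_socket.send(build_payload(msg).encode('utf-8'))`, and on the Spark side `lines.map(parse_line)` would give a stream of (text, location, username) tuples.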
Upvotes: 0