Reputation: 13
I'm creating a project where I'll receive a list of tweets (Twitter) and then check whether their words appear in a dictionary that maps certain words to values. I've gotten my code to pick out the words, but I don't know how to eliminate symbols like , . " :
Here's the code:
def getTweet(tweet, dictionary):
    score = 0
    seperate = tweet.split(' ')
    print seperate
    print "------"
    if(len(tweet) > 0):
        for item in seperate:
            if item in dictionary:
                print item
                score = score + int(dictionary[item])
        print "here's the score: " + str(score)
        return score
    else:
        print "you haven't tweeted a tweet"
        return 0
Here's the parameter/tweet that will be checked:
getTweet("you are the best loyal friendly happy cool nice", scoresDict)
Any ideas?
Upvotes: 0
Views: 119
Reputation: 728
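If you want to get rid of all the non-alphanumeric characters, you can try:
import re
string = re.sub(r'[^\w]', ' ', string)
The negated character class [^\w] matches every character that is not a letter, digit, or underscore, so each of them is replaced with a space.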
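As a rough sketch of how the substitution fits with splitting: the helper name `clean_words` and the sample tweet below are made up for illustration and are not from the question. Calling split() with no argument afterwards collapses the runs of spaces that the substitution leaves behind.

```python
import re

def clean_words(text):
    # Replace every character that is not a letter, digit, or underscore with a space.
    cleaned = re.sub(r'[^\w]', ' ', text)
    # split() with no argument splits on any run of whitespace,
    # so consecutive spaces do not produce empty strings.
    return cleaned.split()

# Hypothetical sample tweet containing the symbols from the question.
words = clean_words('you are "the best", loyal. friend: happy')
```

One caveat: apostrophes are also replaced, so a word like haven't would come out as two separate pieces.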
Upvotes: 1
Reputation: 69192
Before doing the split, replace the characters with spaces, and then split on the spaces.
import re
line = ' a.,b"c'
line = re.sub('[,."]', ' ', line)
print line # ' a  b c' (two spaces between a and b, one for each replaced character)
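Combined with the scoring loop from the question, this might look like the sketch below. The name `score_tweet` and the sample `scores` dictionary are hypothetical stand-ins for the asker's getTweet and scoresDict, whose real contents are not shown.

```python
import re

def score_tweet(tweet, dictionary):
    # Replace the punctuation characters with spaces before splitting.
    cleaned = re.sub('[,."]', ' ', tweet)
    score = 0
    # split() with no argument skips the empty strings that
    # consecutive spaces would otherwise produce with split(' ').
    for word in cleaned.split():
        if word in dictionary:
            score += int(dictionary[word])
    return score

# Hypothetical scores; the real scoresDict is not shown in the question.
scores = {'best': '3', 'happy': '2', 'cool': '1'}
total = score_tweet('you are the "best", happy. cool', scores)
```

To also drop the colon mentioned in the question, the character class could be extended to '[,.":]'.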
Upvotes: 0