Reputation: 749
I am collecting Twitter data into a txt file via streaming, and I use this file for filtering and various queries in an IPython notebook. Sometimes, when the data file is large, the command gets stuck somewhere around 'text', one of the fields in the Twitter data. I need a way to handle the data so that I don't get stuck. Below is the command I run:
tweets_ISIS = tweets['text'].apply(lambda tweet: word_in_text('ISIS', tweets))
Here is the output:
AttributeError Traceback (most recent call last)
<ipython-input-34-444b712d99dc> in <module>()
----> 1 tweets_ISIS = tweets['text'].apply(lambda tweet: word_in_text('ISIS', tweets))
/usr/lib64/python2.7/site-packages/pandas/core/series.pyc in apply(self, func, convert_dtype, args, **kwds)
2167 values = lib.map_infer(values, lib.Timestamp)
2168
-> 2169 mapped = lib.map_infer(values, f, convert=convert_dtype)
2170 if len(mapped) and isinstance(mapped[0], Series):
2171 from pandas.core.frame import DataFrame
pandas/src/inference.pyx in pandas.lib.map_infer (pandas/lib.c:62578)()
<ipython-input-34-444b712d99dc> in <lambda>(tweet)
----> 1 tweets_ISIS = tweets['text'].apply(lambda tweet: word_in_text('ISIS', tweets))
<ipython-input-33-0ee00dabf341> in word_in_text(word, text)
1 def word_in_text(word, text):
2 word = word.lower()
----> 3 text = text.lower()
4 match = re.search(word, text)
5 if match:
/usr/lib64/python2.7/site-packages/pandas/core/generic.pyc in __getattr__(self, name)
2358 return self[name]
2359 raise AttributeError("'%s' object has no attribute '%s'" %
-> 2360 (type(self).__name__, name))
2361
2362 def __setattr__(self, name, value):
AttributeError: 'DataFrame' object has no attribute 'lower'
I defined word_in_text as follows:
import re

def word_in_text(word, text):
    word = word.lower()
    text = text.lower()
    match = re.search(word, text)
    if match:
        return True
    return False
Upvotes: 0
Views: 114
Reputation: 1893
You are using tweets when I think your intention was to use tweet. You are passing the whole DataFrame to word_in_text() rather than passing the input of the lambda function to word_in_text(). Try:
tweets_ISIS = tweets['text'].apply(lambda tweet: word_in_text('ISIS', tweet))
Also, is apply() the right function to use here? Based on the limited context, it seems that map() might be the proper choice to run word_in_text() on each value in the text Series, but I can't tell for sure without something more complete and reproducible.
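For reference, here is a minimal sketch of the corrected call on a small made-up DataFrame. The sample rows are hypothetical stand-ins for the streamed tweets, and the isinstance guard is my own addition, on the assumption that some rows in a large file may have no text at all (and would otherwise raise a similar AttributeError on a float or None). It is written for Python 3; the last line shows pandas' built-in Series.str.contains as one possible vectorized alternative, not necessarily what you need.

```python
import re

import pandas as pd


def word_in_text(word, text):
    # Assumption: rows with missing text (None/NaN) should count as "no match".
    if not isinstance(text, str):
        return False
    # Same case-insensitive search as in the question.
    return re.search(word.lower(), text.lower()) is not None


# Hypothetical sample data; the real frame comes from the streamed file.
tweets = pd.DataFrame({"text": ["ISIS in the news", "nothing here", None]})

# Pass the lambda argument (one tweet's text), not the whole DataFrame.
tweets_ISIS = tweets["text"].apply(lambda tweet: word_in_text("ISIS", tweet))

# Possible vectorized alternative: case-insensitive match, missing text -> False.
tweets_ISIS_vec = tweets["text"].str.contains("ISIS", case=False, na=False)
```

Both results are boolean Series aligned with tweets, so either can be used directly as a filter, e.g. tweets[tweets_ISIS].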
Upvotes: 1