Reputation: 1
I am trying to count the tweets containing a single word for each day of a single year, and then store the results in a CSV file with the columns "Date" and "Frequency". This is my code, but I keep getting an error after it has been running for some time.
import pandas as pd
import twint
import nest_asyncio
from datetime import datetime, timedelta

bugun = '2020-01-01'
yarin = '2020-01-02'
df = pd.DataFrame(columns=("Data", "Frequency"))
for i in range(365):
    file = open("Test.csv", "w")
    file.close()
    bugun = (datetime.strptime(bugun, '%Y-%m-%d') + timedelta(days=1)).strftime('%Y-%m-%d')
    yarin = (datetime.strptime(yarin, '%Y-%m-%d') + timedelta(days=1)).strftime('%Y-%m-%d')
    nest_asyncio.apply()
    c = twint.Config()
    c.Search = "Chainlink"
    #c.Hide_output=True
    c.Since = bugun
    c.Until = yarin
    c.Store_csv = True
    c.Output = "Test.csv"
    c.Count = True
    twint.run.Search(c)
    data = pd.read_csv("Test.csv")
    frequency = str(len(data))
    #d = {"Data": [bugun], "Frequency": [frequency]}
    #d_f = pd.DataFrame(data=d)
    #df = df.append(d_f, ignore_index=True)
    df.loc[i] = [bugun] + [frequency]
df.to_csv(r'C:\Users\serap\Desktop\CRYPTO 100\Chainlink.csv', index=False, header=False)
And this is the error I get:
File "C:\Users\serap\Desktop\CRYPTO 100\CODES\Binance_Coin\Binance Coin.py", line 47, in <module>
data = pd.read_csv("Test.csv")
File "C:\Users\serap\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\io\parsers.py", line 605, in read_csv
return _read(filepath_or_buffer, kwds)
File "C:\Users\serap\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\io\parsers.py", line 457, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "C:\Users\serap\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\io\parsers.py", line 814, in __init__
self._engine = self._make_engine(self.engine)
File "C:\Users\serap\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\io\parsers.py", line 1045, in _make_engine
return mapping[engine](self.f, **self.options) # type: ignore[call-arg]
File "C:\Users\serap\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\io\parsers.py", line 1893, in __init__
self._reader = parsers.TextReader(self.handles.handle, **kwds)
File "pandas\_libs\parsers.pyx", line 521, in pandas._libs.parsers.TextReader.__cinit__
EmptyDataError: No columns to parse from file
Thank you for the help :)
Upvotes: 0
Views: 664
Reputation: 10238
After reading the tutorial "How to Scrape Tweets from Twitter with Python Twint" by Andika Pratama (Analytics Vidhya, Medium), I think you had better let Twint iterate over the whole date range itself:
c = twint.Config()
c.Search = "Chainlink"
c.Since = "2020-01-01"
c.Until = "2021-01-01"
c.Store_csv = True
c.Output = "Test.csv"
c.Count = True
twint.run.Search(c)
Now you may loop over the CSV output:
data = pd.read_csv("Test.csv")
# ...
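As a rough sketch of that loop, the per-day counts can be derived from the loaded data with pandas instead of one search per day. The function name `daily_counts` is made up for illustration, and the column name `date` is an assumption about Twint's CSV output, so check the header of your own file first. Days without any tweets are filled in with 0.

```python
import pandas as pd


def daily_counts(tweets, since, until, date_col="date"):
    """Count rows per calendar day in [since, until); days without rows get 0.

    `date_col` is an assumption about the CSV header, adjust it if needed.
    """
    # All days from `since` up to, but not including, `until`.
    last_day = pd.Timestamp(until) - pd.Timedelta(days=1)
    days = pd.date_range(since, last_day, freq="D")
    # Strip any time-of-day part so rows are grouped by calendar day.
    dates = pd.to_datetime(tweets[date_col]).dt.normalize()
    counts = dates.value_counts().reindex(days, fill_value=0)
    return pd.DataFrame(
        {"Date": days.strftime("%Y-%m-%d"), "Frequency": counts.to_numpy()}
    )


# Small in-memory stand-in for the data read from Test.csv.
sample = pd.DataFrame({"date": ["2020-01-01", "2020-01-01", "2020-01-03"]})
result = daily_counts(sample, "2020-01-01", "2020-01-04")
print(result)
```

With the real file, something like `daily_counts(pd.read_csv("Test.csv"), "2020-01-01", "2021-01-01").to_csv("Chainlink.csv", index=False)` should produce the "Date"/"Frequency" table from the question, provided the column assumption holds.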
I have not found this detail about CSV output documented so far, but the twint source code (master/twint/storage/write.py, line 58 ff.) shows that CSV output is appended if the file already exists. So you may have to truncate or delete an existing file before running the search. A valid option for this could be
open('Test.csv', 'w').close()
... which is basically what you already do, but without introducing another variable.
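One caveat with truncating: if nothing gets written afterwards, the file stays completely empty, and `pd.read_csv` raises `EmptyDataError: No columns to parse from file` on an empty file, which matches the traceback in the question. Whether a day without matching tweets is what triggered it there is an assumption on my part. A minimal sketch of a guard (the helper name `count_rows` is hypothetical):

```python
import os
import tempfile

import pandas as pd


def count_rows(path):
    """Return the number of data rows in a CSV file, or 0 if it is empty."""
    try:
        return len(pd.read_csv(path))
    except pd.errors.EmptyDataError:
        # Raised when the file contains no header and no data at all.
        return 0


with tempfile.TemporaryDirectory() as tmp:
    # A freshly truncated file, as produced by open(path, 'w').close().
    empty_path = os.path.join(tmp, "empty.csv")
    open(empty_path, "w").close()
    empty_count = count_rows(empty_path)

    # A file with a header line and two data rows.
    filled_path = os.path.join(tmp, "filled.csv")
    with open(filled_path, "w") as fh:
        fh.write("id,tweet\n1,a\n2,b\n")
    filled_count = count_rows(filled_path)

print(empty_count, filled_count)
```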
Upvotes: 1