Reputation: 184
I am trying to read a CSV file (~190 MB in size) into a pandas DataFrame, but the process crashes with the error below. I am running the code in the PyCharm IDE from JetBrains.
Process finished with exit code -1073741819 (0xC0000005)
The code I am trying to run is below:
from pandas import DataFrame as df
if __name__ == '__main__':
    frame = df()
    frame.from_csv('c:/Nitin/692/Python/CSV/21LIVvTOT_user_geo_Reply.csv', header=True)
    ab = list(frame.columns.values)
    print(ab)
Here is an instance from the CSV:
createdat text coordinates entities id_str in_reply_to_user_id_str
Tue Feb 10 18:56:42 +0000 2015
"RT @RubieDubes: official list of deluded XXXXX:
Spurs Fans
Kanye West
Louis van Gaal"
{'trends': [], 'urls': [], 'user_mentions': [{'id': 65174814, 'name': 'Ruby ?', 'screen_name': 'RubieDubes', 'indices': [3, 14], 'id_str': '65174814'}], 'symbols': [], 'hashtags': []}
5.65223E+17
EDIT: I tried running it from the Python console and it resulted in this error: An unhandled win32 exception occurred in python.exe [11640].
Upvotes: 3
Views: 15000
Reputation: 184
I figured out what the issue was. There were values in the CSV that were not being read properly by the parser. I changed the code from
frame.from_csv('c:/Nitin/692/Python/CSV/21LIVvTOT_user_geo_Reply.csv', header=True)
to
data = pandas.read_csv('c:/Nitin/692/Python/CSV/21LIVvTOT_user_geo_Reply.csv', encoding='latin-1', engine='python')
I guess the UTF-8 encoding was causing the problem. The code ran once I changed the encoding to 'latin-1'. Thank you for your help.
EDIT: I figured out that this was caused by the emojis present in the data.
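For anyone hitting the same thing, here is a rough sketch of why switching to 'latin-1' probably made the read succeed. It uses only the standard library, and the sample strings are made up for illustration (they are not from my file). Latin-1 assigns a character to every possible byte value, so decoding with it never raises an error. If the file was actually written as UTF-8, though, emojis and other multi-byte characters will likely show up as garbled text in the resulting DataFrame.

```python
# Hypothetical sample: an emoji encoded as UTF-8 (4 bytes), as it might appear in a tweet.
raw = "goal \U0001F600".encode("utf-8")

# Latin-1 maps each byte to exactly one character, so this cannot fail,
# but the 4 emoji bytes become 4 unrelated characters (mojibake).
as_latin1 = raw.decode("latin-1")

# Decoding with the encoding the bytes were written in restores the emoji.
as_utf8 = raw.decode("utf-8")

# The reverse case: a Latin-1 byte that is not valid UTF-8.
bad = b"caf\xe9"
try:
    bad.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(len(raw), len(as_latin1))   # same length: one character per byte
print(utf8_ok)                    # False: strict UTF-8 decoding rejects the byte
print(bad.decode("latin-1"))      # decodes without error under latin-1
```

So 'latin-1' is a workaround that gets the file loaded rather than a guarantee that the text is decoded correctly. If the text column matters, it is worth checking a few rows that contained emojis after loading.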
Upvotes: 3