Reputation: 3
I have been trying for a few hours to read this file. I researched solutions and applied them, but they did not work. The file itself opens fine in Excel, but I cannot read it with Pandas.
Every attempt raises the same error: ParserError: Expected 3 fields in line 5, saw 63
I have seen a few other questions on this topic, but none of the solutions to those specific questions has solved my issue.
Does anyone know why I am failing to read this file and how I can fix it? Thank you
IN:
data = pd.read_csv('API_EN.ATM.CO2E.PC_DS2_en_csv_v2_10181020.csv',
                   header=None,
                   engine='python',
                   error_bad_lines=True)
OUT:
ParserError Traceback (most recent call last)
<ipython-input-96-0d42116a039d> in <module>()
2 header=None,
3 engine='python',
----> 4 error_bad_lines=True)
~\Anaconda3\lib\site-packages\pandas\io\parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, skipfooter, doublequote, delim_whitespace, low_memory, memory_map, float_precision)
676 skip_blank_lines=skip_blank_lines)
677
--> 678 return _read(filepath_or_buffer, kwds)
679
680 parser_f.__name__ = name
~\Anaconda3\lib\site-packages\pandas\io\parsers.py in _read(filepath_or_buffer, kwds)
444
445 try:
--> 446 data = parser.read(nrows)
447 finally:
448 parser.close()
~\Anaconda3\lib\site-packages\pandas\io\parsers.py in read(self, nrows)
1034 raise ValueError('skipfooter not supported for iteration')
1035
-> 1036 ret = self._engine.read(nrows)
1037
1038 # May alter columns / col_dict
~\Anaconda3\lib\site-packages\pandas\io\parsers.py in read(self, rows)
2264 content = content[1:]
2265
-> 2266 alldata = self._rows_to_cols(content)
2267 data = self._exclude_implicit_index(alldata)
2268
~\Anaconda3\lib\site-packages\pandas\io\parsers.py in _rows_to_cols(self, content)
2907 msg += '. ' + reason
2908
-> 2909 self._alert_malformed(msg, row_num + 1)
2910
2911 # see gh-13320
~\Anaconda3\lib\site-packages\pandas\io\parsers.py in _alert_malformed(self, msg, row_num)
2674
2675 if self.error_bad_lines:
-> 2676 raise ParserError(msg)
2677 elif self.warn_bad_lines:
2678 base = 'Skipping line {row_num}: '.format(row_num=row_num)
ParserError: Expected 3 fields in line 5, saw 63
Here is a sample of the CSV file:
"Country_Name","Country_Code","Indicator_Name","Indicator_Code","1960","1961","1962","1963","1964","1965","1966","1967","1968","1969","1970","1971","1972","1973","1974","1975","1976","1977","1978","1979","1980","1981","1982","1983","1984","1985","1986","1987","1988","1989","1990","1991","1992","1993","1994","1995","1996","1997","1998","1999","2000","2001","2002","2003","2004","2005","2006","2007","2008","2009","2010","2011","2012","2013","2014","2015","2016","2017",
"Aruba","ABW","CO2 emissions (metric tons per capita)","EN.ATM.CO2E.PC","","","","","","","","","","","","","","","","","","","","","","","","","","","2.86831939212055","7.23519803341258","10.0261792105306","10.6347325992922","26.3745032100275","26.0461298009966","21.4425588041328","22.000786163522","21.0362451108214","20.7719361585578","20.3183533653846","20.4268177083943","20.5876691453648","20.311566765912","26.1948752380219","25.9340244138733","25.6711617820448","26.4204520857169","26.5172934158421","27.200707780588","26.9482604728658","27.8955739972338","26.2308466448946","25.9158329472761","24.6705288731078","24.5058352032767","13.1555416906324","8.35129425218293","8.408362637892","","","",
Upvotes: 0
Views: 1526
Reputation: 2696
Changing your code to
data=pd.read_csv('API_EN.ATM.CO2E.PC_DS2_en_csv_v2_10181020.csv', header=None, engine='python', error_bad_lines=False)
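will import your CSV without raising an error, but it won't import it correctly: with error_bad_lines=False, pandas silently drops every line whose field count differs from what it expects. There is probably something in your CSV, or in the separator used, that confuses the parser. Could you post the fifth line of the CSV you are trying to import? Does the last column, for example, contain text with commas? How many columns do you expect: 3, 63, or something else?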
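To answer those questions, it may help to count the fields on each of the first few lines before involving pandas at all. The following is a minimal sketch using only the standard library; the helper name field_counts and the sample text are made up for illustration and are not taken from your file.

```python
import csv
import io


def field_counts(lines, limit=10):
    """Return (record_number, field_count) for the first `limit` CSV records.

    `lines` is any iterable of text lines, e.g. an open file object.
    Note: csv.reader counts records, so a quoted field containing a
    newline spans several physical lines but counts as one record.
    """
    counts = []
    for i, row in enumerate(csv.reader(lines), start=1):
        if i > limit:
            break
        counts.append((i, len(row)))
    return counts


# Hypothetical sample (NOT the real file): two short lines, a blank line,
# then a wider row. A trailing comma adds one empty field.
sample = io.StringIO('"a","b",\n\n"c","d",\n"w","x","y","z",\n')
for record_number, n_fields in field_counts(sample):
    print(record_number, n_fields)

# With the real file, something like this should work (encoding is a guess):
# with open('API_EN.ATM.CO2E.PC_DS2_en_csv_v2_10181020.csv',
#           newline='', encoding='utf-8') as f:
#     print(field_counts(f))
```

If this shows a few short lines (3 fields or fewer) followed by 63-field lines, the file probably starts with a short preamble before the real header, and passing skiprows to pd.read_csv with the number of preamble lines would be the thing to try. That is only a guess until the first lines are known. Also note that in pandas 1.3 and later, error_bad_lines is deprecated in favour of on_bad_lines.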
Upvotes: 1