Reputation: 15
I'm trying to import a .csv file in Python using pandas, but I get an error.
I'm a complete beginner with both Python and pandas. I started with a good tutorial on YouTube whose test data was also a .csv file, and my code works with that file.
The file I want to use is also a .csv file, but its columns are already separated, whereas the test file had no separate columns and its values were separated by ",".
Does anyone have a suggestion for solving my problem?
import pandas as pd
df = pd.read_csv("feedPreview.csv")
print(df)
Output:
ParserError Traceback (most recent call last)
<ipython-input-2-09462199d5bd> in <module>
1 import pandas as pd
2
----> 3 df = pd.read_csv("feedPreview.csv")
4
5 print(df)
~\Anaconda3\lib\site-packages\pandas\io\parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, dialect, error_bad_lines, warn_bad_lines, delim_whitespace, low_memory, memory_map, float_precision)
683 )
684
--> 685 return _read(filepath_or_buffer, kwds)
686
687 parser_f.__name__ = name
~\Anaconda3\lib\site-packages\pandas\io\parsers.py in _read(filepath_or_buffer, kwds)
461
462 try:
--> 463 data = parser.read(nrows)
464 finally:
465 parser.close()
~\Anaconda3\lib\site-packages\pandas\io\parsers.py in read(self, nrows)
1152 def read(self, nrows=None):
1153 nrows = _validate_integer("nrows", nrows)
-> 1154 ret = self._engine.read(nrows)
1155
1156 # May alter columns / col_dict
~\Anaconda3\lib\site-packages\pandas\io\parsers.py in read(self, nrows)
2057 def read(self, nrows=None):
2058 try:
-> 2059 data = self._reader.read(nrows)
2060 except StopIteration:
2061 if self._first_chunk:
pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader.read()
pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader._read_low_memory()
pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader._read_rows()
pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader._tokenize_rows()
pandas\_libs\parsers.pyx in pandas._libs.parsers.raise_parser_error()
ParserError: Error tokenizing data. C error: Expected 4 fields in line 33, saw 5
Upvotes: 0
Views: 237
Reputation: 433
It's hard to say for sure without seeing the data, but it seems that line 33 of your file has 5 fields instead of 4. If you can do without this line (and any other lines with the same problem), you can try this:
df = pd.read_csv('feedPreview.csv', error_bad_lines=False)
As the pandas documentation says:
"Lines with too many fields (e.g. a csv line with too many commas) will by default cause an exception to be raised, and no DataFrame will be returned. If False, then these “bad lines” will dropped from the DataFrame that is returned."
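Note that `error_bad_lines` was deprecated in pandas 1.3 and removed in pandas 2.0. On those versions the equivalent is `on_bad_lines="skip"`. Here is a minimal sketch, assuming pandas 1.3 or later. The sample data is made up for illustration and only stands in for feedPreview.csv:

```python
import io

import pandas as pd

# Hypothetical sample data standing in for feedPreview.csv:
# the second data row has 5 fields instead of 4.
sample = io.StringIO(
    "a,b,c,d\n"
    "1,2,3,4\n"
    "5,6,7,8,9\n"
    "10,11,12,13\n"
)

# on_bad_lines="skip" (pandas >= 1.3) drops rows with too many fields
# instead of raising ParserError.
df = pd.read_csv(sample, on_bad_lines="skip")
print(df)
```

With a real file you would pass the path instead of the `StringIO` object. Keep in mind that skipped rows are silently lost, so this only makes sense if you can do without them.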
Upvotes: 1
Reputation: 2260
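You can find the lines that have more than 4 fields like this:
csv_file = 'feedPreview.csv'
with open(csv_file, 'r') as f:
    # Count from 1 so the printed number matches the line number in the error message
    for i, line in enumerate(f, start=1):
        if len(line.split(',')) > 4:
            print(i)
Then open the file in an editor and correct those lines.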
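One caveat: splitting on "," miscounts rows where a quoted value itself contains a comma. A possible variation, sketched here with the standard-library `csv` module, counts fields the way a CSV parser would. The helper name `find_bad_lines` and the sample rows are made up for illustration:

```python
import csv
import io


def find_bad_lines(lines, expected_fields, delimiter=","):
    """Return (line_number, field_count) for each row with an unexpected field count."""
    bad = []
    reader = csv.reader(lines, delimiter=delimiter)
    for row in reader:
        # Skip blank lines; report rows whose field count differs from the expected one.
        if row and len(row) != expected_fields:
            bad.append((reader.line_num, len(row)))
    return bad


# Hypothetical sample: line 2 has a quoted comma (still 4 fields), line 3 has 5 fields.
sample = io.StringIO(
    'id,name,price,stock\n'
    '1,"Widget, large",9.99,3\n'
    '2,Gadget,4.50,7,extra\n'
)
print(find_bad_lines(sample, expected_fields=4))  # prints [(3, 5)]
```

For a real file, pass `open('feedPreview.csv', newline='')` instead of the `StringIO` object. If the file turns out to use another separator such as ";", set `delimiter` accordingly.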
Upvotes: 2