Red_Wheelbarrow

Reputation: 15

Import .csv in Python using Pandas

I'm trying to import a .csv file in Python using pandas, but I get an error instead of the data.

I'm new to both Python and pandas. I started with a good tutorial on YouTube whose test data was also a .csv file, and my code works with that file.

The file I want to use is also a .csv file, but its columns are already separated. The test data file did not have separate columns; its data was separated with ",".

Does anyone have a suggestion for solving this?

import pandas as pd
df = pd.read_csv("feedPreview.csv")
print(df)

Output:

ParserError                               Traceback (most recent call last)
<ipython-input-2-09462199d5bd> in <module>
      1 import pandas as pd
      2 
----> 3 df = pd.read_csv("feedPreview.csv")
      4 
      5 print(df)

~\Anaconda3\lib\site-packages\pandas\io\parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, dialect, error_bad_lines, warn_bad_lines, delim_whitespace, low_memory, memory_map, float_precision)
    683         )
    684 
--> 685         return _read(filepath_or_buffer, kwds)
    686 
    687     parser_f.__name__ = name

~\Anaconda3\lib\site-packages\pandas\io\parsers.py in _read(filepath_or_buffer, kwds)
    461 
    462     try:
--> 463         data = parser.read(nrows)
    464     finally:
    465         parser.close()

~\Anaconda3\lib\site-packages\pandas\io\parsers.py in read(self, nrows)
   1152     def read(self, nrows=None):
   1153         nrows = _validate_integer("nrows", nrows)
-> 1154         ret = self._engine.read(nrows)
   1155 
   1156         # May alter columns / col_dict

~\Anaconda3\lib\site-packages\pandas\io\parsers.py in read(self, nrows)
   2057     def read(self, nrows=None):
   2058         try:
-> 2059             data = self._reader.read(nrows)
   2060         except StopIteration:
   2061             if self._first_chunk:

pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader.read()

pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader._read_low_memory()

pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader._read_rows()

pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader._tokenize_rows()

pandas\_libs\parsers.pyx in pandas._libs.parsers.raise_parser_error()

ParserError: Error tokenizing data. C error: Expected 4 fields in line 33, saw 5

Upvotes: 0

Views: 237

Answers (2)

Leda Grasiele

Reputation: 433

It's hard to say for sure without seeing the data, but it seems that line 33 of your file has 5 fields instead of 4. If you can do without this line (and any other lines with the same problem), you can try this:

 df = pd.read_csv('feedPreview.csv', error_bad_lines=False)

As the pandas documentation says:

"Lines with too many fields (e.g. a csv line with too many commas) will by default cause an exception to be raised, and no DataFrame will be returned. If False, then these “bad lines” will dropped from the DataFrame that is returned."

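Note that error_bad_lines matches the pandas version shown in the traceback. In pandas 1.3 and later it is deprecated in favour of on_bad_lines, and it was removed in pandas 2.0. A minimal sketch of the newer spelling, assuming a recent pandas and using made-up inline data in place of feedPreview.csv:

```python
import io

import pandas as pd

# Hypothetical stand-in for feedPreview.csv: the third line has one field too many.
sample = io.StringIO("a,b,c,d\n1,2,3,4\n5,6,7,8,9\n10,11,12,13\n")

# on_bad_lines="skip" drops rows with too many fields (pandas 1.3 and later).
df = pd.read_csv(sample, on_bad_lines="skip")
print(df)
```

With a real file, pass the file name instead of the StringIO object. Skipped rows are lost silently, so this only makes sense if those rows are not needed.
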
Upvotes: 1

dzang

Reputation: 2260

You can find the lines that have more than 4 fields like this:

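csv_file = "feedPreview.csv"
with open(csv_file, "r") as f:
    # Count from 1 so the output matches line numbers as shown in an editor
    for i, line in enumerate(f, start=1):
        if len(line.split(",")) > 4:
            print(i)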

Then open the file in an editor and correct those lines.
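
Splitting on "," also counts commas inside quoted values. If the file might contain quoted fields, a variant using Python's csv module could look like this. It is only a sketch: find_bad_lines and the sample data are made up for illustration, and the expected field count of 4 comes from the error message.

```python
import csv
import io


def find_bad_lines(handle, expected_fields=4):
    """Return (line_number, field_count) for rows with an unexpected field count."""
    bad = []
    reader = csv.reader(handle)
    for row in reader:
        # Empty rows (blank lines) are ignored
        if row and len(row) != expected_fields:
            # reader.line_num is the physical line on which the row ended
            bad.append((reader.line_num, len(row)))
    return bad


# Hypothetical sample: the quoted comma on line 2 is not treated as a separator.
sample = io.StringIO('a,b,c,d\n"x, y",2,3,4\n5,6,7,8,9\n')
print(find_bad_lines(sample))  # [(3, 5)]
```

For a real file, open it with open("feedPreview.csv", newline="") and pass the file object to find_bad_lines.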

Upvotes: 2
