Ferit

Reputation: 648

Skipping the first lines of a CSV in Python does not work. Why?

My CSV data looks as follows:

$('table').each(function(index) {
    $(this).find('tr').each(function() {
      $(this).find('td:first-child').each(function(i) { 
        var a;
        var b;
        var c;
        $(this).find('a:first').each(function(i) { 
          a = $(this).text();
        });
        $(this).find('p:first').each(function(i) { 
          b = ($(this).text());
        });
        $(this).find('time:first').each(function(i) { 
          c = $(this).text();
        });
        console.log(a + ";"  + b + ";" + c);
      });
    });
  });
XA452:01 Description in Column 1;ID Column1;13.03.2018
AY102:22 Description in Column 2;ID Column2;13.03.2018
BC001:31 Description in Column 3;ID Column3;13.03.2018
DE223:34 Description in Column 4;ID Column4;13.03.2018
FG315:56 Description in Column 5;ID Column5;13.03.2018
HA212:34 Description in Column 6;ID Column6;13.03.2018
EE111:12 Description in Column 7;ID Column7;13.03.2018

I want to start parsing the data where the row begins with XA452:01.

I tried this:

import pandas as pd

testimport_data = pd.read_csv("C:/Users/fff/Desktop/test_data.txt", sep=";", skiprows = 19)

print(testimport_data)

It should work, shouldn't it? However, I get the following error message:

Traceback (most recent call last):
  File "C:/Users/fff/PycharmProjects/Test/Test.py", line 3, in <module>
    testimport_data = pd.read_csv("C:/Users/fff/Desktop/test_data", sep=";", skiprows = 19)
  File "C:\Users\fff\PycharmProjects\Test\venv\lib\site-packages\pandas\io\parsers.py", line 709, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "C:\Users\fff\PycharmProjects\Test\venv\lib\site-packages\pandas\io\parsers.py", line 449, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "C:\Users\fff\PycharmProjects\Test\venv\lib\site-packages\pandas\io\parsers.py", line 818, in __init__
    self._make_engine(self.engine)
  File "C:\Users\fff\PycharmProjects\Test\venv\lib\site-packages\pandas\io\parsers.py", line 1049, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "C:\Users\fff\PycharmProjects\Test\venv\lib\site-packages\pandas\io\parsers.py", line 1695, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas\_libs\parsers.pyx", line 565, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file

What am I doing wrong?

Upvotes: 0

Views: 98

Answers (2)

akozi

Reputation: 455

Edit: What I say below is not completely true, as Rahul pointed out. The problem arises because the skipped text includes a semicolon. Switching the separator from a semicolon to a comma therefore fixes the issue, since there are no commas in the skipped text.

Old text: This is an issue with ; as a separator. I'm not sure why, but if you switch the ';' in the text file to ',' the program runs fine.

In the past I've gotten this to work by changing the engine pandas uses to read the file, in case you need to keep the semicolons.

The file I used is called values.txt

$('table').each(function(index) {
    $(this).find('tr').each(function() {
      $(this).find('td:first-child').each(function(i) { 
        var a;
        var b;
        var c;
        $(this).find('a:first').each(function(i) { 
          a = $(this).text();
        });
        $(this).find('p:first').each(function(i) { 
          b = ($(this).text());
        });
        $(this).find('time:first').each(function(i) { 
          c = $(this).text();
        });
        console.log(a + ";"  + b + ";" + c);
      });
    });
  });
XA452:01 Description in Column 1,ID Column1,13.03.2018
AY102:22 Description in Column 2,ID Column2,13.03.2018
BC001:31 Description in Column 3,ID Column3,13.03.2018
DE223:34 Description in Column 4,ID Column4,13.03.2018
FG315:56 Description in Column 5,ID Column5,13.03.2018
HA212:34 Description in Column 6,ID Column6,13.03.2018
EE111:12 Description in Column 7,ID Column7,13.03.2018

Then ran it with:

>>> data = pd.read_csv('values.txt', sep=',', skiprows=19, names=['1', '2', '3'])
>>> data
                                  1           2           3
0  XA452:01 Description in Column 1  ID Column1  13.03.2018
1  AY102:22 Description in Column 2  ID Column2  13.03.2018
2  BC001:31 Description in Column 3  ID Column3  13.03.2018
3  DE223:34 Description in Column 4  ID Column4  13.03.2018
4  FG315:56 Description in Column 5  ID Column5  13.03.2018
5  HA212:34 Description in Column 6  ID Column6  13.03.2018
6  EE111:12 Description in Column 7  ID Column7  13.03.2018
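
On the semicolon explanation in the edit note: a likely mechanism is the `";"` in the `console.log` line. With `;` as the separator, a field in that line starts with a double quote that is never closed, so a CSV parser can read everything after it as one quoted field. The sketch below shows this with Python's standard `csv` module on a shortened, hypothetical three-line excerpt of the file. That pandas applies the same quoting rules to skipped rows is my assumption, not something I have checked against the asker's version.

```python
import csv

# Hypothetical excerpt: one script line, one closing line, one data row.
sample = [
    '        console.log(a + ";"  + b + ";" + c);\n',
    '      });\n',
    'XA452:01 Description in Column 1;ID Column1;13.03.2018\n',
]

# Default quoting: the third field of the script line opens with a double
# quote that is never closed, so the reader keeps consuming later lines.
default_rows = list(csv.reader(sample, delimiter=";"))

# QUOTE_NONE: double quotes are ordinary characters, so each line is one row.
plain_rows = list(csv.reader(sample, delimiter=";", quoting=csv.QUOTE_NONE))

print(len(default_rows))  # 1
print(len(plain_rows))    # 3
print(plain_rows[-1])
```

`pd.read_csv` also accepts a `quoting` parameter, so something like `pd.read_csv(path, sep=";", skiprows=19, header=None, quoting=csv.QUOTE_NONE)` may be worth trying if the semicolons have to stay. Here `path` is a placeholder, and `header=None` keeps the first data row from being used as column names.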

Upvotes: 1

Rahul

Reputation: 11550

You can skip reading those lines.

import pandas as pd

with open('pd_csv.csv') as f:
    # Drop the first 19 lines, then strip each newline before splitting on ';'.
    data = [line.rstrip("\n").split(";") for line in f.readlines()[19:]]

testimport_data = pd.DataFrame(data)
print(testimport_data)
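
If the number of leading lines can change, a variant of the same idea is to start at the first line that begins with a known marker, such as the `XA452:01` mentioned in the question, instead of hard-coding 19. This is only a sketch: `rows_from_marker` is a hypothetical helper name, and the sample lines are a shortened stand-in for the real file.

```python
def rows_from_marker(lines, marker, sep=";"):
    """Return split rows, starting at the first line that begins with `marker`."""
    started = False
    rows = []
    for line in lines:
        if not started and line.startswith(marker):
            started = True
        # Once started, keep every non-blank line, split on the separator.
        if started and line.strip():
            rows.append(line.rstrip("\r\n").split(sep))
    return rows


# Shortened stand-in for the file: two script lines, then two data rows.
sample = [
    "$('table').each(function(index) {\n",
    "  });\n",
    "XA452:01 Description in Column 1;ID Column1;13.03.2018\n",
    "AY102:22 Description in Column 2;ID Column2;13.03.2018\n",
]

rows = rows_from_marker(sample, "XA452:01")
print(rows)
```

With a real file you would pass the open file object instead of `sample`, and the resulting list can go straight into `pd.DataFrame(rows)` as in the snippet above.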

Upvotes: 2
