Himanshu

Reputation: 23

Python: Not able to read first row of csv file correctly using read_csv

I am trying to read a CSV file with pandas, but the first row is not being read correctly.

Code:

pd.read_csv(data_file_path, sep=",", index_col=0, header=0, dtype = object)

For example, my data (in the CSV file) is:

12 1.43E+19 This is first line  101010  
23 1.43E+19 This is the second line 202020  
34 1.43E+19 This is the third line  303030  

I am trying to read it with the first column as the index.

Output:

     1.43E+19 This is first line    101010  
12  
23 1.43E+19 This is the second line 202020  
34 1.43E+19 This is the third line 303030  

Output without using the first column as the index:

  12 1.43E+19 This is first line 101010  
0 23 1.43E+19 This is the second line 202020  
1 34 1.43E+19 This is the third line 303030  

Because of this, any further processing of the data ignores the first row.

Upvotes: 2

Views: 2808

Answers (1)

DSM

Reputation: 353604

I think you're confusing header=0, which means "use the 0-th row as the header", with header=None, which means "don't read a header from the file".

Compare:

>>> pd.read_csv("h.csv", header=0, index_col=0)
        1.43E+19       This is first line  101010  
12                                                 
23  1.430000e+19  This is the second line    202020
34  1.430000e+19   This is the third line    303030
>>> pd.read_csv("h.csv", header=None, index_col=0)
               1                        2       3
0                                                
12  1.430000e+19       This is first line  101010
23  1.430000e+19  This is the second line  202020
34  1.430000e+19   This is the third line  303030
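
For anyone who wants to reproduce the difference without a file on disk, here is a minimal sketch. It assumes the data is comma-separated, as sep="," in the question suggests; CSV_TEXT is a hypothetical stand-in for h.csv, not the asker's actual file.

```python
import io

import pandas as pd

# Hypothetical stand-in for h.csv, with commas assumed as the delimiter.
CSV_TEXT = (
    "12,1.43E+19,This is first line,101010\n"
    "23,1.43E+19,This is the second line,202020\n"
    "34,1.43E+19,This is the third line,303030\n"
)

# header=0: the first line is consumed as column labels,
# so only the remaining two lines are treated as data.
with_header = pd.read_csv(io.StringIO(CSV_TEXT), header=0, index_col=0)

# header=None: every line is data, and the columns get integer labels.
no_header = pd.read_csv(io.StringIO(CSV_TEXT), header=None, index_col=0)

print(len(with_header), "data rows with header=0")
print(len(no_header), "data rows with header=None")
```

Counting the rows makes the missing first line easy to spot: with header=0 it has become the column labels rather than a data row.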

You can also specify column names using names:

>>> pd.read_csv("h.csv", names=["Number", "Line", "Code"], index_col=0)
          Number                     Line    Code
12  1.430000e+19       This is first line  101010
23  1.430000e+19  This is the second line  202020
34  1.430000e+19   This is the third line  303030
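
A variation on the names approach, sketched under the same comma-separated assumption: give one label per column in the file, including the one used as the index, and refer to the index column by its label. The labels "Id", "Number", "Line" and "Code" are made up for illustration, and CSV_TEXT is again a hypothetical stand-in for the file.

```python
import io

import pandas as pd

# Hypothetical stand-in for h.csv, with commas assumed as the delimiter.
CSV_TEXT = (
    "12,1.43E+19,This is first line,101010\n"
    "23,1.43E+19,This is the second line,202020\n"
    "34,1.43E+19,This is the third line,303030\n"
)

# One illustrative label per column in the file, index column included.
column_names = ["Id", "Number", "Line", "Code"]

# Passing names without header means no line of the file is used as a header,
# so the first line stays in the data.
df = pd.read_csv(io.StringIO(CSV_TEXT), names=column_names, index_col="Id")

# The first row can now be looked up by its index value.
print(df.loc[12, "Line"])
```

Naming every column keeps the mapping between labels and file columns explicit, which may be easier to follow than relying on how pandas handles a names list that is shorter than the number of columns.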

PS: Since you're using sep="," but the file you showed doesn't have any commas, I'm assuming that you removed them for some reason when asking the question. If that's right, please don't: no one's afraid of commas, and it simply means that other people have to guess where to put them back in if they want to test your code.

Upvotes: 1
