Reputation: 23
I'm trying to read a CSV file with pandas, but it doesn't seem to be read correctly.
Code:
pd.read_csv(data_file_path, sep=",", index_col=0, header=0, dtype = object)
For example, my data (in the CSV file) is:
12 1.43E+19 This is first line 101010
23 1.43E+19 This is the second line 202020
34 1.43E+19 This is the third line 303030
I am trying to read it with the first column as the index.
Output:
1.43E+19 This is first line 101010
12
23 1.43E+19 This is the second line 202020
34 1.43E+19 This is the third line 303030
Output without making 1st column as index:
12 1.43E+19 This is first line 101010
0 23 1.43E+19 This is the second line 202020
1 34 1.43E+19 This is the third line 303030
Because of this, any further processing of the data ignores the first row.
Upvotes: 2
Views: 2808
Reputation: 353604
I think you're confusing header=0, which means "use the 0-th row as the header", with header=None, which means "don't read a header from the file".
Compare:
>>> pd.read_csv("h.csv", header=0, index_col=0)
1.43E+19 This is first line 101010
12
23 1.430000e+19 This is the second line 202020
34 1.430000e+19 This is the third line 303030
>>> pd.read_csv("h.csv", header=None, index_col=0)
1 2 3
0
12 1.430000e+19 This is first line 101010
23 1.430000e+19 This is the second line 202020
34 1.430000e+19 This is the third line 303030
You can also specify column names using names:
>>> pd.read_csv("h.csv", names=["Number", "Line", "Code"], index_col=0)
Number Line Code
12 1.430000e+19 This is first line 101010
23 1.430000e+19 This is the second line 202020
34 1.430000e+19 This is the third line 303030
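If you want to check the difference without a file on disk, here is a minimal, self-contained sketch. It assumes the data really is comma-separated (the commas are my guess at where they belong), and the CSV_TEXT constant, the read_sample helper, and the "Id" index name are made up for illustration. Unlike the call above, it passes one name per column, including the index column.

```python
import io

import pandas as pd

# Assumed comma-separated form of the sample rows from the question.
CSV_TEXT = (
    "12,1.43E+19,This is first line,101010\n"
    "23,1.43E+19,This is the second line,202020\n"
    "34,1.43E+19,This is the third line,303030\n"
)


def read_sample(**kwargs):
    """Parse the in-memory sample, using the first column as the index."""
    return pd.read_csv(io.StringIO(CSV_TEXT), index_col=0, **kwargs)


# header=0: the first row is consumed as column labels, so only two data rows remain.
with_header = read_sample(header=0)

# header=None: no row is treated as a header, so all three rows are kept as data.
no_header = read_sample(header=None)

# names=...: supplying labels also keeps every row; "Id" labels the index column.
named = read_sample(names=["Id", "Number", "Line", "Code"])

print(with_header.shape, no_header.shape, named.shape)
```

The first frame has one row fewer than the other two, which is the "missing first row" from the question.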
PS: Since you're using sep="," but the file you showed doesn't have any commas, I'm assuming that you removed them for some reason when asking the question. If that's right, please don't: no one's afraid of commas, and it simply means that other people have to guess where to put them back in if they want to test your code.
Upvotes: 1