Reputation: 21
I started learning this today, so please forgive my ignorance.
My data is in a CSV file and, as described in the title, I would like to exclude the first and third rows while keeping the second row as the header. The CSV looks like this:
"Title"
Date, time, count, hours, average
"empty row"
The data set starts in the row following the empty row.
Upvotes: 0
Views: 617
Reputation: 862511
Use the parameter header=1 in read_csv to take the column names from the second row. Nothing else is needed, because empty rows are excluded by default:
from io import StringIO
import pandas as pd
temp = """Title
Date,time,count,hours,average
2015-01-01,25:02:10,10,20,15"""
# pd.compat.StringIO was removed from recent pandas releases; io.StringIO works everywhere
# after testing, replace StringIO(temp) with 'filename.csv'
df = pd.read_csv(StringIO(temp), header=1)
print(df)
Date time count hours average
0 2015-01-01 25:02:10 10 20 15
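To check this against the layout in the question (title row, header row, empty row, then data), here is a minimal sketch. The sample values and the name raw are made up for illustration; only header=1 and the default skip_blank_lines=True come from the answer above.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample mirroring the question's layout:
# title row, header row, empty row, then data rows.
raw = """Title
Date,time,count,hours,average

2015-01-01,10:02:10,10,20,15
2015-01-02,11:15:00,12,18,15
"""

# header=1 takes the column names from the second line; the title line
# above it is discarded, and the blank line is dropped because
# skip_blank_lines defaults to True.
df = pd.read_csv(StringIO(raw), header=1)

print(df.columns.tolist())
print(len(df))
```

If the file really contains an empty third row, no extra argument should be needed for it.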
Upvotes: 0
Reputation: 164623
Use the skiprows parameter of pd.read_csv:
from io import StringIO
import pandas as pd
x = StringIO("""Title
Date, time, count, hours, average

2018-01-01, 15:23, 16, 10, 5.5
2018-01-02, 16:33, 20, 5, 12.25
""")
# replace x with 'file.csv'
# skiprows uses 0-based file line numbers: 0 is the title, 2 is the empty row
df = pd.read_csv(x, skiprows=[0, 2])
print(df)
Date time count hours average
0 2018-01-01 15:23 16 10 5.50
1 2018-01-02 16:33 20 5 12.25
In fact, skiprows=[0] suffices, since empty rows are excluded by default (the default is skip_blank_lines=True).
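A small sketch to compare the two calls, assuming the file has the same layout as in the question (the name raw and the sample values are illustrative only). It also shows skipinitialspace=True, which is not part of the answer above but may be worth considering when the header has a space after each comma, since otherwise the column names keep that leading space.

```python
from io import StringIO

import pandas as pd

# Illustrative data: title row, header row, empty row, then data rows.
raw = """Title
Date, time, count, hours, average

2018-01-01, 15:23, 16, 10, 5.5
2018-01-02, 16:33, 20, 5, 12.25
"""

# Skip only the title line; the blank line is dropped by default.
df_a = pd.read_csv(StringIO(raw), skiprows=[0])

# Skipping the blank line explicitly as well should give the same frame.
df_b = pd.read_csv(StringIO(raw), skiprows=[0, 2])
print(df_a.equals(df_b))

# Without skipinitialspace, names such as ' time' keep the leading space.
print(df_a.columns.tolist())

# skipinitialspace=True strips whitespace that follows each delimiter.
df_c = pd.read_csv(StringIO(raw), skiprows=[0], skipinitialspace=True)
print(df_c.columns.tolist())
```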
Upvotes: 3