Reputation: 21
I have some txt files that start with a lot of BS, and the useful part only begins after 20 to 30 lines. I want to use the last line before the numbers as my header. If I knew the exact line number, I could set it as the header with pd.read_csv, but that number is different for each file (as I said, it is somewhere between 20 and 30). I do know that the line I am looking for starts with "Potential". Is there an easy way to make pd.read_csv use that line as the header from the start?
Upvotes: 2
Views: 226
Reputation: 57033
You can read the top of the file using "traditional" file I/O methods and count the rows until you find the header row. Once you know its number, reread the file with pandas.read_csv(), passing that number as skiprows.
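import pandas as pd

# Scan for the header line; n ends up as its zero-based index.
with open(yourfile) as infile:
    for n, row in enumerate(infile):
        if row.startswith("Potential"):
            break

# Skip the n lines above the header so read_csv treats that line as the header.
df = pd.read_csv(yourfile, skiprows=n)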
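One caveat with the loop above: if no line starts with "Potential", n is left pointing at the last line of the file and read_csv quietly returns the wrong thing. A minimal sketch of the same scan wrapped in a helper that fails loudly instead; the name find_header_row and its prefix parameter are made up for illustration and are not part of pandas:

```python
def find_header_row(path, prefix="Potential"):
    """Return the zero-based index of the first line starting with `prefix`.

    Hypothetical helper: raises ValueError if no such line exists,
    rather than silently falling through to the last line.
    """
    with open(path) as infile:
        for n, row in enumerate(infile):
            if row.startswith(prefix):
                return n
    raise ValueError(f"no line starting with {prefix!r} in {path}")
```

You would then call it as pd.read_csv(yourfile, skiprows=find_header_row(yourfile)). If the columns are not comma-separated, you will probably also need to pass sep to read_csv.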
Upvotes: 5