Reputation: 605
I have excel files in following format:
Sensor 1 meta
Sensor 2 meta
"Summary of Observation"
Sensor 1
Sensor 2
The number of rows before and after "Summary of Observation" is not fixed (i.e one file may have only sensor 1,2 while other may have 1,2,3....)
In dataframe, I only want information after "Summary of Observation")
Right now, I open the excel file, note the row from which I want information and parse it in
df = pd.read_excel("1.xlsx",skiprows = %put some value here%)
Is there a way to automate this, i.e. I don't want to open excel. Rather only import relevant rows (or delete them after importing).
Upvotes: 1
Views: 1197
Reputation: 13426
After importing the file you can find index
and select a data from that point.
# I used column name as `text` you can replace it with yours
idx = df[df['text']=='Summary of Observation'].index[0]
df = df[idx+1:]
print(df)
Output:
text
3 Sensor 1
4 Sensor 2
Or if you want to include Summary of Observation just use idx
in place of idx+1
Upvotes: 2
Reputation: 151
you can open the excel and use df.loc[df[0]=="Summary of Observation"].index[0]
to get the index
Working code at https://github.com/gklc811/Python3.6/blob/master/stackoverflowsamples/excel.ipynb
Upvotes: 1