Delete Rows prior to specific String in Pandas

Question

I have excel files in following format:

Sensor 1 meta
Sensor 2 meta
"Summary of Observation"
Sensor 1
Sensor 2

The number of rows before and after "Summary of Observation" is not fixed (i.e one file may have only sensor 1,2 while other may have 1,2,3....)

In dataframe, I only want information after "Summary of Observation")

Right now, I open the excel file, note the row from which I want information and parse it in

df = pd.read_excel("1.xlsx",skiprows = %put some value here%)

Is there a way to automate this, i.e. I don't want to open excel. Rather only import relevant rows (or delete them after importing).

Sociopath · Accepted Answer

After importing the file you can find index and select a data from that point.

# I used column name as `text` you can replace it with yours
idx = df[df['text']=='Summary of Observation'].index[0]

df = df[idx+1:]
print(df)

Output:

                    text                                                                                                                                                                                                                         
3                Sensor 1                                                                                                             
4                Sensor 2

Or if you want to include Summary of Observation just use idx in place of idx+1

Delete Rows prior to specific String in Pandas

Answers (2)

Related Questions