Martan
Martan

Reputation: 605

Delete Rows prior to specific String in Pandas

I have excel files in following format:

Sensor 1 meta
Sensor 2 meta
"Summary of Observation"
Sensor 1
Sensor 2

The number of rows before and after "Summary of Observation" is not fixed (i.e one file may have only sensor 1,2 while other may have 1,2,3....)

In dataframe, I only want information after "Summary of Observation")

Right now, I open the excel file, note the row from which I want information and parse it in

df = pd.read_excel("1.xlsx",skiprows = %put some value here%)

Is there a way to automate this, i.e. I don't want to open excel. Rather only import relevant rows (or delete them after importing).

Upvotes: 1

Views: 1197

Answers (2)

Sociopath
Sociopath

Reputation: 13426

After importing the file you can find index and select a data from that point.

# I used column name as `text` you can replace it with yours
idx = df[df['text']=='Summary of Observation'].index[0]

df = df[idx+1:]
print(df)

Output:

                    text                                                                                                                                                                                                                         
3                Sensor 1                                                                                                             
4                Sensor 2 

Or if you want to include Summary of Observation just use idx in place of idx+1

Upvotes: 2

Gokul C
Gokul C

Reputation: 151

you can open the excel and use df.loc[df[0]=="Summary of Observation"].index[0] to get the index Working code at https://github.com/gklc811/Python3.6/blob/master/stackoverflowsamples/excel.ipynb

Upvotes: 1

Related Questions