Kiran
Kiran

Reputation: 55

Read Excel file and skip empty rows

I have a excel sheet containing data as shown below, It contains data column wise.

Rahul     e34   Pradeep  e44  Azhar  t54  
Venkat    r45   Akash    e14  Vipul  r15  Fairo   e45 
Akshay    e44   
Pavan     e24   Asad     t14

When i run the below code

import pandas as pd
import numpy as np 
df = pd.read_excel (r'C:\\Users\\Kiran\\Desktop\\Data\\Output1.xlsx')
df=pd.DataFrame(np.reshape(df.to_numpy(),(-1,2)))
df.to_excel("Output2.xlsx")

I get output as

     0           1
0   Rahul       e34   
1   Pradeep     e44  
2   Azhar       t54  
3       
4   Venkat      r45   
5   Akash       e14  
6   Vipul       r15  
7   Fairo       e45 
8   Akshay      e44    
9       
10      
11      
12  Pavan       e24   
13  Asad        t14
14      
15     

I need output as shown below

Rahul       e34   
Pradeep     e44  
Azhar       t54     
Venkat      r45   
Akash       e14  
Vipul       r15  
Fairo       e45 
Akshay      e44    
Pavan       e24   
Asad        t14

Row 1 (0,1) and 1st Column (0,1,2,3,4.....15) should not appear along with how to remove empty spaces. Can anyone please guide me. Appreciate your help.

Upvotes: 0

Views: 5026

Answers (1)

Kaustubh Lohani
Kaustubh Lohani

Reputation: 655

You can use the index & header argument in the df.to_excel(). Also use header=None while reading excel into pandas DF. If you don't you will be missing an observation in the output file.

Here is the working code:

import pandas as pd
import numpy as np 

df = pd.read_excel ('input.xlsx',header=None)
df=pd.DataFrame(np.reshape(df.to_numpy(),(-1,2)))

df.dropna(axis=0,inplace=True)
clean_df=df[:].astype('str').apply(lambda x: x.str.strip())

print(clean_df)

df.to_excel("Output.xlsx",index=False,header=False)

Upvotes: 1

Related Questions