Reputation: 55
I have a excel sheet containing data as shown below, It contains data column wise.
Rahul e34 Pradeep e44 Azhar t54
Venkat r45 Akash e14 Vipul r15 Fairo e45
Akshay e44
Pavan e24 Asad t14
When i run the below code
import pandas as pd
import numpy as np
df = pd.read_excel (r'C:\\Users\\Kiran\\Desktop\\Data\\Output1.xlsx')
df=pd.DataFrame(np.reshape(df.to_numpy(),(-1,2)))
df.to_excel("Output2.xlsx")
I get output as
0 1
0 Rahul e34
1 Pradeep e44
2 Azhar t54
3
4 Venkat r45
5 Akash e14
6 Vipul r15
7 Fairo e45
8 Akshay e44
9
10
11
12 Pavan e24
13 Asad t14
14
15
I need output as shown below
Rahul e34
Pradeep e44
Azhar t54
Venkat r45
Akash e14
Vipul r15
Fairo e45
Akshay e44
Pavan e24
Asad t14
Row 1 (0,1) and 1st Column (0,1,2,3,4.....15) should not appear along with how to remove empty spaces. Can anyone please guide me. Appreciate your help.
Upvotes: 0
Views: 5026
Reputation: 655
You can use the index
& header
argument in the df.to_excel()
.
Also use header=None
while reading excel into pandas DF. If you don't you will be missing an observation in the output file.
Here is the working code:
import pandas as pd
import numpy as np
df = pd.read_excel ('input.xlsx',header=None)
df=pd.DataFrame(np.reshape(df.to_numpy(),(-1,2)))
df.dropna(axis=0,inplace=True)
clean_df=df[:].astype('str').apply(lambda x: x.str.strip())
print(clean_df)
df.to_excel("Output.xlsx",index=False,header=False)
Upvotes: 1