Reputation: 23
Let's say I have an excel file as below:
If you are not able to view the image link, the excel file is basically like this:
Name XXX
Age XXX
Gender XXX
Job XXX
Location XXX
... ...
... ...
(Repeat the list of "Name, Age, Gender, Job, Location".) There are at least 1000 or more lists in the file. "XXX" represent the content correspond to each field of the list.
I searched online about row-to-column conversion, and I found the paste function called 'transpose' within the excel. However, it is very simple as it converts all the rows into columns.
What I want to do is to have only 5 columns, with column names of Name, Age, Gender, Job and Location.
Upvotes: 2
Views: 5798
Reputation: 6098
import numpy as np
import pandas as pd
import openpyxl
dfs = pd.read_excel('input.xlsx', sheet_name=None,header=None)
tester=dfs['Sheet1'].values.tolist()
keys = list(zip(*tester))[0]
seen = set()
seen_add = seen.add
keysu= [x for x in keys if not (x in seen or seen_add(x))]
values = list(zip(*tester))[1]
a = np.array(values).reshape(int(len(values)/len(keysu)),len(keysu))
list1=[keysu]
for i in a:
list1.append(list(i))
df=pd.DataFrame(list1)
df.to_excel('output.xlsx',index=False,header=False)
input.xlsx
Output.xlsx
Upvotes: 1