Reputation:
I have an excel spreadsheet with 2 columns that I am attempting to convert to a 2d array with the following code:
#!/usr/bin/python3
import openpyxl
import sys
import os
book = openpyxl.load_workbook('contact2019.xlsx')
sheet = book.active
first_sheet = book.get_sheet_names()[0]
worksheet = book.get_sheet_by_name(first_sheet)
excel_data = [[0 for x in range(2)] for y in range(1)]
print(len(excel_data))
first = 0
cell_num = 0
for row in range(2,worksheet.max_row+1):
for column in "AB": #Here you can add or reduce the columns
cell_name = "{}{}".format(column, row)
excel_data.append(worksheet[cell_name].value)
print(excel_data)
My issue is that the data is simply being added to a 1d array sequentially, so my output of this is as follows:
['Sam Adams', '*******@gmail.com']
This is not a 2d array which I have initialized.
Input data is as follows:
Sam Adams **********@gmail.com
Sammy Adams **********@gmail.com
Samuel Adams **********@gmail.com
Samantha Adams **********@gmail.com
Sam Adams **********@gmail.com
Why is this assigning data sequentially instead of two per row?
Upvotes: 1
Views: 1966
Reputation: 62483
pandas
:read_excel
to_numpy
import pandas as pd
import numpy as np
df = pd.read_excel('test.xlsx') # change the name of the file as needed
# dataframe
name email
Sam Adams **********@gmail.com
Sammy Adams **********@gmail.com
Samuel Adams **********@gmail.com
Samantha Adams **********@gmail.com
Sam Adams **********@gmail.com
excel_data = df.to_numpy()
print(excel_data)
# Output
array([['Sam Adams', '**********@gmail.com'],
['Sammy Adams', '**********@gmail.com'],
['Samuel Adams', '**********@gmail.com'],
['Samantha Adams', '**********@gmail.com'],
['Sam Adams', '**********@gmail.com']], dtype=object)
Upvotes: 1
Reputation: 2919
Welcome to SO!
Your code below iterates through and adds each item as a separate item, hence you get everything in sequential order.
for row in range(2,worksheet.max_row+1):
for column in "AB": #Here you can add or reduce the columns
cell_name = "{}{}".format(column, row)
excel_data.append(worksheet[cell_name].value)
Rather than looping through first the rows and then all the columns, you could replace the code to only loop through the rows.
for row in range(2,worksheet.max_row+1):
excel_data.append([worksheet["A{}".format(row)].value,worksheet["B{}".format(row)].value])
By doing this you are now creating a list of 2 element lists (or a 2d array).
If you aren't set on doing it this way, you can also consider the python library pandas
which abstracts out a lot of the work and lets you work with spreadsheet like objects called DataFrames
.
Upvotes: 0