Reputation:
My excel file looks like this:
X_0 X_1 y
0 2917.211242 3289.522533 0
1 1888.937716 781.528356 0
2 4188.521414 1554.476261 0
3 8145.555339 9804.066728 0
4 9584.488981 6176.337189 0
I'm trying to read the data to look like this (i.e. each row in a list within a big list, don't mind the actual entries in the example):
[[3.393533211,2.331273381,0],
[3.110073483,1.781539638,0],
[1.343808831,3.368360954,0],
[3.582294042,4.67917911,0],
[2.280362439,2.866990263,0],
[7.423436942,4.696522875,1],
[5.745051997,3.533989803,1],
[9.172168622,2.511101045,1],
[7.792783481,3.424088941,1],
[7.939820817,0.791637231,1]]
What I've tried so far is this:
import pandas as pd
df = pd.read_excel("nameOfMyDatabase.xlsx")
list1 = list(df['X_0'])
list2 = list(df['X_1'])
list3 = list(df['y'])
print(list1)
print(list2)
print(list3)
And this is the result I'm getting:
[2917.2112422149, 1888.93771568063, 4188.52141357231, 8145.55533888016, 9584.48898099062, 4039.63375666484, ....]
[3289.52253268053, 781.528355771809, 1554.47626074928, 9804.06672844197, 6176.33718886522, 167.607142550757, ....]
[0, 0, 0, 0, 0, 0, ....]
So not quite what i want yet but trial and error is not working for me. Any ideas?
Upvotes: 0
Views: 2652
Reputation: 1930
This is my suggestion using numpy and openpyxl:
import openpyxl
import numpy as np
wb = openpyxl.load_workbook('test.xlsx')
ws = wb['Sheet1']
lst = []
for v in ws.values:
lst.append(v)
print(lst)
values = = np.array(lst)
Upvotes: 0
Reputation: 41
You can try:
import pandas as pd
df = pd.read_excel("nameOfMyDatabase.xlsx")
data = df.values.tolist()
Upvotes: 0
Reputation: 7509
If you just want to parse data from Excel and don't care about using the pandas DataFrame itself, you should be using openpyxl
.
For example, for a spreadsheet like this:
and assuming that your desired spreadsheet is the first/only sheet in the workbook, you could do:
import openpyxl
wb = openpyxl.load_workbook("nameOfMyDatabase.xlsx")
ws = wb.active
cell_range = 'A2:C5'
data = [[cell.value for cell in row] for row in ws[cell_range]]
print(data)
# output: [[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]
Upvotes: 1
Reputation: 94
You could try using
big_list = []
for a, b in zip(df["x_0"], df["x_0"]):
list = []
list.append(a)
list.append(b)
big_list = big_list.append(list)
Upvotes: 0