White Night
White Night

Reputation: 45

How to create a list out of excel sheet values with Python?

I'm pretty new to all this python and Excel things. I have a really big excel sheet with 1 million rows, my goal is to create a list with a few for list (I think that is what I need to do) and the problem is the follow:

I have something like this:

enter image description here

I want to add to a list all the POINT_Y and POINT_X (gps coordinates) where the objectid is the same and when the objectid changes I want to add those values to another list.

I only manage to get the data from the excel, my problem is in the construction of the lists.

import pandas
from shapely.geometry import Point

polygon_sheet =pandas.read_excel("C:...\coords.xlsx")
for row in polygon_sheet.index:
    data = (polygon_sheet['POINT_Y'][row]) + ", " + \
           (polygon_sheet['POINT_X'][row])

I'm sure I need a nested for loop but I don't know how to obtain what I want, that is something like this:

id1 = [[41.15226404, 25.41327506], [41.15221095, 25.41363158], [41.15220779, 25.41395401], [...(all coords from id1)]]

id2= [[41.11617356, 25.53488120], [41.11541559, 25.53517417], [41.11494893, 25.53531324], [...(all coords from id2]]

And so on until all ID's are iterated.

Hope I could explain what I expect correctly. Appreciate any help

Upvotes: 0

Views: 288

Answers (2)

kederrac
kederrac

Reputation: 17322

you can apply a custom function over each row in order to create a dict with your desired output:

from collections import defaultdict

id_points = defaultdict(list)

def extrct_points(row):
    o_id, point_x, point_y = row
    id_points[o_id].append([point_x, point_y])

polygon_sheet.apply(extrct_points, axis=1)

for example:

polygon_sheet = pd.DataFrame({'OBJECTID': [1, 1,  2, 2, 2, 3], 'POINT_X':[0.23, 0.123, 0.1, 0.1, 0.2, 1],
                  'POINT_Y':[0.223, 0.3123, 0.11, 0.11, 0.22, 11]})

enter image description here

from pprint import pprint

polygon_sheet.apply(extrct_points, axis=1)

pprint(id_points)

output:

defaultdict(<class 'list'>,
            {1.0: [[0.23, 0.223], [0.123, 0.3123]],
             2.0: [[0.1, 0.11], [0.1, 0.11], [0.2, 0.22]],
             3.0: [[1.0, 11.0]]})

Upvotes: 1

Kostas Charitidis
Kostas Charitidis

Reputation: 3103

You can create a dict of lists.Try something like this:

import pandas
from shapely.geometry import Point

dt = {}
polygon_sheet =pandas.read_excel("C:...\coords.xlsx")
for row in polygon_sheet.index:
    try:
        dt[polygon_sheet['ID'][row]] += [[polygon_sheet['POINT_Y'][row], polygon_sheet['POINT_X'][row]]]
    except:
        dt.update({polygon_sheet['ID'][row]: [[polygon_sheet['POINT_Y'][row], polygon_sheet['POINT_X'][row]]]})

print(dt)

Upvotes: 1

Related Questions