Demaunt

Reputation: 1243

Python DataFrame or list for storing objects

Can I store instances of a class in a pandas Series/DataFrame or a NumPy ndarray, just as I do in a list? Or do these libraries support only built-in types (numerics, strings)?

For example, I have a Point class with x, y coordinates, and I want to store Points in a Plane that returns the Point at given coordinates.

# my class
class MyPoint:

    def __init__(self, x, y):
        # store the values in private attributes so the
        # read-only properties below do not recurse
        self._x = x
        self._y = y

    @property
    def x(self):
        return self._x

    @property
    def y(self):
        return self._y

Here I create instances:

first_point = MyPoint(1, 1)
second_point = MyPoint(2, 2)

I can store the instances in a list:

my_list = []
my_list.append(first_point)
my_list.append(second_point)

The problem with a list is that its indexes do not correspond to the x, y properties.

Dictionary/DataFrame approach:

import pandas as pd

Plane = {"x": [first_point.x, second_point.x], "y": [first_point.y, second_point.y], "some_reference/id_to_point_instance": ???}
Plane_pd = pd.DataFrame(Plane)

I've read posts saying that using the "id" of an instance as the third column's value in a DataFrame could cause problems with the garbage collector.

Upvotes: 29

Views: 24474

Answers (1)

Stephen Rauch

Reputation: 49814

A pandas.DataFrame will gladly store Python objects.

Some test code to demonstrate...

Test Code:

import pandas as pd

class MyPoint:
    def __init__(self, x, y):
        self._x = x
        self._y = y

    @property
    def x(self):
        return self._x

    @property
    def y(self):
        return self._y

my_list = [MyPoint(1, 1), MyPoint(2, 2)]
print(my_list)

plane_pd = pd.DataFrame([[p.x, p.y, p] for p in my_list],
                        columns=list('XYO'))
print(plane_pd.dtypes)
print(plane_pd)

Results:

[<__main__.MyPoint object at 0x033D2AF0>, <__main__.MyPoint object at 0x033D2B10>]

X     int64
Y     int64
O    object
dtype: object

   X  Y                                        O
0  1  1  <__main__.MyPoint object at 0x033D2AF0>
1  2  2  <__main__.MyPoint object at 0x033D2B10>

Notes:

Note that the two objects in the list are the same two objects in the DataFrame. Also note that the dtype of the O column is object.
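
As a rough check of the note above, the sketch below compares the list entries with the O column using `is`, then looks a point up by its coordinates. The helper `point_at` is a hypothetical name introduced only for this illustration; it is not part of pandas.

```python
import pandas as pd


class MyPoint:
    def __init__(self, x, y):
        self._x = x
        self._y = y

    @property
    def x(self):
        return self._x

    @property
    def y(self):
        return self._y


my_list = [MyPoint(1, 1), MyPoint(2, 2)]
plane_pd = pd.DataFrame([[p.x, p.y, p] for p in my_list],
                        columns=list('XYO'))

# The O column holds references to the instances, not copies,
# so an identity comparison against the list succeeds.
same_objects = all(a is b for a, b in zip(my_list, plane_pd['O']))


def point_at(frame, x, y):
    """Hypothetical helper: return the stored point at (x, y), or None."""
    matches = frame.loc[(frame['X'] == x) & (frame['Y'] == y), 'O']
    return matches.iloc[0] if len(matches) else None


found = point_at(plane_pd, 2, 2)
print(same_objects, found is my_list[1])
```

Because the column stores the instances themselves rather than their `id()` values, the DataFrame keeps a normal reference to each point, so the points stay alive for as long as the DataFrame does.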

Upvotes: 30
