Confusion over how to share and modify class level data between instances.

Question

I don't understand how 'static' class data works in Python. I'm attempting to replicate a pattern I used often in Java where I create a central faux-database class, and give instances of it to any classes that need it.

However, here's where I'm lost with Python: it seems like any time I try to access the 'static' variable, I create an instance-level one instead. It's very confusing.

Here's the code:

import csv 

class Database(object):
    data = list()
    def __init__(self):
        pass

    def connect(self, location=None):
        path = '.\\resources\\database\\'
        with open(path + 'info.csv', 'rU') as f:
            csv_data = csv.reader(f, delimiter=',')
            for row in csv_data:
                self.data.append(Entry(*row))

As you can see, it's pretty straightforward. I declare data outside of __init__, which I believe makes it class level. I then loop through a CSV file and append an Entry built from each row's cell fields to the data list.

Here's where I get confused. The first thing I wanted to do was slice off the first entry in the data list (it just holds the column names from the Excel sheet). But in doing this, it seems to create an instance-level version of data rather than modifying the class-level data that I want.

So, I modified the code as follows:

import csv 

class Database(object):
    data = list()
    def __init__(self):
        pass

    def connect(self, location=None):
        path = '.\\resources\\database\\'
        with open(path + 'info.csv', 'rU') as f:
            csv_data = csv.reader(f, delimiter=',')
            for row in csv_data:
                self.data.append(Entry(*row))
            self.data = self.data[1:]  ## <--- NEW LINE HERE

As you can see, I tried to slice it as I would any other list, but in doing so, it (seemingly) makes a local version of data.

In my main(), I have the following code to test out some of the attributes:

def main():
    settings = Settings()

    db = csv_loader.Database()
    db.connect(settings.database)
    print 'gui.py:', db.data[0].num
    print 'gui.py:', db.data[0].name
    print 'gui.py:', len(db.data)

This behaves just as expected. It has 143 entries, because of the previous slice operation.

Now, I have another class that holds its own instance of Database(), but there the slice had no effect.

self.db = Database()
self.settings = Settings()
print 'tabOne.py:', self.db.data[0].conf_num
print 'tabOne.py:', self.db.data[0].conf_name
print len(self.db.data)

The printout from this class shows that there are 144 elements in the list -- so the slice operation didn't actually modify the class level data.

What am I missing? Am I attempting to modify the class level variable incorrectly or something?

mgilson · Accepted Answer

You're basically right about how to go about doing this. However, list slicing creates a copy, so the line:

self.data = self.data[1:]

creates a copy of the data and then assigns it to a new instance attribute.

Here's how all of this works: when an attribute is looked up on an instance, Python first looks in the instance's __dict__ for the attribute. If it isn't there, Python next looks in the class's __dict__ and then in the parent classes' __dict__s (following the method resolution order). So, when you first start accessing self.data (to append to it), data isn't found in the instance's __dict__, so Python uses data from the class's __dict__. But when you explicitly assign to self.data, you're all of a sudden adding an entry to the instance's __dict__, which will be used from that point on.
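To make the lookup rules above concrete, here's a minimal sketch. It uses a stripped-down Database class with plain strings standing in for the CSV rows (both are stand-ins of mine, not the asker's real code) and checks the instance's __dict__ before and after the assignment:

```python
class Database(object):
    data = []  # class attribute, shared by every instance


a = Database()
b = Database()

# Appending only *looks up* the attribute: `data` is not in a.__dict__,
# so the class-level list is found and mutated in place.
a.data.extend(["header", "row1", "row2"])
shared_before = "data" in a.__dict__   # False: nothing stored on the instance

# Assignment is different: it always writes into the instance's __dict__.
a.data = a.data[1:]
shadowed_after = "data" in a.__dict__  # True: `a` now has its own list

print(len(a.data))         # 2, the sliced copy stored on the instance
print(len(b.data))         # 3, the untouched class-level list
print(len(Database.data))  # 3
```

This mirrors the 143 vs. 144 mismatch in the question: the instance that did the slicing sees its own shorter copy, while every other instance still sees the class-level list.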

There are a couple of workarounds:

Database.data = self.data[1:]

would work just fine, or you could use slice assignment to modify the self.data list in place:

self.data[:] = self.data[1:]

Or any other method which modifies the list in place:

self.data.pop(0)
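As a quick check that the in-place variants really do leave the shared list shared, here's a small sketch. The method names drop_header_slice and drop_header_pop are hypothetical, and strings again stand in for the Entry rows:

```python
class Database(object):
    data = []

    def drop_header_slice(self):
        # Slice assignment replaces the contents of the existing list object.
        self.data[:] = self.data[1:]

    def drop_header_pop(self):
        # pop(0) removes the first element from the existing list object.
        self.data.pop(0)


first = Database()
second = Database()
Database.data.extend(["header", "row1", "row2", "row3"])

first.drop_header_slice()
print(second.data)  # ['row1', 'row2', 'row3']

first.drop_header_pop()
print(second.data)  # ['row2', 'row3']

# No instance attribute was created along the way.
print("data" in first.__dict__)  # False
```

Neither method rebinds the name data, so the change made through first is visible through second and through Database.data.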

A final option, which is probably the most explicit of the bunch, is to change connect to a classmethod:

@classmethod
def connect(cls, location=None):
    path = '.\\resources\\database\\'
    with open(path + 'info.csv', 'rU') as f:
        csv_data = csv.reader(f, delimiter=',')
        for row in csv_data:
            cls.data.append(Entry(*row))
        cls.data = cls.data[1:]

Now the first argument to the method is the class, not the instance, so we've changed the customary self to an equally customary cls (although I've seen klass as well). This method can be invoked from an instance or from the class itself:

database_instance = Database()
database_instance.connect()

Database.connect()
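Here's a self-contained sketch of the same idea that you can run without the CSV file. The load method and its rows argument are hypothetical stand-ins for connect and the file contents:

```python
class Database(object):
    data = []

    @classmethod
    def load(cls, rows):
        # `cls` is the class itself, so this rebinding lands on
        # Database.data rather than on any single instance.
        cls.data = list(rows)[1:]


Database.load(["header", "row1", "row2"])
instance = Database()
print(instance.data)  # ['row1', 'row2']

# Invoked through an instance, `cls` is still the class.
instance.load(["header", "row3"])
print(Database.data)                # ['row3']
print("data" in instance.__dict__)  # False
```

One thing to keep in mind: if the method is called on a subclass, cls is that subclass, so the assignment would create a separate data attribute on the subclass rather than updating Database.data.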

In the comments, there is mention of using a module for this sort of thing. A module also allows state to be transported throughout your program very simply, and modules end up acting a lot like a singleton -- in fact, they're often recommended in Python instead of singletons:

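"""
Module `Database` (found in Database.py)
"""
import csv

data = []

def connect(location=None):
    # A module-level function takes no `self`; `global` lets the
    # final assignment rebind the module-level name.
    global data
    path = '.\\resources\\database\\'
    with open(path + 'info.csv', 'rU') as f:
        csv_data = csv.reader(f, delimiter=',')
        for row in csv_data:
            data.append(Entry(*row))  # Entry comes from the asker's code
        data = data[1:]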

Now in another module you just:

import Database
Database.connect()
print Database.data

etc. etc.
