Kartik Rajan

Reputation: 45

Storage of dataframes and variables defined inside a method in Python

If a method of a class creates a data frame when it is called on an object of that class, does the data frame persist after the method finishes executing?

Taking the code below as an example:

import pandas as pd

class some_class():
    def some_method(self):
        # some_data is a local variable; it is never assigned to self
        some_data = pd.DataFrame({"a": [1, 2, 3, 4],
                                  "b": [5, 6, 7, 8]})
        return some_data

a = some_class()
b = a.some_method()

After the execution of the call to a.some_method() will the dataframe be stored in the object?

I want to be able to create multiple objects and use them to return data based on the methods defined in those objects. My concern is that if the object stores the data as well, then in effect I'll be storing the same data twice (in data frame b and in the object a, as in the example above).

Upvotes: 2

Views: 800

Answers (2)

davidA

Reputation: 13664

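In your code, some_data is a local variable, so nothing is stored in the object: once the method returns, only b refers to the data frame. If you want to store a value inside an instance, then a method must assign it to an attribute of self. For example: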

import pandas as pd

class some_class():
    def some_method(self):
        # Assigning to self.some_data keeps a reference on the instance
        self.some_data = pd.DataFrame({"a": [1, 2, 3, 4],
                                       "b": [5, 6, 7, 8]})
        return self.some_data

a = some_class()
b = a.some_method()

This will store a "label" to the data within your instance of some_class (which you should capitalize as SomeClass btw if you want to follow the popular convention) with the label some_data. The variable b is also an alias to this data - both a.some_data and b refer to the exact same data. There is no copy.
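A quick way to check what an instance actually holds is to look at vars(obj). Here is a minimal sketch, assuming two illustrative classes (LocalOnly and StoresOnSelf are made-up names, not from the question) and a plain list standing in for the data frame so it runs without pandas:

```python
class LocalOnly:
    def some_method(self):
        # Local name only: nothing is attached to the instance.
        some_data = [1, 2, 3, 4]
        return some_data


class StoresOnSelf:
    def some_method(self):
        # Assigning to self keeps a reference on the instance.
        self.some_data = [1, 2, 3, 4]
        return self.some_data


x = LocalOnly()
x_result = x.some_method()
print(vars(x))  # {}

y = StoresOnSelf()
y_result = y.some_method()
print(vars(y))  # {'some_data': [1, 2, 3, 4]}
print(y.some_data is y_result)  # True: one object, two references
```

The same check should work with a data frame in place of the list, since vars() only reports which attributes the instance has.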

This is useful and saves memory but you need to be aware that you're working with labels (references) to the same data. If you want a.some_data and b to be separate instances of data, you'll need to explicitly copy the data.
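To illustrate the difference between sharing and copying, here is a small sketch using DataFrame.copy(), which makes a deep copy by default. The class is renamed SomeClass here and the variable c is introduced purely for illustration:

```python
import pandas as pd


class SomeClass:
    def some_method(self):
        self.some_data = pd.DataFrame({"a": [1, 2, 3, 4],
                                       "b": [5, 6, 7, 8]})
        return self.some_data


a = SomeClass()
b = a.some_method()

# b and a.some_data are the same object, so a change through one
# name is visible through the other.
b.loc[0, "a"] = 100
print(a.some_data.loc[0, "a"])  # 100

# An explicit copy is independent of the original.
c = a.some_data.copy()
c.loc[0, "a"] = -1
print(a.some_data.loc[0, "a"])  # still 100
```

The copy costs extra memory, so it is only worth making when the two data frames really need to diverge.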

Python variables behave differently from variables in many other popular languages. The name of the variable, e.g. b, is really just a label attached to some value. Therefore if you assign c = b, you haven't copied the data, you've simply attached a new label to the original value. For immutable types like ints, floats and strings, this isn't much different from copying the value, but for mutable types (lists, dicts, data frames, etc.) you need to be aware that you're dealing with labels.
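The label behaviour can be seen with built-in types alone. A minimal sketch (all variable names here are just for illustration):

```python
b = [1, 2, 3]
c = b              # second label on the same list, no copy made
c.append(4)
print(b)           # [1, 2, 3, 4]
print(c is b)      # True

x = 10
y = x
y += 1             # ints are immutable: y is rebound, x is untouched
print(x, y)        # 10 11

d = list(b)        # explicit (shallow) copy of the list
d.append(5)
print(b)           # [1, 2, 3, 4]
```

Note that list(b) copies only the outer list; nested mutable items would still be shared, which is where copy.deepcopy comes in.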

Upvotes: 2

U13-Forward

Reputation: 71600

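Your class won't store the data frame in the object, since some_data is a local variable and is never assigned to self. If the method doesn't need any instance state, you can drop self and call it on the class directly: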

import pandas as pd

class some_class():
    @staticmethod
    def some_method():
        # No self: the method uses no instance state
        some_data = pd.DataFrame({"a": [1, 2, 3, 4],
                                  "b": [5, 6, 7, 8]})
        return some_data

print(some_class.some_method())

Output:

   a  b
0  1  5
1  2  6
2  3  7
3  4  8

Upvotes: 1
