noskule

Reputation: 53

Global variable as class init argument

I would like to add rows to a global dataframe from within different class instances. Can I somehow pass the global dataframe as an argument when creating a class instance? In the code below, the dataframe held by each instance ends up as a local copy, while I would like to change the global df.

I could pass the global dataframe as an argument to the add_row function directly, and that would work, but I would like to avoid that.

I'm not sure if this is the right way to do it anyway. My goal is to be able to change the same dataframe from within different classes.

import pandas as pd

history_1 = pd.DataFrame()

class ClassA:
    def __init__(self,  history):
        self.history = history

    def add_row(self, row):
        self.history = pd.concat([self.history, pd.DataFrame([row])])


class ClassB:
    def __init__(self,  history):
        self.history = history

    def add_row(self, row):
        self.history = pd.concat([self.history, pd.DataFrame([row])])


class_a = ClassA(history_1)
new_row = {'r1':1, 'r2':2, 'r3':3}
class_a.add_row(new_row)

class_b = ClassB(history_1)
new_row = {'r1':1, 'r2':2, 'r3':3}
class_b.add_row(new_row)

Upvotes: 1

Views: 307

Answers (1)

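I'm not sure what your use case is, but concat returns a new object and is not an in-place operation, so assigning its result to self.history only rebinds that attribute and leaves the global history_1 unchanged. To modify the dataframe history_1 in place, you can try the following approach: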

import pandas as pd

history_1 = pd.DataFrame()

class ClassA:
    def __init__(self,  history):
        self.history = history

    def add_row(self, row):
        addition = pd.DataFrame([row])
        self.history[addition.columns] = addition


class ClassB:
    def __init__(self,  history):
        self.history = history

    def add_row(self, row):
        addition = pd.DataFrame([row])
        self.history[addition.columns] = addition


class_a = ClassA(history_1)
new_row = {'r1':1, 'r2':2, 'r3':3}
class_a.add_row(new_row)

class_b = ClassB(history_1)
new_row = {'r1':1, 'r2':2, 'r3':3}
class_b.add_row(new_row)

print(history_1)
# >>>    r1  r2  r3
# >>> 0   1   2   3
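
The point about concat returning a new object can be checked with an identity test. This is only a small sketch: the class name Recorder and the seeded first row are assumptions made for the demonstration, not part of the question's code.

```python
import pandas as pd

# Seeded with one row purely for this demo (an assumption, not in the question).
history_1 = pd.DataFrame([{"r1": 0, "r2": 0, "r3": 0}])


class Recorder:  # hypothetical name, mirrors ClassA from the question
    def __init__(self, history):
        # At this point self.history and history_1 are the same object.
        self.history = history

    def add_row(self, row):
        # pd.concat builds a new DataFrame; the assignment rebinds
        # self.history and does not touch the object history_1 refers to.
        self.history = pd.concat(
            [self.history, pd.DataFrame([row])], ignore_index=True
        )


recorder = Recorder(history_1)
same_before = recorder.history is history_1
recorder.add_row({"r1": 1, "r2": 2, "r3": 3})
same_after = recorder.history is history_1

print(same_before, same_after)
print(len(history_1), len(recorder.history))
```

After add_row, the instance holds a longer dataframe while history_1 keeps its original length, which is the behaviour described in the question.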

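Note that the code above assigns columns, so the second call overwrites row 0 instead of adding a new row. Since you're trying to add rows, and not columns, you would use the following operation: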

self.history.append(addition)

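EDIT: Upon reviewing the question a bit more, the .append() function is deprecated (it was removed in pandas 2.0), and it also returns a new object rather than modifying the dataframe in place. In this case, if you want to add rows to a dataframe in place, you can use the following approach: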

import pandas as pd

cols = ['r1', 'r2', 'r3']
history_1 = pd.DataFrame(columns = cols)

class ClassA:
    def __init__(self,  history):
        self.history = history

    def add_row(self, row):
        self.history.loc[self.history.shape[0]] = [row.get(i) for i in cols]

class ClassB:
    def __init__(self,  history):
        self.history = history

    def add_row(self, row):
        self.history.loc[self.history.shape[0]] = [row.get(i) for i in cols]

class_a = ClassA(history_1)
new_row = {'r1':1, 'r2':2, 'r3':3}
class_a.add_row(new_row)

class_b = ClassB(history_1)
new_row = {'r1':1, 'r2':2, 'r3':3}
class_b.add_row(new_row)

print(history_1)
# >>>    r1  r2  r3
# >>> 0   1   2   3
# >>> 1   1   2   3
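
If you would rather keep using pd.concat, one possible alternative is to share a small holder object instead of the dataframe itself, so that rebinding happens on an attribute every class can see. The sketch below is only an illustration under that assumption: HistoryStore and its df attribute are hypothetical names, not something from pandas or from the question.

```python
import pandas as pd


class HistoryStore:
    """Hypothetical holder that owns the shared DataFrame."""

    def __init__(self):
        self.df = pd.DataFrame()

    def add_row(self, row):
        new = pd.DataFrame([row])
        # Rebinding self.df is fine here: every user holds the store,
        # not the DataFrame, so they all see the new object.
        if self.df.empty:
            self.df = new
        else:
            self.df = pd.concat([self.df, new], ignore_index=True)


class ClassA:
    def __init__(self, store):
        self.store = store

    def add_row(self, row):
        self.store.add_row(row)


class ClassB:
    def __init__(self, store):
        self.store = store

    def add_row(self, row):
        self.store.add_row(row)


store = HistoryStore()
ClassA(store).add_row({"r1": 1, "r2": 2, "r3": 3})
ClassB(store).add_row({"r1": 4, "r2": 5, "r3": 6})

print(store.df)
```

Both classes write through the same store, so store.df contains the rows added by either of them.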

Final edit: Apparently self.history.loc[self.history.shape[0]] is faster than self.history.loc[len(self.history)], according to "How to add an extra row to a pandas dataframe".

Upvotes: 2
