James Cook

Reputation: 344

Using Classes in Pandas

I'm trying to write cleaner, more readable code, and I've started using classes. It's confusing so far, but I can see the benefits. That said, I'm simply trying to merge two DataFrames.

Previously I achieved this with:

import pandas as pd

path = 'path/to/file.xlsx'

df  = pd.read_excel(path, 'sheet1')
df2 = pd.read_excel(path, 'sheet2')

df3 = df.merge(df2, how = 'left', on = 'column1')

Trying to implement this using classes, I have the following so far, which could be incorrect:

import pandas as pd 

path = 'path/to/file.xlsx'

class CreateOpFile:
    def __init__(self, path):
        self.df = pd.read_excel(path, 'sheet1')
        self.df2 = pd.read_excel(path, 'sheet2')

    def MergeDataFrames(self):
        pd.merge(self.df, self.df2, how = 'left', on= 'column1')

So I'm confused about how to create a new variable, say df3, outside of the class CreateOpFile, as I did with df3 = df.merge(df2, how = 'left', on = 'column1') in the first approach.

Upvotes: 1

Views: 1996

Answers (1)

Muhteva

Reputation: 2832

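In your version, MergeDataFrames computes the merge but discards the result, because the method has no return statement. One way to fix this is to return the merged DataFrame from the method and assign it to df3 outside of the class.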

import pandas as pd 

path = 'path/to/file.xlsx'

class CreateOpFile:
    def __init__(self, path):
        self.df = pd.read_excel(path, 'sheet1')
        self.df2 = pd.read_excel(path, 'sheet2')

    def MergeDataFrames(self):
        return pd.merge(self.df, self.df2, how = 'left', on= 'column1')

df3 = CreateOpFile(path).MergeDataFrames()

By the way, according to the naming conventions in PEP 8, method names should be lowercase, with words separated by underscores. merge_data_frames() would therefore be a better name.
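
For reference, here is a minimal sketch of the same idea with the PEP 8 style method name. To keep it runnable without an Excel file, it assumes the constructor takes two DataFrames directly instead of a path; that change, the sample frames, and their columns `a` and `b` are made up for illustration and are not part of the original code.

```python
import pandas as pd


class CreateOpFile:
    """Holds two DataFrames and merges them on a shared column."""

    def __init__(self, df, df2):
        # Assumption: frames are passed in directly. In the original they
        # come from pd.read_excel(path, 'sheet1') and pd.read_excel(path, 'sheet2').
        self.df = df
        self.df2 = df2

    def merge_data_frames(self):
        # Return the merged frame so the caller can bind it to a name.
        return self.df.merge(self.df2, how="left", on="column1")


# Hypothetical sample data standing in for the two sheets.
left = pd.DataFrame({"column1": [1, 2, 3], "a": ["x", "y", "z"]})
right = pd.DataFrame({"column1": [1, 2], "b": [10.0, 20.0]})

df3 = CreateOpFile(left, right).merge_data_frames()
print(df3)
```

Because this is a left merge, every row of the first frame is kept, and rows without a match in the second frame get NaN in the columns that come from it.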

Upvotes: 2
