Reputation: 130
I am new to Python and I am trying to pass an argument (a dataframe) to a function and change the value of that argument by reading an Excel file. (Assume that I have imported all the necessary modules.)
I have noticed that Python does not pass the argument by reference here, so the dataframe never ends up initialized or changed.
I read that python passes by object-reference and not by value or reference. However, I do not need to change the same dataframe.
The output is: <class 'pandas.core.frame.DataFrame'>
from pandas import DataFrame as df

class Data:
    x = df

    @staticmethod
    def import_File(df_name , file):
        df_name = pd.io.excel.read_excel(file.replace('"',''), sheetname='Sheet1', header=0, skiprows=None, skip_footer=0, index_col=None, parse_cols=None, parse_dates=True, date_parser=True, na_values=None, thousands=None, convert_float=True, has_index_names=None, converters=None, engine=None )

def inputdata():
    Data.import_File(Data.x,r"C:\Users\Data\try.xlsx")
    print(Data.x)
Upvotes: 1
Reputation: 161
You seem to be doing a lot of things the hard way. I'll try to simplify it while conforming to standard patterns of use.
# Whatever imports you need
import pandas as pd

# Static variables and methods should generally be avoided.
# Change class and variable names to whatever is more suitable.
# Names should be meaningful when possible.
class MyData:
    # Load data in constructor. Could easily do this in another method.
    def __init__(self, filename):
        # Current pandas spells this keyword sheet_name (older releases used sheetname).
        self.data = pd.io.excel.read_excel(filename, sheet_name='Sheet1')

def inputData():
    # In my experience, forward slashes work just fine on Windows.
    # Create new MyData object using constructor
    x = MyData('C:/Users/Data/try.xlsx')
    # Access member variable from object
    print(x.data)
Here's the version where it loads in a method rather than the constructor.
import pandas as pd

class MyData:
    # Constructor
    def __init__(self):
        # Whatever setup you need
        self.data = None
        self.loaded = False

    # Method with optional argument
    def loadFile(self, filename, sheetname='Sheet1'):
        self.data = pd.io.excel.read_excel(filename, sheet_name=sheetname)
        self.loaded = True

def inputData():
    x = MyData()
    x.loadFile('C:/Users/Data/try.xlsx')
    print(x.data)
    # load some other data, using sheetname 'Sheet2' instead of default
    y = MyData()
    y.loadFile('C:/Users/Data/tryagain.xlsx', 'Sheet2')
    # can also pass arguments by name in any order like this:
    # y.loadFile(sheetname='Sheet2', filename='C:/Users/Data/tryagain.xlsx')
    print(y.data)
    # x and y both still exist with different data.
    # calling x.loadFile() again with a different path will overwrite its data.
Your original code doesn't save the result because assigning to a parameter name inside a function only rebinds that local name; it never changes the variable the caller passed in. What you can do instead is modify the object that was passed, something like this:
# Continuing from the last code block
def loadDefault(data):
    data.loadFile('C:/Users/Data/try.xlsx')

def testReference():
    x = MyData()
    loadDefault(x)
    # x.data now has been loaded
    print(x.data)

# Another example
def setIndex0(variable, value):
    variable[0] = value

def testSetIndex0():
    v = ['hello', 'world']
    setIndex0(v, 'Good morning')
    # v[0] now equals 'Good morning'
    print(v[0])
But you can't do this:
def setString(variable, value):
    # The only thing this changes is the value of variable inside this function.
    variable = value

def testSetString():
    v = 'Start'
    setString(v, 'Finish')
    # v is still 'Start'
    print(v)
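When the caller really does need its name to end up pointing at a new object, the usual pattern is to return the new object and assign it at the call site. A minimal sketch of that idea; the function name finished is made up for illustration and is not part of the original code:

```python
def finished(value):
    # Build and return a new string instead of assigning to the parameter.
    return value.replace('Start', 'Finish')

v = 'Start'
# The rebinding happens here, in the caller's scope, so it sticks.
v = finished(v)
print(v)
```

The same shape works for a dataframe: have the function return what it read and assign the result to the name you want.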
If you want to be able to specify the location to store a value using a name, you could use a data structure with indexes/keys. Dictionaries let you access and store values using a key.
import pandas as pd

class MyData:
    # Constructor
    def __init__(self):
        # make data a dictionary
        self.data = {}

    # Method with optional argument
    def loadFile(self, storename, filename, sheetname='Sheet1'):
        self.data[storename] = pd.io.excel.read_excel(filename, sheet_name=sheetname)

    # Access method
    def getData(self, name):
        return self.data[name]

def inputData():
    x = MyData()
    x.loadFile('name1', 'C:/Users/Data/try.xlsx')
    x.loadFile('name2', 'C:/Users/Data/tryagain.xlsx', 'Sheet2')
    # access Sheet1
    print(x.getData('name1'))
    # access Sheet2
    print(x.getData('name2'))
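Note that getData as written raises KeyError if nothing was stored under the given name. One possible way to handle that is sketched below, with plain lists standing in for dataframes so it runs without an Excel file. The class name Store and its put/get methods are hypothetical, not part of the code above:

```python
class Store:
    def __init__(self):
        # Maps a storage name to whatever was loaded under it.
        self.data = {}

    def put(self, name, value):
        self.data[name] = value

    def get(self, name, default=None):
        # dict.get returns the default instead of raising KeyError.
        return self.data.get(name, default)

s = Store()
s.put('name1', [1, 2, 3])
print(s.get('name1'))
print(s.get('missing'))
```

Whether a silent default or an exception is better depends on how you want a typo in the name to surface.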
If you really want the function to be static, then you don't need to make a new class at all. The main reason for creating a class is to use it as a reusable structure to hold data with methods specific to that data.
import pandas as pd

# wrap read_excel to make it easier to use
def loadFile(filename, sheetname='Sheet1'):
    return pd.io.excel.read_excel(filename, sheet_name=sheetname)

def inputData():
    x = loadFile('C:/Users/Data/try.xlsx')
    print(x)
    # the above is exactly the same as
    x = pd.io.excel.read_excel('C:/Users/Data/try.xlsx', sheet_name='Sheet1')
    print(x)
Upvotes: 5
Reputation: 5324
In your code, df is a class object, not an instance. To create an empty data frame you need to instantiate it, and instantiating classes in Python uses function-call notation: df(). Also, we don't need to pass the default parameters when we read the excel file, which makes the code look cleaner.
import pandas as pd
from pandas import DataFrame as df

class Data:
    x = df()

    @staticmethod
    def import_File(df_name, file):
        df_name = pd.io.excel.read_excel(file.replace('"',''), sheet_name='Sheet1')
When you pass Data.x to import_File(), df_name will refer to the same object as Data.x, which in this case is an empty dataframe. However, when you assign the result of pd.io.excel.read_excel(file) to df_name, the connection between df_name and the empty dataframe is broken, and df_name now refers to the excel dataframe. Data.x has undergone no change during this process, so it still refers to the empty dataframe object.
A simpler way to see this with strings:
x = 'red'
df_name = x
We can break the connection between df_name and the string object 'red' and form a new one with the object 'excel'; x is unaffected.
df_name = 'excel'
print(x)  # prints: red
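The "same object" idea can be checked directly with the is operator. The sketch below uses a list as a stand-in for the dataframe, and the helper names rebind and mutate are invented for illustration:

```python
def rebind(name):
    # Assignment points the local name at a new object;
    # the object the caller passed in is untouched.
    name = ['excel']
    return name

def mutate(name):
    # An in-place change goes through the shared reference,
    # so the caller sees it.
    name.append('excel')
    return name

x = ['red']
print(rebind(x) is x)   # False: a different object came back
print(x)                # ['red']
print(mutate(x) is x)   # True: still the same object
print(x)                # ['red', 'excel']
```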
However, there's a simple fix that makes Data.x hold the excel dataframe: assign to Data.x directly inside the method.
import pandas as pd
from pandas import DataFrame as df

class Data:
    x = df()

    @staticmethod
    def import_File(file):
        Data.x = pd.io.excel.read_excel(file.replace('"',''), sheet_name='Sheet1')

def inputdata():
    Data.import_File(r"C:\Users\Data\try.xlsx")
    print(Data.x)
However, I don't recommend using staticmethods, and you should include a constructor in your class as the other answer has recommended.
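One reason to prefer a constructor: a class attribute such as Data.x is a single slot shared by everything that uses the class, so each load overwrites the previous one, whereas instance attributes are kept per object. A small sketch with strings standing in for dataframes; the class names SharedData and OwnData are made up for this example:

```python
class SharedData:
    x = None  # class attribute: one slot for the whole program

    @staticmethod
    def load(value):
        SharedData.x = value

class OwnData:
    def __init__(self, value):
        self.x = value  # instance attribute: one slot per object

SharedData.load('first file')
SharedData.load('second file')
print(SharedData.x)   # second file (the first value is gone)

a = OwnData('first file')
b = OwnData('second file')
print(a.x, b.x)       # first file second file
```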
Upvotes: 3