Reputation: 634
Simple question, but I've searched to no avail. Say I have a file "funcs.py" containing a function I want to call from my current script. The function uses another library (e.g. pandas). Where do I import that library? What's the convention?
Do I put it inside the function in funcs.py?
#funcs.py
def make_df():
    import pandas as pd
    return pd.DataFrame(index=[1,2,3],data=[1,2,3])
Do I put it outside the function in funcs.py?
#funcs.py
import pandas as pd
def make_df():
    return pd.DataFrame(index=[1,2,3],data=[1,2,3])
Or do I put it in the current script I'm using?
#main.py
import pandas as pd
from funcs import make_df
df = make_df()
Thanks and kind regards.
Upvotes: 6
Views: 7051
Reputation: 599
In Python each file is a module. Each module has its own namespace - its own set of variables. Each function also has its own local namespace.
When you use the name pd in a function defined in the module funcs, Python first looks for a local variable pd in the function. If it doesn't exist, it looks for it in the namespace of the function's module, funcs. It will not look for it in the module main, even if the code calling the function is in main.py.
This is known as lexical scoping: variables are looked up relative to where the code is defined, not where it is called. Some languages look up variables relative to where the code is called, which is known as dynamic scoping. In one of those languages something like your solution #3 would work, but most languages, including Python, follow lexical scoping rules, so it won't work.
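To see the lookup rule in action without creating files, here is a minimal sketch that builds a stand-in for funcs.py in memory with types.ModuleType. The names helper and get_name are hypothetical and only serve the illustration; in a real project funcs would be an ordinary file on disk.

```python
import types

# Hypothetical in-memory stand-in for funcs.py.
funcs = types.ModuleType("funcs")
exec(
    "def get_name():\n"
    "    return helper  # resolved in the namespace of funcs\n",
    funcs.__dict__,
)

# This name exists only in the calling module's namespace.
helper = "defined in main"

try:
    result_before = funcs.get_name()
except NameError as exc:
    # The function never consults the caller's globals.
    result_before = f"NameError: {exc}"

# Once the name exists in funcs itself, the lookup succeeds.
funcs.helper = "defined in funcs"
result_after = funcs.get_name()
```

The first call fails even though helper is defined in the calling module, which is the same reason option #3 fails with pd.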
So pandas has to be imported in funcs.py. main.py doesn't have to import or even know anything about pandas to use make_df.
If you import pandas at the top of funcs.py, then when you import the module funcs from main.py, the line import pandas as pd at the top of funcs.py will be executed, the pandas module will be loaded, and a reference to it will be created in funcs, bound to the name pd. There is no need to re-import it in main.py.
If you do re-import pandas in main.py, Python is smart enough not to reload the entire module just because you imported it in two places. Loaded modules are cached in sys.modules, so the second import just gives you a reference to the already loaded pandas module.
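The caching behaviour can be checked directly. This sketch uses the standard-library json module in place of pandas so that it runs without any third-party install; the alias json_again is a hypothetical name for the illustration.

```python
import sys

import json
import json as json_again  # second import of the same module

# Both names refer to the single module object cached in sys.modules.
same_object = json is json_again
cached = sys.modules["json"] is json
```

Both checks evaluate to True: the second import statement only binds another name to the cached module object.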
Putting the import in the body of the function will work, but it's not considered good practice unless you have a really good reason to do so. Normally imports go at the top of the file where they are used.
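For completeness, this is roughly what a function-level import looks like when there is a reason to defer it. The sketch assumes a hypothetical function mean_of and uses the standard-library statistics module as a stand-in for a slow third-party library.

```python
import sys


def mean_of(values):
    # Deferred import: the module is loaded the first time this function
    # runs, and later calls reuse the copy cached in sys.modules.
    import statistics

    return statistics.mean(values)


average = mean_of([1, 2, 3])
loaded = "statistics" in sys.modules
```

The cost of the import is paid on the first call instead of when the file is imported, which only matters if the import is expensive and the function is rarely used.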
Upvotes: 7
Reputation: 6109
#3 wouldn't work. In most cases, #2 is the preferred option (the main exception would be a large, slow-to-import library that's only used by that function). You might also want to consider one of these options (for optional dependencies):
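#funcs.py
try:
    import pandas as pd
except ImportError:
    # pandas is optional: importing funcs still succeeds without it,
    # but calling make_df() will then raise NameError
    pass
def make_df():
    return pd.DataFrame(index=[1,2,3],data=[1,2,3])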
or
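#funcs.py
try:
    import pandas as pd
except ImportError:
    # bind pd so the check below does not raise NameError
    pd = None
if pd is not None:
    # make_df is only defined when pandas is available
    def make_df():
        return pd.DataFrame(index=[1,2,3],data=[1,2,3])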
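A third variant, sketched here as one possibility and not as something either option above prescribes, always defines make_df but raises a descriptive error when pandas is missing. The choice of RuntimeError and the wording of the message are assumptions for the illustration.

```python
# funcs.py (sketch)
try:
    import pandas as pd
except ImportError:
    pd = None  # remember that the optional dependency is unavailable


def make_df():
    if pd is None:
        # Fail at call time with a message that names the missing package.
        raise RuntimeError("make_df() requires pandas, which is not installed")
    return pd.DataFrame(index=[1, 2, 3], data=[1, 2, 3])
```

Callers can then always import make_df and only hit the error if they actually call it without pandas installed.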
Upvotes: 2
Reputation: 5955
Straight from the source at https://www.python.org/dev/peps/pep-0008/?#imports
Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants.
Best practice is to do all imports in the first few lines of your script file, before any other code.
Upvotes: 0
Reputation: 3465
The best practice is to import pandas at the beginning of your code in funcs.py.
There is no need to import pandas in your main.py, and you should not.
Basically, when you use import pandas, the pandas library becomes available to the code in that file.
Upvotes: 0