Reputation: 669
Python beginner here. Let's say I have three methods for scraping websites. Let's call them scrape_site_a, scrape_site_b, and scrape_site_c. I want to run each of these, but I'd like to define them in such a way that I can call them dynamically without calling each by name. Ideally I'd like to just load all modules in a directory and call the same method on each of them. My attempt so far is the following:
site_a.py
def scrape():
    # scrape the site
    pass
site_b.py
def scrape():
    # scrape the site
    pass
site_c.py
def scrape():
    # scrape the site
    pass
I have the __init__.py set up such that I can do the following:
scrape.py
from sites import *
site_a.scrape()
site_b.scrape()
site_c.scrape()
I would like to do something like:
for site in sites:
    site.scrape()
I realize that there is a fundamental programming concept I'm not understanding here and I have two questions:
Upvotes: 0
Views: 84
Reputation: 1161
You'll want to use the inspect module for stuff like this.
import inspect
modules = [mod for mod in globals() if inspect.ismodule(eval(mod))]
This gives you the name of everything that's a module in your namespace. You should be able to see how to modify this to be more specific, if you want. The trick is running eval to turn a name string into a reference to the object it names, which may be a module.
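If you'd rather skip eval, here is a minimal sketch of the same idea that reads the objects straight out of the namespace dict and returns the modules themselves rather than their names. The helper name modules_in is made up for illustration, and json and os are imported only so there is something to find.

```python
import inspect
import json
import os


def modules_in(namespace):
    """Return {name: module} for every module object bound in `namespace`."""
    # Snapshot the items first so the dict cannot change while we iterate.
    return {
        name: obj
        for name, obj in list(namespace.items())
        if inspect.ismodule(obj)
    }


found = modules_in(globals())

# Narrow it down to modules that actually expose a callable scrape().
scrapers = [m for m in found.values() if callable(getattr(m, "scrape", None))]
```

From there, `for site in scrapers: site.scrape()` is the loop the question asks for, assuming the site modules were imported into this namespace first.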
Upvotes: 0
Reputation: 184345
The following scans a given directory, loads each .py file in it, and calls the module's scrape method if it exists.
from os import listdir
from os.path import join

scraper_dir = "./scrapers"
for scraper_name in listdir(scraper_dir):
    if scraper_name.endswith(".py"):
        with open(join(scraper_dir, scraper_name)) as scraper_file:
            scraper_globals = {}  # this will hold the scraper's globals
            # exec returns None; the file's definitions land in scraper_globals
            exec(scraper_file.read(), scraper_globals)
        scrape_method = scraper_globals.get("scrape")
        if callable(scrape_method):  # we have a scrape method
            scrape_method()  # call it
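An alternative to exec-ing the source text is to let importlib build a real module object from each file, so things like `__file__` and `__name__` are set inside the scraper. This is only a sketch: load_module and run_scrapers are names I made up, and the temporary files stand in for your real scrapers.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_module(path):
    """Import the file at `path` as a standalone module object."""
    spec = importlib.util.spec_from_file_location(path.stem, str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    return module


def run_scrapers(scraper_dir):
    """Call scrape() in every .py file of `scraper_dir`; return {name: result}."""
    results = {}
    for path in sorted(Path(scraper_dir).glob("*.py")):
        module = load_module(path)
        scrape = getattr(module, "scrape", None)
        if callable(scrape):  # skip files without a scrape function
            results[path.stem] = scrape()
    return results


# Demo with throwaway files standing in for the real scrapers.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "site_a.py").write_text("def scrape():\n    return 'a scraped'\n")
    Path(tmp, "site_b.py").write_text("def scrape():\n    return 'b scraped'\n")
    Path(tmp, "helpers.py").write_text("VALUE = 1\n")  # no scrape(): skipped
    demo_results = run_scrapers(tmp)

print(demo_results)  # {'site_a': 'a scraped', 'site_b': 'b scraped'}
```

Either way you are executing whatever code is in that directory, so only point it at files you trust.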
Upvotes: 1
Reputation: 114038
from sites import site_a,site_b,site_c
sites = [site_a,site_b,site_c]
for site in sites:
    site.scrape()
I guess this might be what you are asking for:
from sites import *
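# Iterate over a copy: the loop variables are themselves new globals,
# and adding them while iterating globals() directly raises RuntimeError.
for name, obj in list(globals().items()):
    if name.startswith("site_") and hasattr(obj, "scrape"):
        obj.scrape()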
Introspection like this is kinda dicey though ... reader beware.
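A somewhat less dicey way to get the same loop is to ask the package which submodules it contains instead of poking through globals(). The sketch below uses pkgutil.iter_modules and importlib.import_module; iter_scrapers and the throwaway demo_sites package are hypothetical names for illustration, and you would pass "sites" for the layout in the question.

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path


def iter_scrapers(package_name):
    """Yield each submodule of `package_name` that has a callable scrape()."""
    package = importlib.import_module(package_name)
    for info in pkgutil.iter_modules(package.__path__):
        module = importlib.import_module(f"{package_name}.{info.name}")
        if callable(getattr(module, "scrape", None)):
            yield module


# Demo: build a throwaway package so the sketch runs on its own.
with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp, "demo_sites")
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "site_a.py").write_text("def scrape():\n    return 'a'\n")
    (pkg / "site_b.py").write_text("def scrape():\n    return 'b'\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()  # make sure the new files are noticed
    try:
        scraped = sorted(site.scrape() for site in iter_scrapers("demo_sites"))
    finally:
        sys.path.remove(tmp)

print(scraped)  # ['a', 'b']
```

This does not depend on what `__init__.py` exports or on a naming prefix like site_, only on each module defining scrape().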
Upvotes: 0