Reputation: 5
I don't know if the title is very accurate.
I have five functions that each scrape a different website. Each function looks something like this:
    def getWebsiteData1(last_article):
        ty = datetime.today()
        ty_str = ty.strftime('%d.%m.%Y')
        url = 'http://www.website.com/news'
        r = requests.get(url)
        html = r.text
        soup = BeautifulSoup(html, 'html.parser')
        articles = soup.findAll("div", {"class": "text"})[:15]
        data = list()
        for article in articles:
            article_data = dict()
            if article.find("a").get('href') == last_article:
                return data
            else:
                article_data["link"] = article.find("a").get('href')
                article_data["title"] = article.find("a").get_text()
                data.append(article_data)
        return data
So each function returns a list of dictionaries.
I have another function that calls this function:
    def CreateArticle(website_number, slug):
        website = Website.objects.get(slug=slug)
        last_article = website.last_article
        data = getWebsiteData1(last_article)  # here I want to do something like
        data = website_number(last_article)  # but of course this doesn't work
        if len(data) == 0:
            return "No news"
        else:
            for i in data:
                article = Article(service=service)
                article.title = i['title']
                article.url = i['link']
                article.code = i['link']
                article.save()
            service.last_article = data[0]['link']
            service.save(update_fields=['last_article'])
            return data[0]['link']
I want to be able to call CreateArticle(website_number) and tell it which getWebsiteData
function it should call, so that I need only one CreateArticle
function instead of a separate CreateArticle function for each web scraper function.
I hope my question is clear :D
Upvotes: 0
Views: 50
Reputation: 9116
In Python, functions are first-class objects and can be passed as arguments to other functions.
    def a():
        print("x")

    def b(some_function):
        some_function()
Then
    b(a)
will print "x", because a is called within b.
So you can decide which function you want to use, then pass it in as an argument to be called.
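Applied to the question, this might look roughly like the sketch below. It is only an illustration under some assumptions: the two scraper functions are hypothetical stand-ins that return fixed data instead of making requests, the Django model code is left out, and the names `SCRAPERS`, `create_article`, `site-one` and `site-two` are made up for the example. The point is that the scraper is passed as a function object (no parentheses) and called inside the one shared function.

```python
# Hypothetical stand-ins for the real scrapers. Each takes the last seen
# link and returns a list of {"link": ..., "title": ...} dictionaries.
def get_website_data_1(last_article):
    return [{"link": "/a1", "title": "First site article"}]


def get_website_data_2(last_article):
    return [{"link": "/b1", "title": "Second site article"}]


# Optional lookup table: the values are the function objects themselves.
# Nothing is called here because there are no parentheses.
SCRAPERS = {
    "site-one": get_website_data_1,
    "site-two": get_website_data_2,
}


def create_article(scraper, last_article):
    # `scraper` is whichever function was passed in; call it like any function.
    data = scraper(last_article)
    if not data:
        return "No news"
    # The real version would save Article objects here (omitted in this sketch).
    return data[0]["link"]


# Pass the function directly...
latest_one = create_article(get_website_data_1, None)
# ...or look it up by a key such as the website's slug.
latest_two = create_article(SCRAPERS["site-two"], None)
```

A dictionary like this can be handy if the choice of scraper comes from stored data such as a slug, since a string cannot be called but can be used to look up the function.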
Upvotes: 2