Samuel Håkansson

Reputation: 53

Function won't loop

I've written a piece of code that works fine with print, but it fails once I turn it into a function and try to return the values instead. Here's the original code:

import requests
from bs4 import BeautifulSoup
import wikipedia

source_code = requests.get('http://en.wikipedia.org/wiki/IBM')
plain_text = source_code.text
plain_text = plain_text[:plain_text.find('id="toc"')]
soup = BeautifulSoup(plain_text)

for div in soup.findAll('a'):
    if div.parent.name == 'p':
        href = div.get('href')
        href = href.replace(',', '')
        href = href.replace('-', ' ')
        href = href.replace('(', '')
        href = href.replace(')', '')
        href = href.replace('_', ' ')

        print (href[6:])
        href = href.replace(' ', '_')
        href = href.replace(' ^ ', '')
        try:
            print(wikipedia.summary(href[6:]))
        except wikipedia.exceptions.DisambiguationError as e:
            print (e.options)

It formats the text and gives me the title and summary of a Wikipedia page, plus the summaries of all the links in the original summary, which is exactly what I want. Unfortunately, this needs to be part of a bigger program, so I turned it into a function (maybe I should do it another way?). It looks like this:

import requests
from bs4 import BeautifulSoup
import wikipedia

source_code = requests.get('http://en.wikipedia.org/wiki/IBM')
plain_text = source_code.text
plain_text = plain_text[:plain_text.find('id="toc"')]
soup = BeautifulSoup(plain_text)

def ELS():
    for div in soup.findAll('a'):
        if div.parent.name == 'p':
            href = div.get('href')
            href = href.replace(',', '')
            href = href.replace('-', ' ')
            href = href.replace('(', '')
            href = href.replace(')', '')
            href = href.replace('_', ' ')

            return href[6:]
            href = href.replace(' ', '_')
            href = href.replace(' ^ ', '')
            try:
                return wikipedia.summary(href[6:])
            except wikipedia.exceptions.DisambiguationError as e:
                return e.options

print (ELS())

But for some reason it doesn't loop: it just prints the first title and then stops. Maybe it's an easy problem and just something I've missed.

Upvotes: 0

Views: 136

Answers (3)

Martijn Pieters

Reputation: 1121754

return immediately exits the function.
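
To see the effect in isolation, here is a minimal sketch with made-up input. The names `first_only` and `collect_all` are hypothetical and only illustrate the difference; they are not part of the question's code:

```python
def first_only(items):
    # return inside the loop: the function exits during the first iteration
    for item in items:
        return item.upper()


def collect_all(items):
    # append inside the loop, return once after the loop has finished
    results = []
    for item in items:
        results.append(item.upper())
    return results


print(first_only(["ibm", "hal", "watson"]))   # IBM
print(collect_all(["ibm", "hal", "watson"]))  # ['IBM', 'HAL', 'WATSON']
```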

Collect the information in a list and return that:

def ELS():
    results = []
    for div in soup.findAll('a'):
        if div.parent.name == 'p':
            href = div.get('href')
            href = href.replace(',', '')
            href = href.replace('-', ' ')
            href = href.replace('(', '')
            href = href.replace(')', '')
            href = href.replace('_', ' ')

            href = href.replace(' ', '_')
            href = href.replace(' ^ ', '')
            try:
                results.append((href[6:], wikipedia.summary(href[6:])))
            except wikipedia.exceptions.DisambiguationError as e:
                results.append((href[6:], e.options))
    return results

You can then loop over the results; each entry is a tuple with the processed href value and the wikipedia.summary() output or the exception e.options attribute. This then lets you further reuse this information in other code.
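
As a rough sketch of how the calling code might consume such a list: the data below is a made-up stand-in for what ELS() would return (no network access), and `format_results` is a hypothetical helper. It assumes a disambiguation entry holds a list of options and a normal entry holds a summary string:

```python
# Made-up stand-in for the list of (title, info) tuples returned by ELS()
results = [
    ("IBM", "IBM is a technology company."),
    ("Mercury", ["Mercury (planet)", "Mercury (element)"]),
]


def format_results(results):
    lines = []
    for title, info in results:
        if isinstance(info, list):
            # DisambiguationError case: info is the list from e.options
            lines.append(title + ": ambiguous, options: " + ", ".join(info))
        else:
            # Normal case: info is the summary string
            lines.append(title + ": " + info)
    return lines


for line in format_results(results):
    print(line)
```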

Upvotes: 1

Yuri Malheiros

Reputation: 1410

You just replaced print with return, and that changes the function's behaviour: a function stops executing as soon as a return statement runs.

Try something like this:

def ELS():
    output = []
    for div in soup.findAll('a'):
        if div.parent.name == 'p':
            href = div.get('href')
            href = href.replace(',', '')
            href = href.replace('-', ' ')
            href = href.replace('(', '')
            href = href.replace(')', '')
            href = href.replace('_', ' ')

            output.append(href[6:])
            href = href.replace(' ', '_')
            href = href.replace(' ^ ', '')
            try:
                output.append(wikipedia.summary(href[6:]))
            except wikipedia.exceptions.DisambiguationError as e:
                # e.options is a list; join it so "\n".join(output) only sees strings
                output.append(", ".join(e.options))

    return "\n".join(output)

Upvotes: 1

ACimander

Reputation: 1979

You're returning from inside the loop, which exits the function after the first iteration. Add your search results to a list or a dict instead, and return it after the loop has finished.
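
A minimal sketch of the dict variant, assuming you want to look summaries up by title afterwards. `collect_summaries` and `fake_lookup` are hypothetical names; `fake_lookup` stands in for wikipedia.summary() so the example runs without network access:

```python
def collect_summaries(titles, lookup):
    """Return a dict mapping each title to whatever lookup(title) returns."""
    summaries = {}
    for title in titles:
        summaries[title] = lookup(title)
    return summaries  # after the loop, not inside it


def fake_lookup(title):
    # Hypothetical stand-in for wikipedia.summary(); no network access
    return "Summary of " + title


summaries = collect_summaries(["IBM", "Watson"], fake_lookup)
for title, text in summaries.items():
    print(title, "->", text)
```

Note that a dict keeps only one entry per title, so a link that appears twice in the page would be stored once.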

Upvotes: 0
