CreamStat

Reputation: 2185

Different output in similar code - Webscraping with Python

On this page, I am trying to count all the numbers that appear as links above the table. Check the link to get a better idea. I have two very similar pieces of code, but only the first one gives the expected output. So, what is wrong with my second code?

import urllib2
from bs4 import BeautifulSoup

soup = BeautifulSoup(urllib2.urlopen("http://www.admision.unmsm.edu.pe/admisionsabado/A/011/0.html"))
c=[]
for n in soup.find_all('center'):
    for b in n.find_all('a')[1:]:
        c.append(b.text)

t = len(c) / 2

print t

The result is 41. On the webpage there are 41 numbers that appear as links above the table, so the output is right.

In the code that goes wrong, I define a function whose input is the trailing part of the URL. The code is below:

import urllib2
from bs4 import BeautifulSoup


def record(part):
    soup = BeautifulSoup(urllib2.urlopen("http://www.admision.unmsm.edu.pe/admisionsabado".format(part)))
    c=[]
    for n in soup.find_all('center'):
        for b in n.find_all('a')[1:]:
            c.append(b.text)

    t = len(c)/2
    print t

As you can see, the method for counting the numbers is the same. So, I run the function:

record('/A/011/0.html')

Unfortunately, the output is 0.

Upvotes: 0

Views: 36

Answers (1)

vivekagr

Reputation: 1836

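Inside the function, you call format() on the URL string with the passed-in parameter, but the string has no {} placeholder to put the value in. format() therefore returns the string unchanged, part is silently dropped, and the request goes to the base URL instead of the page you want. Here it is with the placeholder: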

"http://www.admision.unmsm.edu.pe/admisionsabado{}".format(part)
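To see the difference without making any network request, here is a minimal sketch that only builds the two URL strings. The helper names build_url_without_placeholder and build_url_with_placeholder are hypothetical, introduced just for this illustration; the base URL and the part value are the ones from the question.

```python
# Base URL taken from the question.
BASE = "http://www.admision.unmsm.edu.pe/admisionsabado"


def build_url_without_placeholder(part):
    # Hypothetical helper mirroring the question's function:
    # there is no {} in the template, so format() ignores `part`.
    return BASE.format(part)


def build_url_with_placeholder(part):
    # Hypothetical helper mirroring the fix:
    # the {} marks where `part` is inserted.
    return (BASE + "{}").format(part)


part = "/A/011/0.html"
print(build_url_without_placeholder(part))
print(build_url_with_placeholder(part))
```

The first call prints the bare base URL and the second prints the full address of the page. Note that format() does not raise an error when it receives more arguments than the string has placeholders, which is why the original function ran without complaint.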

Upvotes: 1
