ValueError: unsupported format character 'a' (0x61) at index 79

Question

I am trying to scrape data from a website using Beautiful Soup 4 and Python. Here is my code:

from bs4 import BeautifulSoup
import urllib2
i = 0
for i in xrange(0,38):
    page=urllib2.urlopen("http://www.sfap.org/klsfaprep_search?page={}&type=1&strname=&loc=&op=Lancer%20la%20recherche&form_build_id=form-72a297de309517ed5a2c28af7ed15208&form_id=klsfaprep_search_form" %i) 
    soup = BeautifulSoup(page.read())
    for eachuniversity in soup.findAll('div',{'class':'field-item odd'}):
        print ''.join(eachuniversity.findAll(text=True)).encode('utf-8')
    print ',\n'
i= i+ 1

I think the problem is in the URL I have given and in the increment statement. I am able to scrape page by page, but only when I give the xrange.

falsetru · Accepted Answer

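Reason for the `ValueError`

You're mixing {} formatting with % formatting. The % operator ignores the {} placeholder and instead tries to interpret each %20 in the URL as the start of a conversion specifier: in %20la, 20 is read as a field width, l as a length modifier, and a as the conversion type, which Python 2 does not support.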

>>> '{}%20la' % 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: unsupported format character 'a' (0x61) at index 6
>>> '{}%20la'.format(1)
'1%20la'

I recommend using {} formatting (str.format), because the URL contains several literal % characters that would otherwise each have to be escaped as %%.

page=urllib2.urlopen("http://www.sfap.org/klsfaprep_search?page={}&type=1&strname=&loc=&op=Lancer%20la%20recherche&form_build_id=form-72a297de309517ed5a2c28af7ed15208&form_id=klsfaprep_search_form".format(i))
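
To illustrate the difference, here is a minimal sketch that compares the two styles on a shortened version of the query string. The names `template`, `with_format`, and `with_percent` and the page number 3 are illustrative assumptions, not part of the original code.

```python
# Shortened stand-in for the real URL; it keeps the literal %20 escapes.
template = "page={}&op=Lancer%20la%20recherche"

# str.format only interprets braces, so the %20 escapes pass through untouched.
with_format = template.format(3)

# With the % operator, every literal percent sign has to be doubled (%%).
with_percent = "page=%d&op=Lancer%%20la%%20recherche" % 3

# Both approaches build the same string.
print(with_format)
print(with_percent)
```

Both lines print `page=3&op=Lancer%20la%20recherche`. Doubling every % works, but it is easy to miss one in a long URL, which is why {} formatting is the less error-prone choice here.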

Complete code

You don't need i = 0 and i = i + 1, because for i in xrange(0,38) takes care of assigning i on each iteration.

import urllib2  # Import standard library modules first (PEP 8).

from bs4 import BeautifulSoup

for i in xrange(0,38):
    page = urllib2.urlopen("http://www.sfap.org/klsfaprep_search?page={}&type=1&strname=&loc=&op=Lancer%20la%20recherche&form_build_id=form-72a297de309517ed5a2c28af7ed15208&form_id=klsfaprep_search_form".format(i))
    soup = BeautifulSoup(page.read())
    for eachuniversity in soup.findAll('div',{'class':'field-item odd'}):
        print ''.join(eachuniversity.findAll(text=True)).encode('utf-8')
    print ',\n'
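
As an alternative that avoids hand-written percent escapes altogether, the query string could be built from its parts with the standard library's `urlencode`. This is only a sketch: `BASE_URL` and `build_search_url` are hypothetical names, and it assumes the site accepts `+` for spaces (which `urlencode` emits instead of `%20`).

```python
try:
    from urllib import urlencode        # Python 2
except ImportError:
    from urllib.parse import urlencode  # Python 3

BASE_URL = "http://www.sfap.org/klsfaprep_search"  # hypothetical constant


def build_search_url(page_number):
    """Hypothetical helper: return the search URL for one result page."""
    # A list of pairs keeps the parameters in the same order as the original URL.
    params = [
        ("page", page_number),
        ("type", 1),
        ("strname", ""),
        ("loc", ""),
        ("op", "Lancer la recherche"),  # urlencode escapes the spaces as '+'
        ("form_build_id", "form-72a297de309517ed5a2c28af7ed15208"),
        ("form_id", "klsfaprep_search_form"),
    ]
    return BASE_URL + "?" + urlencode(params)
```

Inside the loop, `build_search_url(i)` would then take the place of the long formatted string passed to `urlopen`.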
