jodh singh

Reputation: 25

Why do I need to specify the size of this list to avoid a "list index out of range" error?

I am trying to parse a list of urls from a webpage. I did the following things:

  1. Got a list of all "a" tags.
  2. Used a for loop to get("href")
  3. While looping, I assigned each value returned by get to a new, initially empty list called links

But I kept getting an index out of range error. I thought it might be because of the way I was incrementing the index of links, but I am sure that is not the case. This is the error-prone code:

import urllib
import bs4
url = "http://tellerprimer.ucdavis.edu/pdf/"
response = urllib.urlopen(url)
webpage = response.read()
soup = bs4.BeautifulSoup(webpage, 'html.parser')
i = 0
links = []

for tags in soup.find_all('a'):
    links[i] = str(tags.get('href'))
    i +=1
print i, links

I gave links a fixed length and it fixed it, like so:

links = [0]*89 #89 is the length of soup.find_all('a')

I want to know what was causing this problem.

Upvotes: 2

Views: 52

Answers (2)

Robby Cornelissen

Reputation: 97140

The list is initially empty, so you're trying to assign values to non-existing index locations in the list.
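
To make the failure concrete, here is a minimal sketch that reproduces it without the web page. The file name "a.pdf" and the names error_message and presized are illustrative assumptions, not part of the original code:

```python
# Index assignment only replaces an existing slot; it never grows the list.
links = []
try:
    links[0] = "a.pdf"          # no slot 0 exists in an empty list
except IndexError as exc:
    error_message = str(exc)

# Pre-sizing creates the slots first, which is why [0]*89 hid the error.
presized = [0] * 3
presized[0] = "a.pdf"           # fine: slot 0 already exists

# append() adds a new slot at the end, so no size is needed up front.
links.append("a.pdf")
```

Pre-sizing works only as long as the hard-coded length matches the number of tags on the page, so it is fragile compared with append().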

Use append() to add items to a list:

links = []

for tags in soup.find_all('a'):
    links.append(str(tags.get('href')))

Or use map() instead:

links = map(lambda tags: str(tags.get('href')), soup.find_all('a'))

Or use a list comprehension:

links = [str(tags.get('href')) for tags in soup.find_all('a')]
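
As a rough check that the three forms are interchangeable, the sketch below runs them on stand-in data. Plain dicts replace the real tags here (an assumption made so the example runs without bs4; dicts also support .get('href')), and the names appended, mapped, comprehended and filtered are hypothetical:

```python
# Hypothetical stand-ins for the tags returned by soup.find_all('a').
tags = [{"href": "a.pdf"}, {"href": "b.pdf"}, {}]

appended = []
for tag in tags:
    appended.append(str(tag.get("href")))

# On Python 3, map() returns an iterator, so list() is needed to get a list;
# on Python 2, map() already returns a list and the extra list() is harmless.
mapped = list(map(lambda tag: str(tag.get("href")), tags))

comprehended = [str(tag.get("href")) for tag in tags]

# A tag without an href yields None, which str() turns into the text 'None'.
# Filtering before collecting avoids that placeholder.
filtered = [tag.get("href") for tag in tags if tag.get("href") is not None]
```

All three build the same list; the last line shows one way to skip anchors that have no href instead of storing the string 'None' for them.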

Upvotes: 1

Andy

Reputation: 50550

You are attempting to assign something to a non-existent index. When you create links, you create it as an empty list.

Then you do links[i], but links is empty, so there is no ith index.

The proper way to do this is:

links.append(str(tags.get('href')))

This also means that you can eliminate your i variable. It's not needed.


links = []
for tags in soup.find_all('a'):
    links.append(str(tags.get('href')))
print links

This will print all 89 links in your links list.

Upvotes: 4
