Web scraping python: IndexError: list index out of range

Question

The script reads a single URL from a text file, imports information from that web page, and stores it in a CSV file. It works fine for a single URL. Problem: I have added several URLs to my text file, one per line, and now I want the script to read the first URL, do the desired operation, then go back to the text file to read the second URL and repeat. Once I added the for loop to get this done, I started facing the error below:

Traceback (most recent call last):
  File "C:\Users\T947610\Desktop\hahah.py", line 22, in <module>
    table = soup.findAll("table", {"class":"display"})[0] #Facing error in this statement
IndexError: list index out of range

f = open("URL.txt", 'r')
for line in f.readlines():
    print (line)
    page = requests.get(line)
    print(page.status_code)
    print(page.content)
    soup = BeautifulSoup(page.text, 'html.parser')
    print("soup command worked")
    table = soup.findAll("table", {"class":"display"})[0] #Facing error in this statement
    rows = table.findAll("tr")

Sheng Zhuang · Accepted Answer

If the script worked with a single URL, the new input lines from the .txt file may be the problem. Try applying .strip() to each line; a line read with readlines() normally ends with a newline character and may carry other whitespace at its head or tail.

page = requests.get(line.strip())
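
As a minimal sketch of the same idea, the lines can be cleaned before any request is made. The helper name clean_urls and the sample URLs are hypothetical and only for illustration; blank lines, such as a trailing empty line at the end of the file, are skipped as well.

```python
def clean_urls(lines):
    """Strip surrounding whitespace from each line and drop blank lines."""
    return [line.strip() for line in lines if line.strip()]


# Hypothetical sample lines, shaped like the output of f.readlines().
raw_lines = ["https://example.com/a\n", "\n", "  https://example.com/b  \n"]
urls = clean_urls(raw_lines)

for url in urls:
    # repr() makes any leftover whitespace visible when debugging.
    print(repr(url))
```

In the script from the question, the result of clean_urls(f.readlines()) could then be looped over in place of the raw lines.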

Also, if soup.findAll() finds nothing, it returns an empty list, so indexing it with [0] raises IndexError: list index out of range. Try printing the soup and checking whether the page actually contains the table.
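
A hedged sketch of guarding against a page with no matching table: find_all (the newer spelling of findAll) returns a list-like result that is empty when nothing matches, so it can be checked before indexing. The function name first_display_table and the two HTML strings are made up for this example and are not taken from the asker's pages.

```python
from bs4 import BeautifulSoup


def first_display_table(html):
    """Return the first <table class="display"> in html, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    tables = soup.find_all("table", {"class": "display"})
    if not tables:
        # Empty result: tables[0] here would raise IndexError.
        return None
    return tables[0]


# Hypothetical sample pages, used only for illustration.
with_table = '<table class="display"><tr><td>1</td></tr></table>'
without_table = "<p>no table here</p>"

for html in (with_table, without_table):
    table = first_display_table(html)
    if table is None:
        print("no matching table, skipping this page")
    else:
        print("rows found:", len(table.find_all("tr")))
```

Inside the loop over URLs, a None result can be handled with continue so that one page without the table does not stop the whole run.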
