jcruzer

Reputation: 117

Python Embedded For range Loops

I would like the following code to grab the date from each address in this range, but I can't seem to get it to run more than once. I am using Python 3. As you can see below, the URL for the site is appended with i so that it reads http://zinc.docking.org/substance/10, http://zinc.docking.org/substance/11, and so on. Here is the code:

import bs4 as bs
import urllib.request
site = "http://zinc.docking.org/substance/"
for i in range(10, 16): 
    site1 = str("%s%i" % (site, i))
    sauce = urllib.request.urlopen(site1).read()
    soup = bs.BeautifulSoup(sauce, 'lxml')
    table1 = soup.find("table", attrs={"class": "substance-properties"})
for row in table1.findAll('tr'):
    row1 = row.findAll('td')
Date = row1[0].getText()
print(Date)

This is my output:

$python3 Date.py
November 11th, 2005

The script should, however, give me 3 dates. I feel like there is some sort of simple formatting error, but I am not sure where to begin troubleshooting. When I format it "correctly", this is the code:

import bs4 as bs
import urllib.request
import pandas as pd
import csv
site = "http://zinc.docking.org/substance/"
for i in range(10, 16): 
    site1 = str("%s%i" % (site, i))
    sauce = urllib.request.urlopen(site1).read()
    soup = bs.BeautifulSoup(sauce, 'lxml')
    table1 = soup.find("table", attrs={"class": "substance-properties"})
    table2 = soup.find("table", attrs={"class": "protomers"})
    for row in table1.findAll('tr'):
        row1 = row.findAll('td')
        Date = row1[0].getText()
        print(Date)

The error I get is as follows:

Traceback (most recent call last):
  File "Stack.py", line 11, in <module>
    Date = row1[1].getText()
IndexError: list index out of range

The first version works, so I know that row1[0] does in fact contain a value. Any ideas?

Upvotes: 0

Views: 143

Answers (1)

patrick

Reputation: 4862

You might want to fix your indentation:

import bs4 as bs
import urllib.request
site = "http://zinc.docking.org/substance/"
for i in range(10, 16): 
    site1 = str("%s%i" % (site, i))
    sauce = urllib.request.urlopen(site1).read()
    soup = bs.BeautifulSoup(sauce, 'lxml')
    table1 = soup.find("table", attrs={"class": "substance-properties"})
    for row in table1.findAll('tr'):
        row1 = row.findAll('td')
        Date = row1[0].getText()
        print(Date)

Edit: Consider renaming your Date variable. It is not a reserved word in Python, but by convention (PEP 8) variable names are lowercase, e.g. date.
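
The IndexError in the question is consistent with some `tr` rows having no `td` cells (for example a header row built from `th` cells), in which case `row.findAll('td')` returns an empty list. That is an assumption about the page, which I have not checked. A sketch of a guard, using a made-up HTML snippet and the built-in `html.parser` so that it runs offline (the `first_cells` helper is hypothetical, not part of bs4):

```python
import bs4 as bs

# Made-up HTML for illustration; the real page layout may differ.
html = """
<table class="substance-properties">
  <tr><th>Added</th><th>Mwt</th></tr>
  <tr><td>November 11th, 2005</td><td>146.19</td></tr>
</table>
"""

def first_cells(markup):
    """Return the text of the first <td> of every row that has one."""
    soup = bs.BeautifulSoup(markup, "html.parser")
    table = soup.find("table", attrs={"class": "substance-properties"})
    if table is None:          # page without the expected table
        return []
    values = []
    for row in table.find_all("tr"):
        cells = row.find_all("td")
        if not cells:          # header row or empty row: skip it
            continue
        values.append(cells[0].get_text(strip=True))
    return values

print(first_cells(html))
```

In your loop, the same check (`if not row1: continue`) would go right after `row1 = row.findAll('td')`.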

Upvotes: 1
