kpz

Reputation: 209

Trouble with requests' get in Python

I am trying to automatically download files from a website using a list of URLs that I already have. The relevant part of my code looks like this:

for url in urls:
    if len(url) != 0:
        print url

Running this prints a list of urls as strings - as expected. However, when I add one new line as below:

for url in urls:
    if len(url) != 0:
        print url
        r = requests.get(url)

an error appears: "Invalid URL u'Document Detail': No schema supplied." The loop is supposed to print each URL before requesting it, and without the requests.get line it printed the URLs as expected. With that line added, it prints "Document Detail" instead of a URL and then fails. I'm not sure why this happens or how to resolve it.

Any help would be appreciated!

EDIT: this is how urls is built:

import csv

urls = []
with open('filename.csv', 'rb') as f:
    reader = csv.reader(f)
    for row in reader:
        urls.append(row[34])

Upvotes: 0

Views: 56

Answers (3)

sihrc

Reputation: 2828

With reference to my comment, "Document Detail" is the header of that column in your csv, so it is the first value you read. Skip the header row. Here's one way to do it:

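urls = []
with open('filename.csv', 'rb') as f:
    read = f.readlines()
    # read[0] is the header line, so start from the second line.
    # Note: a plain split(",") breaks on quoted fields that contain commas.
    urls = [row.split(",")[34] for row in read[1:]]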
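
If any field in the file is quoted and contains a comma, a plain split(",") will shift the columns. A minimal sketch of the same idea using csv.reader and skipping the header with next() is below. It is written for Python 3 (the snippets in this thread are Python 2); the helper name read_urls and the in-memory sample data are made up for illustration, and the real file would use column 34 as in the question.

```python
import csv
import io

def read_urls(csv_file, column=34):
    """Return the non-empty values of one column, skipping the header row."""
    reader = csv.reader(csv_file)
    next(reader, None)  # discard the header row (e.g. "Document Detail")
    return [row[column] for row in reader if len(row) > column and row[column]]

# Hypothetical two-column sample; the quoted title contains a comma on purpose.
sample = io.StringIO(
    'Title,Document Detail\n'
    '"Report, final",http://example.com/a.pdf\n'
)
urls = read_urls(sample, column=1)
print(urls)
```

With a real file you would pass an open file object instead, e.g. open('filename.csv', newline='') in Python 3.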

Upvotes: 2

valignatev

Reputation: 6316

You should convert the URL to a string explicitly:

for url in urls:
    if len(url) != 0:
        print str(url)
        r = requests.get(str(url))

Could you also post a sample of your .csv file?
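
Since the error complains about a missing scheme, it may also help to check each value before passing it to requests.get. The following is only a sketch, assuming Python 3 and that valid entries are absolute http or https URLs; has_http_scheme and the candidates list are hypothetical names used for illustration, and the actual download call is left out.

```python
from urllib.parse import urlparse

def has_http_scheme(value):
    """Return True if value looks like an absolute http(s) URL."""
    parts = urlparse(value.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)

# Made-up values: a header cell, an empty cell, and a plausible URL.
candidates = ["Document Detail", "", "http://example.com/file.pdf"]
for value in candidates:
    if has_http_scheme(value):
        print("would download:", value)  # requests.get(value) would go here
    else:
        print("skipping:", repr(value))
```

Printing repr(value) for the skipped entries shows exactly what was read from the file, including stray whitespace.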

Upvotes: 0

Medhat Gayed

Reputation: 2813

It is possible that the layout of your csv file has changed and the URL is no longer at index 34, i.e. in the 35th column (list indices are zero-based, so row[34] is column 34 + 1).
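
One way to guard against a changed layout is to look the column up by its header name instead of hard-coding the index. This is a rough sketch for Python 3; it assumes the URL column's header is "Document Detail" (taken from the error message in the question), and read_column_by_name and the sample data are hypothetical.

```python
import csv
import io

def read_column_by_name(csv_file, header_name):
    """Return the non-empty values of the column whose header is header_name."""
    reader = csv.reader(csv_file)
    header = next(reader)              # first row holds the column names
    index = header.index(header_name)  # raises ValueError if the header is missing
    return [row[index] for row in reader if len(row) > index and row[index]]

# Made-up sample: the second data row has an empty URL cell and is skipped.
sample = io.StringIO(
    "Id,Document Detail\n"
    "1,http://example.com/a.pdf\n"
    "2,\n"
)
urls_by_name = read_column_by_name(sample, "Document Detail")
print(urls_by_name)
```

If the header is renamed or removed, header.index fails loudly with a ValueError rather than silently reading the wrong column.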

Upvotes: 0
