Reputation: 1809
I'm using Python 2.7, and I have urllib3. I'm trying to download each of the .txt files in this link: http://web.mta.info/developers/turnstile.html
Here's my code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup
import requests
import urllib3, shutil
http = urllib3.PoolManager()
MTA_url = requests.get("http://web.mta.info/developers/turnstile.html").text
MTA_soup = BeautifulSoup(MTA_url)
#Find each link to be downloaded
MTA_soup.findAll('a')
#Let's test it with the 36th link
one_a_tag = MTA_soup.findAll("a")[36]
MTA_link = one_a_tag["href"]
download_url = 'http://web.mta.info/developers/'+ MTA_link
print download_url #valid url, will take you to download
This is where I'm stuck. I can't seem to figure out how to download the .txt file at download_url, let alone iterate through the list. I've tried this:
open('/Users/me/Documents/test_output_download.csv', 'wb').write(download_url.content)
But that gives me the error:
AttributeError: 'unicode' object has no attribute 'content'
After reading further, I also tried:
out_file = '/Users/me/Documents/test_output_download.csv'
http.request('GET', download_url, preload_content=False) as res, open(out_file, 'wb') as out_file:
shutil.copyfileobj(res, out_file)
But I can't get past this syntax error:
http.request('GET', download_url, preload_content=False) as res, open(out_file, 'wb') as out_file:
^
SyntaxError: invalid syntax
How can I just download the .txt file that is located at download_url and save it to my local drive, using urllib3?
Upvotes: 1
Views: 236
Reputation: 42
The line fails because it is missing the leading 'with' keyword: 'as' is only valid in an import, a 'with' statement, or an 'except' clause. I tested the full segment of code and was able to download after making a small change here.
Try shifting this around to assign the objects to variables instead, like so:
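res = http.request('GET', download_url, preload_content=False)
out_handle = open(out_file, 'wb')  # separate name, so the out_file path is not overwritten
shutil.copyfileobj(res, out_handle)
out_handle.close()  # flush the data to disk
res.release_conn()  # hand the connection back to the pool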
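If you would rather keep the context-manager style from the question, here is a minimal sketch of the same download wrapped in a function. The names download_to_file and dest_path are made up for illustration; the function expects a urllib3.PoolManager like the http object in the question. It releases the connection in a finally block instead of relying on the response being usable in a 'with' statement, since that may depend on the urllib3 version.

```python
import shutil


def download_to_file(http, url, dest_path):
    """Stream the body at `url` into `dest_path` using a urllib3 PoolManager.

    `download_to_file` and `dest_path` are hypothetical names for this sketch.
    """
    # preload_content=False leaves the body unread so it can be streamed
    # to disk in chunks instead of being loaded into memory first.
    res = http.request('GET', url, preload_content=False)
    try:
        # The 'with' statement closes the file even if the copy fails.
        with open(dest_path, 'wb') as out_handle:
            shutil.copyfileobj(res, out_handle)
    finally:
        # Return the connection to the pool whether or not the copy worked.
        res.release_conn()
    return dest_path
```

With the objects from the question, the call would be download_to_file(http, download_url, out_file). The same function could be called once per link in a loop over MTA_soup.findAll('a'), although I have only tried the single-file case.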
Upvotes: 2