Reputation: 549
I am trying to read a CSV file directly from a website (from a downloadable link) and then fetch one of its columns as a list, so that I can work with it further. I have not been able to get the code right. The closest I have got is:
import csv
import urllib.request as urlRequest
url = "https://www.nseindia.com/content/indices/ind_nifty50list.csv"
# pretend to be a chrome 47 browser on a windows 10 machine
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"}
req = urlRequest.Request(url, headers = headers)
# open the url
x = urlRequest.urlopen(req)
sourceCode = x.read()
Upvotes: 0
Views: 1522
Reputation: 8059
You are pretty close to the goal.
Just split the downloaded CSV data into lines and pass them to csv.DictReader():
import csv
import urllib.request as urlRequest
url = "https://www.nseindia.com/content/indices/ind_nifty50list.csv"
# pretend to be a chrome 47 browser on a windows 10 machine
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"}
req = urlRequest.Request(url, headers = headers)
# open the url
x = urlRequest.urlopen(req)
sourceCode = x.read()
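# x.read() returns bytes; csv needs text, so decode before splitting into lines
cr = csv.DictReader(sourceCode.decode('utf-8').splitlines())
# collect the 'Series' column as a list
l = [row['Series'] for row in cr]
Note that x.read() returns bytes, not str, and in Python 3 the csv module only accepts text, so the decode step is required even for pure-ASCII data. If the file uses a different encoding, pass that encoding to decode() instead of 'utf-8'.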
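If you want to check the parsing step without hitting the site, here is a minimal sketch that works on bytes already in memory. The function name column_from_csv_bytes and the sample data are made up for illustration; only the 'Series' header is taken from the question. It wraps the decoded text in io.StringIO rather than calling splitlines(), which is an alternative that also copes with quoted fields containing line breaks.

```python
import csv
import io


def column_from_csv_bytes(data, column, encoding="utf-8"):
    """Decode raw CSV bytes and return one column as a list of strings.

    Hypothetical helper: the name and signature are illustrative only.
    """
    text = data.decode(encoding)
    # newline="" leaves line endings untouched so the csv module handles them
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return [row[column] for row in reader]


# Made-up sample standing in for the bytes returned by x.read()
sample = b"Company Name,Symbol,Series\r\nExample One,EX1,EQ\r\nExample Two,EX2,EQ\r\n"
series = column_from_csv_bytes(sample, "Series")
print(series)
```

In the code above you would call it as column_from_csv_bytes(sourceCode, 'Series'), passing a different encoding if the file is not UTF-8.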
Upvotes: 2