Udit Hari Vashisht

Reputation: 549

Read a CSV file directly from a website in Python 3

I am trying to read a CSV file directly from a website (from a downloadable link) and then fetch one of its columns as a list so that I can work with it further. I have not been able to get the code right. The closest I have come is:

import csv
import urllib.request as urllib
import urllib.request as urlRequest
import urllib.parse as urlParse

url = "https://www.nseindia.com/content/indices/ind_nifty50list.csv"
# pretend to be a chrome 47 browser on a windows 10 machine
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"}
req = urlRequest.Request(url, headers = headers)
# open the url 
x = urlRequest.urlopen(req)
sourceCode = x.read()

Upvotes: 0

Views: 1522

Answers (1)

Oleh Rybalchenko

Reputation: 8059

You are pretty close to the goal.

Split the downloaded CSV data into lines and pass them to csv.DictReader():

import csv
import urllib.request as urlRequest

url = "https://www.nseindia.com/content/indices/ind_nifty50list.csv"
# pretend to be a Chrome 47 browser on a Windows 10 machine
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"}
req = urlRequest.Request(url, headers=headers)
# open the URL
x = urlRequest.urlopen(req)
# read() returns bytes; decode to str because csv only accepts text in Python 3
sourceCode = x.read().decode('utf-8')

# DictReader takes any iterable of text lines and uses the first line as the header
cr = csv.DictReader(sourceCode.splitlines())
# collect the 'Series' column as a list
l = [row['Series'] for row in cr]

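Note that x.read() returns bytes, not str, and in Python 3 the csv module only accepts text. The response therefore has to be decoded before it is split into lines, whether or not the file contains non-ASCII symbols:

 x.read().decode('utf-8') # or whichever encoding the file uses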
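
Putting the pieces above together, here is a minimal sketch of the same idea wrapped in functions. The names column_from_csv_bytes and fetch_csv_column are hypothetical helpers, not part of any library, and the sample data is made up so the parsing step can be checked without a network call. It assumes the file is UTF-8 encoded and has a header row.

```python
import csv
import io
import urllib.request


def column_from_csv_bytes(raw, column, encoding="utf-8"):
    """Decode raw CSV bytes and return the values of one named column."""
    # csv needs text in Python 3; newline="" leaves line endings for csv to handle
    text = io.StringIO(raw.decode(encoding), newline="")
    return [row[column] for row in csv.DictReader(text)]


def fetch_csv_column(url, column, user_agent="Mozilla/5.0", encoding="utf-8"):
    """Download a CSV file and return one column as a list (hypothetical helper)."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req) as resp:
        return column_from_csv_bytes(resp.read(), column, encoding)


# Offline check with made-up sample data (not the real NSE file)
sample = b"Company Name,Series\r\nAlpha Ltd,EQ\r\nBeta Ltd,EQ\r\n"
print(column_from_csv_bytes(sample, "Series"))
```

For the file in the question, the call would presumably look like fetch_csv_column(url, "Series", user_agent=headers["User-Agent"]), assuming the server still serves the file at that URL and accepts that User-Agent.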

Upvotes: 2
