Reputation: 4195
I'm trying to take some data from an online CSV file and make a table from it. I use splitlines() to isolate each bit of data but I keep getting a ValueError:
ValueError: invalid literal for int() with base 10: 'Year'
Here is my code:
import csv
import urllib.request
url = "https://raw.github.com/datasets/gdp/master/data/gdp.csv"
webpage = urllib.request.urlopen(url)
datareader = csv.reader(webpage.read().decode('utf-8').splitlines())
dataList = []
NewTable = []
print('done')
for row in datareader:
    ##print(row)
    countryName, countryCode, Year, Value = row
    print(Year)
    Year = int(Year)
    ##Value = float(Value)
    rowTuple = countryName, countryCode, Year, Value
    dataList.append(rowTuple)
When I uncomment "print(Year)" I get a list of integers, all between 1960 and 2012, so I can't figure out why the conversion from string to integer fails.
Any ideas?
Upvotes: 0
Views: 248
Reputation: 1121962
Your first row in the CSV is a header row, not a data row, so int() is being handed the string 'Year':
Country Name,Country Code,Year,Value
Skip it with:
datareader = csv.reader(webpage.read().decode('utf-8').splitlines())
next(datareader, None) # skip the header
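To see the fix in isolation, here is a minimal sketch that runs without a network connection. The header line is the one from the file; the data rows, the SAMPLE constant, and the parse_rows helper are made up for illustration.

```python
import csv

# Hypothetical sample in the same layout as gdp.csv; the data values are invented.
SAMPLE = """Country Name,Country Code,Year,Value
Exampleland,EXL,1960,100.5
Exampleland,EXL,1961,110.25
"""


def parse_rows(text):
    """Parse CSV text into (name, code, year, value) tuples, skipping the header."""
    reader = csv.reader(text.splitlines())
    # Skip the header row; the None default avoids StopIteration on empty input.
    next(reader, None)
    rows = []
    for name, code, year, value in reader:
        # Only data rows reach this point, so the conversions are safe.
        rows.append((name, code, int(year), float(value)))
    return rows


data_list = parse_rows(SAMPLE)
print(data_list)
```

Without the next() call, the first iteration would try int('Year') and raise the ValueError from the question.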
You could also use an io.TextIOWrapper() object to have the webpage decoded from UTF-8 for you:
import io
webpage = urllib.request.urlopen(url)
datareader = csv.reader(io.TextIOWrapper(webpage, 'utf-8'))
next(datareader, None) # skip the header
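The TextIOWrapper approach can be tried offline too. In this sketch an io.BytesIO object stands in for the HTTP response (an assumption made so the example runs anywhere), and the data row is invented.

```python
import csv
import io

# Stand-in for the HTTP response: any binary file-like object works here.
# The data row is invented for illustration.
raw = io.BytesIO(
    b"Country Name,Country Code,Year,Value\n"
    b"Exampleland,EXL,1960,100.5\n"
)

# Decode bytes to text lazily; newline="" is what the csv docs recommend
# for file objects passed to csv.reader.
text_stream = io.TextIOWrapper(raw, encoding="utf-8", newline="")
reader = csv.reader(text_stream)

header = next(reader, None)  # consume the header row
years = [int(row[2]) for row in reader]
print(header, years)
```

Compared with read().decode().splitlines(), this decodes the data as the reader consumes it, so the whole body does not have to be held in memory as one string first.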
Upvotes: 2