DonnellyOverflow

Reputation: 4195

ValueError converting string to int after splitlines()

I'm trying to take some data from an online CSV file and make a table from it. I use splitlines() to isolate each bit of data but I keep getting a ValueError:

ValueError: invalid literal for int() with base 10: 'Year'

Here is my code:

import csv
import urllib.request

url = "https://raw.github.com/datasets/gdp/master/data/gdp.csv"
webpage = urllib.request.urlopen(url)
datareader = csv.reader(webpage.read().decode('utf-8').splitlines())
dataList = []
NewTable = []
print('done')
for row in datareader:
    ##print(row)
    countryName, countryCode, Year, Value= row
    print(Year)
    Year = int(Year)
    ##Value = float(Value)
    rowTuple = countryName, countryCode, Year, Value
    dataList.append(rowTuple)

When I uncomment print(Year), the output looks like a list of integers, all between 1960 and 2012, so I can't figure out why the conversion from string to integer is rejected.

Any ideas?

Upvotes: 0

Views: 248

Answers (1)

Martijn Pieters

Reputation: 1121962

Your first row in the CSV is a header row, not a data row:

Country Name,Country Code,Year,Value

Skip it with:

datareader = csv.reader(webpage.read().decode('utf-8').splitlines())
next(datareader, None)  # skip the header
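
As a minimal offline sketch of the same idea, the snippet below swaps the downloaded text for a small hard-coded sample. Only the header line comes from the real file; the country name, code, and values in the data rows are invented for illustration. It suggests that once the header has been consumed, int(Year) succeeds on every remaining row:

```python
import csv

# Hypothetical sample standing in for the downloaded file. Only the header
# line matches the real CSV; the data rows are made up for illustration.
sample_text = (
    "Country Name,Country Code,Year,Value\n"
    "Exampleland,EXL,1960,123.5\n"
    "Exampleland,EXL,1961,130.0\n"
)

datareader = csv.reader(sample_text.splitlines())
header = next(datareader, None)  # consume the header row; None if input is empty

dataList = []
for countryName, countryCode, Year, Value in datareader:
    # Only data rows remain, so the numeric conversions no longer fail.
    dataList.append((countryName, countryCode, int(Year), float(Value)))

print(header)
print(dataList)
```

Passing None as the second argument to next() means an empty file yields None rather than raising StopIteration.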

You could also wrap the response in an io.TextIOWrapper() object so that it is decoded from UTF-8 for you, instead of reading and splitting the whole body yourself:

import io

webpage = urllib.request.urlopen(url)
datareader = csv.reader(io.TextIOWrapper(webpage, 'utf-8'))
next(datareader, None)  # skip the header
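
To try the TextIOWrapper approach without a network connection, here is a rough sketch that uses an io.BytesIO object as a stand-in for the HTTP response. The bytes, including the "Exampleland" row, are invented for illustration, and newline="" is an addition that follows the csv module's recommendation for file objects:

```python
import csv
import io

# Hypothetical bytes standing in for the HTTP response body; the data row
# is made up for illustration.
fake_response = io.BytesIO(
    b"Country Name,Country Code,Year,Value\r\n"
    b"Exampleland,EXL,1960,123.5\r\n"
)

# newline="" leaves line-ending handling to the csv module.
text_stream = io.TextIOWrapper(fake_response, encoding="utf-8", newline="")
datareader = csv.reader(text_stream)
next(datareader, None)  # skip the header

rows = [
    (name, code, int(year), float(value))
    for name, code, year, value in datareader
]
print(rows)
```

The wrapper decodes the bytes lazily as csv.reader pulls lines from it, so the whole body does not have to be read and split up front.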

Upvotes: 2
