Reputation: 3511
I have the following csv file:
hindex
1
2
2
6
3
3
3
2
2
I am trying to read the row and check its value but it gives the following error:
ValueError: invalid literal for int() with base 10: 'hindex'
The code is:
cr = csv.reader(open('C:\\Users\\chatterjees\\Desktop\\data\\topic_hindex.csv', "rb"))
for row in cr:
    x=row[0]
    if(int(x)<=10):
        print x
What's wrong with my code?
Upvotes: 0
Views: 745
Reputation: 4592
Just one more alternative here. I wrote a wrapper library that can handle this task with ease too. Suppose you have saved the data in a file named "topic_hindex.csv" in the same directory as the following script.
import pyexcel
r = pyexcel.SeriesReader("topic_hindex.csv")
for row in r.rows():
    x = row[0]
    if x <= 10:
        print x
Or alternatively, you can use a filter:
import pyexcel
r = pyexcel.SeriesReader("topic_hindex.csv")
eval_func = lambda row: row[0] <= 10
r.filter(pyexcel.RowValueFilter(eval_func))
for row in r.rows():
    print row[0]
Upvotes: 1
Reputation: 4448
The first row cannot be converted to an integer. You can skip every row like it by using a try/except block:
cr = csv.reader(open('C:\\Users\\chatterjees\\Desktop\\data\\topic_hindex.csv', "rb"))
for row in cr:
    x=row[0]
    try:
        if int(x) <= 10:
            print x
    except ValueError:
        pass
Upvotes: 2
Reputation: 3346
Here's a solution that skips the first row, and only the first row, and fails with a ValueError if any other row contains a non-numeric value. It does so by using the built-in enumerate() function, which keeps count of the number of rows processed. Furthermore, by using the with statement, it properly closes the input file when it's done.
import csv
with open('C:\\Users\\chatterjees\\Desktop\\data\\topic_hindex.csv', 'rb') as csvFile:
    for rowNumber, row in enumerate(csv.reader(csvFile)):
        if rowNumber > 0:
            x = row[0]
            if int(x) <= 10:
                print x
Upvotes: 1
Reputation: 179717
Rather surprising that nobody mentioned csv.DictReader, since it's really the simplest way to skip the header row and get the data in a nice dictionary format:
import csv
with open('C:\\Users\\chatterjees\\Desktop\\data\\topic_hindex.csv', "rb") as f:
    cr = csv.DictReader(f)
    for row in cr:
        x = row['hindex']
        if int(x) <= 10:
            print x
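The code above targets Python 2. On Python 3 the csv module expects a text-mode file (the documentation recommends opening it with newline='') rather than "rb". A minimal sketch of the same DictReader idea under that assumption; the helper name hindex_at_most, its limit parameter, and the in-memory sample data are made up for illustration:

```python
import csv
import io


def hindex_at_most(csv_file, limit=10):
    """Collect 'hindex' values <= limit from a text-mode CSV file object.

    Hypothetical helper for illustration; not part of the csv module.
    """
    reader = csv.DictReader(csv_file)  # the first row is consumed as field names
    values = []
    for row in reader:
        x = int(row["hindex"])  # still raises ValueError on non-numeric data
        if x <= limit:
            values.append(x)
    return values


# With a real file on Python 3, use open(path, newline="") instead of "rb".
sample = io.StringIO("hindex\n1\n2\n6\n12\n")
print(hindex_at_most(sample))
```

Because DictReader takes the field names from the first row, the header never reaches int().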
Upvotes: 2
Reputation: 12178
The first line of your .csv contains something that cannot be converted to an int, so
if(int(x)<=10):
fails with a ValueError. (There is absolutely no need to enclose the expression in (), by the way.)
You can either skip the first line of the .csv, or wrap int(x) in a try/except block, like so:
for row in cr:
    x=row[0]
    try:
        x=int(x)
    except ValueError:  # x cannot be converted to int
        continue        # so we skip this row
    if x<=10:           # no need for parens here
        print x
Learn more about Exceptions and handling those here: http://docs.python.org/tutorial/errors.html
Upvotes: 4
Reputation: 10937
The code tries to process every line in your file, including the header line hindex. You are trying to convert this string to an int, which raises the ValueError.
To skip the first line (which contains the headers), try:
cr = csv.reader(open('C:\\Users\\chatterjees\\Desktop\\data\\topic_hindex.csv', "rb"))
for row in list(cr)[1:]:  # a csv.reader cannot be sliced directly, so read it into a list first
    x=row[0]
    if(int(x)<=10):
        print x
Upvotes: 4
Reputation: 26160
You need to skip row 1. The code is trying to parse the column header from the file into an int, but since the header is a non-numeric string, the conversion chokes and dies.
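One way to skip that row is to call next() on the reader before looping. The following is only a sketch, assuming the header is always the first row and that Python 3 is in use; the function name small_hindex_values, its limit parameter, and the in-memory sample data are made up for the example:

```python
import csv
import io


def small_hindex_values(lines, limit=10):
    """Return first-column integers <= limit, skipping the header row.

    `lines` is any iterable of CSV text lines (an open text-mode file works).
    The function name and `limit` parameter are illustrative only.
    """
    reader = csv.reader(lines)
    next(reader, None)  # discard the header; the default avoids StopIteration on empty input
    values = []
    for row in reader:
        if not row:  # csv.reader yields [] for blank lines
            continue
        x = int(row[0])  # still raises ValueError on non-numeric data
        if x <= limit:
            values.append(x)
    return values


# Sample data mirroring the file in the question, held in memory here.
sample = io.StringIO("hindex\n1\n2\n2\n6\n3\n3\n3\n2\n2\n")
for value in small_hindex_values(sample):
    print(value)
```

Unlike a blanket try/except, this drops exactly one row, so a malformed value further down the file still surfaces as an error.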
Upvotes: 4