unseen_rider

Reputation: 324

Python converting strings in a list to numbers

I have encountered the below error message:

invalid literal for int() with base 10: '"2"'

In the message, the 2 is enclosed in single quotes on the outside and double quotes on the inside. This data is in the primes list, as shown by print primes[0].

Sample data in primes list:

["2","3","5","7"]

The primes list is created from a CSV file via:

primes=csvfile.read().replace('\n',' ').split(',')

I am trying to convert the strings in the primes list into integers.

Via Google I have come across similar questions to mine on SE, and I have tried the two common answers that are relevant to my problem IMO.

Using map():

primes=map(int,primes)

Using list comprehension:

primes=[int(i) for i in primes]

Unfortunately, both of these give the same error message as above. I get a similar error message when I use long() instead of int().

Please advise.

Upvotes: 3

Views: 402

Answers (3)

Jean-François Fabre

Reputation: 140168

You want:

  • to read each line of the CSV file
  • to create a single list of integers, flattened across all lines.

So you have to deal with the quotes (which may not even be present, depending on how the file was created). Also, replacing each linefeed with a space does not separate the last number of one line from the first number of the next line, because split(',') only splits on commas. That is a lot of issues to handle by hand.
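To make these two problems concrete, here is a small sketch (written in Python 3 syntax) that applies the question's parsing to a made-up two-line sample. The raw string and the try_int helper are assumptions for illustration only; they are not part of the question's code.

```python
# Hypothetical file contents: two CSV lines of quoted numbers.
raw = '"2","3"\n"5","7"\n'

# Same parsing as in the question: newlines become spaces, then split on commas.
fields = raw.replace('\n', ' ').split(',')
print(fields)

def try_int(text):
    """Return int(text), or None if text is not a valid integer literal."""
    try:
        return int(text)
    except ValueError:
        return None

# Every field fails: the quotes are still there, and the
# line-end number is glued to the first number of the next line.
converted = [try_int(field) for field in fields]
print(converted)
```

The middle field contains two numbers separated by a space, so even stripping the quotes would not be enough to convert it.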

Use the csv module instead. Say f is the handle of the opened file; then:

import csv

nums = [int(x) for row in csv.reader(f) for x in row]

That parses the cells, strips off the quotes if present, flattens the rows and converts each cell to an integer, all in one line.
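As a self-contained illustration of that one-liner, the sketch below (Python 3 syntax) uses io.StringIO as a stand-in for the opened file. The sample data is an assumption, and one cell is deliberately left unquoted to show that both forms are handled.

```python
import csv
import io

# Hypothetical sample standing in for the opened file handle f.
# The 3 is unquoted on purpose.
f = io.StringIO('"2",3\n"5","7"\n')

# csv.reader yields one list of cells per line, with quotes already removed;
# the nested comprehension flattens the rows and converts each cell.
nums = [int(x) for row in csv.reader(f) for x in row]
print(nums)  # [2, 3, 5, 7]
```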

To limit the number of values read, you could create a generator expression instead of a list comprehension and consume only the first n items:

n = 20000 # number of elements to extract
z = (int(x) for row in csv.reader(f) for x in row)
nums = [next(z) for _ in xrange(n)] # xrange => range for python 3

Even better, to avoid a StopIteration exception when the CSV data ends before n items have been read, you could use itertools.islice instead; in that case you simply get the full list:

import itertools
nums = list(itertools.islice(z, n))

(Note that you have to rewind the file, for example with f.seek(0), before running this code again, or you'll get no elements.)
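The rewind behaviour can be checked with a short sketch (Python 3 syntax). The io.StringIO object stands in for a real file, and first_n is a hypothetical helper name introduced only for this example.

```python
import csv
import io
import itertools

# Hypothetical stand-in for the opened file.
f = io.StringIO('"2","3"\n"5","7"\n')

def first_n(handle, n):
    """Return up to n integers read from the handle's current position."""
    cells = (int(x) for row in csv.reader(handle) for x in row)
    return list(itertools.islice(cells, n))

first = first_n(f, 10)   # fewer than 10 values exist; islice just stops early
second = first_n(f, 10)  # the handle is already at the end, so nothing is read
f.seek(0)                # rewind to the start of the data
third = first_n(f, 10)   # the values are available again

print(first, second, third)
```

Asking for more items than the data holds also shows that islice returns a shorter list rather than raising StopIteration.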

Performing this task without the csv module is of course possible ([int(x.strip('"')) for x in csvfile.read().replace('\n',',').split(',')]) but more complex and error-prone.
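One example of why the manual route is error-prone: if the data ends with a newline, the split produces an empty last field, and int('') raises ValueError. A minimal sketch of a guarded variant follows (Python 3 syntax); the raw string is a made-up sample, not the asker's file.

```python
# Hypothetical file contents, ending with a newline as most files do.
raw = '"2","3"\n"5","7"\n'

# Turning newlines into commas keeps line-end numbers separate,
# but the trailing newline leaves one empty field at the end.
fields = raw.replace('\n', ',').split(',')
print(fields)

# Skip empty fields, then strip the quotes before converting.
nums = [int(x.strip('"')) for x in fields if x]
print(nums)  # [2, 3, 5, 7]
```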

Upvotes: 3

efirvida

Reputation: 4855

Try this:

import csv

with open('csv.csv') as csvfile:
    data = csv.reader(csvfile, delimiter=',', skipinitialspace=True)
    primes = [int(j) for i in data for j in i]
    print primes

Or, to avoid duplicates:

    print set(primes)
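Keep in mind that a set is unordered, so printing it may not preserve the order from the file. If the order matters, one option is to sort the de-duplicated values, as in this small sketch (Python 3 syntax; the sample list is an assumption):

```python
# Hypothetical list containing duplicates, in no particular order.
primes = [5, 2, 3, 2, 7, 5]

# set() drops the duplicates; sorted() returns them as an ordered list.
unique_sorted = sorted(set(primes))
print(unique_sorted)  # [2, 3, 5, 7]
```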

Upvotes: 0

Ajax1234

Reputation: 71451

You can try this:

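# Join lines with commas (not spaces) so numbers at line ends stay separate
primes = csvfile.read().replace('\n', ',').split(',')
# Skip empty fields (e.g. from a trailing newline), then drop the surrounding quotes
final_primes = [int(i[1:-1]) for i in primes if i]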

Upvotes: 0
