warship

Reputation: 3024

How can I customize map() for a list of strings in Python?

How do I tell map() to selectively convert only some of the strings (not all the strings) within a list to integer values?

Input file (tab-delimited):

abc1    34    56
abc1    78    90  

My attempt:

import csv

with open('file.txt') as f:
    start = csv.reader(f, delimiter='\t')
    for row in start:
        X = map(int, row)
        print X

Error message: ValueError: invalid literal for int() with base 10: 'abc1'

When I read in the file with the csv module, it is a list of strings:

['abc1', '34', '56']
['abc1', '78', '90']

map() obviously does not like 'abc1', even though it is a string just like '34'.

I thoroughly examined "Convert string to integer using map()", but it did not help me deal with the first column of my input file.

Upvotes: 1

Views: 168

Answers (3)

Joran Beasley

Reputation: 114038

def safeint(val):
    try:
        return int(val)
    except ValueError:
        return val

for row in start:
    X = map(safeint, row)
    print X

That is one way to do it. You can take it a step further:

from functools import partial
myMapper = partial(map,safeint)
map(myMapper,start)
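
Note that in Python 3, map() returns a lazy iterator rather than a list, so the snippets above would print a map object instead of the converted values. A minimal sketch of the same idea for Python 3 follows. It assumes in-memory rows (copied from the question) in place of the csv reader, and the names sample_rows, converted, and row_mapper are only illustrative:

```python
from functools import partial


def safeint(val):
    """Return int(val) if val parses as an integer, otherwise val unchanged."""
    try:
        return int(val)
    except ValueError:
        return val


# Stand-in for the rows that csv.reader yields in the question.
sample_rows = [['abc1', '34', '56'], ['abc1', '78', '90']]

# list() forces the lazy map object so the result can be printed or indexed.
converted = [list(map(safeint, row)) for row in sample_rows]
print(converted)  # [['abc1', 34, 56], ['abc1', 78, 90]]

# The partial-based variant: row_mapper(row) is the same as map(safeint, row).
row_mapper = partial(map, safeint)
converted_again = [list(mapped) for mapped in map(row_mapper, sample_rows)]
```

Both variants produce the same nested list; the partial version only saves writing the inner map call by hand.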

Upvotes: 3

abarnert

Reputation: 365925

I like Roberto Bonvallet's answer, but if you want to build a new list rather than modify row in place, as you're doing in your question, you can:

import csv

with open('file.txt') as f:
    start = csv.reader(f, delimiter='\t')
    for row in start:
        X = [row[0]] + map(int, row[1:])
        print X

… or…

numeric_cols = (1, 2)

X = [int(value) if col in numeric_cols else value
     for col, value in enumerate(row)]

… or, probably most readably, wrap that up in a map_partial function, so you can do this:

X = map_partial(int, (1, 2), row)

You could implement it as:

def map_partial(func, indices, iterable):
    return [func(value) if i in indices else value 
            for i, value in enumerate(iterable)]
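
As a quick check, here is map_partial applied to one row from the question. The function is repeated so the snippet runs on its own, and the sample row stands in for what csv.reader would yield. Because it is built on a list comprehension, it should behave the same in Python 2 and Python 3:

```python
def map_partial(func, indices, iterable):
    """Apply func only to the items whose position is listed in indices."""
    return [func(value) if i in indices else value
            for i, value in enumerate(iterable)]


row = ['abc1', '34', '56']
result = map_partial(int, (1, 2), row)

print(result)  # ['abc1', 34, 56]
print(row)     # ['abc1', '34', '56'] (the original row is left untouched)
```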

If you want to be able to access all of the rows after you're done, you can't just print each one; you have to store them in some kind of structure. Which structure you want depends on how you want to refer to these rows later.

For example, maybe you just want a list of rows:

rows = []
with open('file.txt') as f:
    for row in csv.reader(f, delimiter='\t'):
        rows.append(map_partial(int, (1, 2), row))
print('The second column of the first row is {}'.format(rows[0][1]))

Or maybe you want to be able to look them up by the string ID in the first column, rather than by index. Since those IDs aren't unique, each ID will map to a list of rows:

rows = {}
with open('file.txt') as f:
    for row in csv.reader(f, delimiter='\t'):
        rows.setdefault(row[0], []).append(map_partial(int, (1, 2), row))
print('The second column of the first abc1 row is {}'.format(rows['abc1'][0][1]))
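
The same grouping could also be written with collections.defaultdict instead of setdefault, which some people find easier to read. This is only a sketch of that alternative: it uses in-memory rows from the question rather than the file, and rows_by_id and sample_rows are illustrative names:

```python
from collections import defaultdict


def map_partial(func, indices, iterable):
    """Apply func only to the items whose position is listed in indices."""
    return [func(value) if i in indices else value
            for i, value in enumerate(iterable)]


# Stand-in for the rows that csv.reader yields in the question.
sample_rows = [['abc1', '34', '56'], ['abc1', '78', '90']]

# A missing key is created as an empty list on first access.
rows_by_id = defaultdict(list)
for row in sample_rows:
    rows_by_id[row[0]].append(map_partial(int, (1, 2), row))

print(rows_by_id['abc1'][0][1])  # 34
print(len(rows_by_id['abc1']))   # 2
```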

Upvotes: 1

Roberto Bonvallet

Reputation: 33359

Map only the part of the list that interests you:

row[1:] = map(int, row[1:])
print row

Here, row[1:] is a slice of the list that starts at the second element (the one with index 1) up to the end of the list.
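
A small sketch of what the slice assignment does, using a row from the question. Slice assignment accepts any iterable, so this should also work in Python 3, where map() returns an iterator. The name alias is only there to show that the existing list object is modified rather than replaced:

```python
row = ['abc1', '34', '56']
alias = row  # second reference to the same list object

# Replace everything from index 1 onward with the converted values.
row[1:] = map(int, row[1:])

print(row)           # ['abc1', 34, 56]
print(alias is row)  # True: the list was changed in place
print(alias)         # ['abc1', 34, 56]
```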

Upvotes: 2
