user2471076

Reputation: 69

How to convert a string like "5cm" into an integer

I have an input list like [2,3,4,"5cm", 6,"2.5km"] and I would like to get this result:

[2,3,4,5,6,2.5]

I would like to start like this:

for element in inputList:

Upvotes: 1

Views: 288

Answers (6)

dansalmo

Reputation: 11694

Here is a solution inspired by @Akavall and simplified with ast.literal_eval:

from ast import literal_eval
def get_digits(s):
    return ''.join(ele for ele in s if not ele.isalpha())

def convert_to_nums(my_list):
    return [literal_eval(d) for d in (get_digits(s) for s in map(str, my_list))]

Result:

>>> my_list = [2,3,4,"5cm", 6,"2.5km"]
>>> convert_to_nums(my_list)
[2, 3, 4, 5, 6, 2.5]

Upvotes: 0

FMc

Reputation: 42421

First, use a regular expression: it's the right tool for the job. Second, use the simplest solution that will work for your known requirements: specifically, a regular expression that we can use to remove non-digits from the end of the string.

import re

vals = [2, 3, 4, "5cm", 6, "2.5km"]

rgx  = re.compile(r'\D+$')
nums = [float( rgx.sub('', str(v)) ) for v in vals]

print(nums)
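As a minimal sketch of what that pattern does (the helper name `strip_unit` is hypothetical, introduced only for illustration), the trailing run of non-digit characters is removed and the remainder is passed to float, so every result comes back as a float:

```python
import re

# Same pattern as above: one or more non-digit characters anchored at the end.
trailing_non_digits = re.compile(r'\D+$')

def strip_unit(value):
    """Hypothetical helper: drop a trailing unit and return a float."""
    return float(trailing_non_digits.sub('', str(value)))

vals = [2, 3, 4, "5cm", 6, "2.5km"]
nums = [strip_unit(v) for v in vals]
print(nums)  # [2.0, 3.0, 4.0, 5.0, 6.0, 2.5]
```

A string with no digits at all, such as "cm", is reduced to an empty string, and float then raises ValueError.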

And if you really must shun regular expressions, here's a way to do it without resorting to exception handling, type checking, or any logic more complex than the simplest if-else.

def leading_digits(v):
    for c in str(v):
        if c in '0123456789.': yield c
        else:                  return

def intfloat(s):
    f = float(s)
    i = int(f)
    return i if i == f else f

vals = [2, 3, 4, "5cm", 6, "2.5km", '8.77cm extra junk w/ digits 44']
nums = [intfloat(''.join(leading_digits(v))) for v in vals]

print(nums)   # [2, 3, 4, 5, 6, 2.5, 8.77]

Upvotes: 1

Velimir Mlaker

Reputation: 10975

Here's one more (probably least elegant), if you can't stand regular expressions:

input = [2,3,4,"5cm", 6,"2.5km"]
result = list()
for ele in input:
    while type(ele) is str:
        ele = ele[:-1]  # Strip off one letter from the end.
        for tt in (int, float):
            try: 
                ele = tt(ele)
                break
            except ValueError:
                pass
    result.append(ele)

print(result)

Upvotes: 0

Ashwini Chaudhary

Reputation: 251116

You can use regex:

>>> import re
>>> lis = [2,3,4,"5cm", 6,"2.5km"]
>>> r = re.compile(r'\d+(\.\d+)?')
>>> [float(r.search(x).group(0)) if isinstance(x,str) else x  for x in lis]
[2, 3, 4, 5.0, 6, 2.5]

Use ast.literal_eval instead of float to get 5.0 as 5:

>>> from ast import literal_eval
>>> [literal_eval(r.search(x).group(0)) if isinstance(x,str) else x  for x in lis]
[2, 3, 4, 5, 6, 2.5]
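To make the difference concrete, here is a small sketch comparing the two conversions on clean numeric strings (the helper name `parse_number` is hypothetical):

```python
from ast import literal_eval

def parse_number(text):
    """Hypothetical helper: parse a clean numeric string, keeping int vs. float."""
    return literal_eval(text)

for text in ("5", "2.5"):
    print(repr(float(text)), repr(parse_number(text)))
# 5.0 5
# 2.5 2.5
```

Note that literal_eval accepts any Python literal, so it is best applied only to text the regex has already extracted. On Python 3, a string with a leading zero such as "05" is not a valid literal and makes literal_eval raise, whereas float("05") returns 5.0.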

Starting the way you proposed:

import re
from ast import literal_eval
ans = []
r = re.compile(r'\d+(\.\d+)?')           #regex to match an integer or decimal
inputList = [2,3,4,"5cm", 6,"2.5km"]
for element in inputList:
   if isinstance(element, str):          #if element is a string then apply the regex
       num = r.search(element).group(0)  
       ans.append(literal_eval(num))
   else:
       ans.append(element)               #else append the element as it is
print(ans)
#[2, 3, 4, 5, 6, 2.5]

Another solution, assuming your inputs are always valid:

>>> from string import digits
>>> allowed = '-+.' + digits
>>> allowed                        #allowed characters
'-+.0123456789'
>>> lis = [2,3,4,"5cm", 6,"2.5km"]
>>> ans = []
>>> for item in lis:
...     if isinstance(item, str):
...         # if item is a string
...         num = ''               # Initialize an empty string
...         for c in item:         # Iterate over the string, one character at a time.
...             if c in allowed:   # If the character is present in `allowed` then
...                 num += c       # concatenate it to num
...             else:
...                 break          # else break out of loop
...         ans.append(float(num)) # Append the float() output of `num` to `ans` or use
...                                # `ast.literal_eval`
...     else:
...         ans.append(item)
...
>>> ans
[2, 3, 4, 5.0, 6, 2.5]

Upvotes: 5

import re

inputList = [2, 3, 5, "2", "2.5km", "3cm"]
outputList = []
for element in [str(i) for i in inputList]:
    match = re.match(r"([-+]?(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?).*", element)
    if match:
        outputList.append(float(match.group(1)))

print(outputList)

This solution uses regular expressions to extract the numeric part from a string. re is an extremely useful module with which you should definitely make yourself acquainted.

Because regular expressions only work on strings, we first have to convert the list elements that are numbers to strings. We do this using a list comprehension: [str(i) for i in inputList]

If you write print([str(i) for i in inputList]), you'll get:

['2', '3', '5', '2', '2.5km', '3cm']

So it's almost the same list as before, but the numbers are now strings. With that, we can write a regular expression that recognizes numbers. I didn't make that one up myself; it's from here (%f). We match each element of the stringified list against that pattern and convert the matched text to a float, which we append to outputList.
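As a rough illustration of what the pattern accepts, the sketch below wraps it in a hypothetical helper, `leading_number`, and tries it on a few made-up inputs, including a sign, a leading decimal point, an exponent, and a string with no number at all:

```python
import re

# The number pattern from the code above, without the trailing ".*".
NUMBER = re.compile(r"[-+]?(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?")

def leading_number(text):
    """Return the number at the start of text as a float, or None if there is none."""
    match = NUMBER.match(str(text))
    return float(match.group(0)) if match else None

samples = ["2.5km", "-3cm", ".5m", "1e3mm", "cm"]
print([leading_number(s) for s in samples])
# [2.5, -3.0, 0.5, 1000.0, None]
```

Returning None for a non-matching element is a design choice of this sketch; the loop above skips such elements instead.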

Note that in some locales, the decimal point (\.) may be represented by a different character. If this is important in your situation, you can retrieve the current locale's decimal point character as follows:

import locale
locale.localeconv()["decimal_point"]
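One way this could be wired into the regular expression, as a sketch only: the helpers `number_pattern` and `parse_leading_number` are hypothetical, and the comma separator in the example is an assumed input, not something taken from the question.

```python
import locale
import re

def number_pattern(decimal_point):
    """Hypothetical helper: build the number regex for a given decimal separator."""
    dp = re.escape(decimal_point)
    return re.compile(r"[-+]?(\d+(%s\d*)?|%s\d+)([eE][-+]?\d+)?" % (dp, dp))

def parse_leading_number(text, decimal_point):
    """Return the leading number in text as a float, or None if there is none."""
    match = number_pattern(decimal_point).match(text)
    if match is None:
        return None
    # float() only understands ".", so normalise the separator first.
    return float(match.group(0).replace(decimal_point, "."))

# Separator of the current locale ("." unless the program has changed it).
current_dp = locale.localeconv()["decimal_point"]

print(parse_leading_number("2,5km", ","))  # 2.5
```

Passing current_dp as the second argument would make the parser follow whatever locale the program has selected.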

I hope the explanation makes it a bit clearer what's going on; if not, please comment below.

Upvotes: 3

Akavall

Reputation: 86286

Here is a solution that does not use regex:

my_list = [2,3,4,"5cm", 6,"2.5km"]

def get_digits(s):
    return ''.join(ele for ele in s if not ele.isalpha())


def convert_to_nums(my_list):
    result = []
    for ele in my_list:
        if isinstance(ele, (int, float)):
            result.append(ele)
        else:
            ele = get_digits(ele)
            try:
                result.append(int(ele))
            except ValueError:
                result.append(float(ele))
    return result

Result:

>>> convert_to_nums(my_list)
[2, 3, 4, 5, 6, 2.5]

Upvotes: 1
