Reputation: 895
I have a series of text files that include numerical references. I have word tokenized them and I would like to be able to identify where tokens are numbers and convert them to integer format.
mysent = ['i','am','10','today']
I am unsure how to proceed given the immutability of strings.
Upvotes: 1
Views: 140
Reputation: 2945
If you try to convert a string that is not a representation of an int to an int, you get a ValueError.
You can try to convert all the elements to int and catch the ValueErrors:
mysent = ['i','am','10','today']
for i in mysent:
    try:
        print(int(i))
    except ValueError:
        continue
OUTPUT:
10
If you want to convert the numeric elements in place inside mysent, you can use:
mysent = ['i','am','10','today']
for n, i in enumerate(mysent):
    try:
        mysent[n] = int(i)
    except ValueError:
        continue
print(mysent)
OUTPUT:
['i', 'am', 10, 'today']
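If you would rather not mutate mysent, the same try/except idea can be wrapped in a small helper that builds a new list. This is only a sketch: the names to_int_or_keep and convert_tokens are made up for illustration and are not from any library.

```python
def to_int_or_keep(token):
    """Return int(token) if the token parses as an integer, else the token unchanged.

    Hypothetical helper name, used here only for illustration.
    """
    try:
        return int(token)
    except ValueError:
        return token


def convert_tokens(tokens):
    """Build a new list, converting integer-like tokens and keeping the rest."""
    return [to_int_or_keep(t) for t in tokens]


mysent = ['i', 'am', '10', 'today']
print(convert_tokens(mysent))  # ['i', 'am', 10, 'today']
```

The original list is left untouched, which can be handy if you still need the raw tokens later.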
.isdigit() IS NOT THE SAME AS try/except!!!!
In the comments it has been pointed out that .isdigit() may be more elegant and obvious. As stated in the Zen of Python, "There should be one-- and preferably only one --obvious way to do it."
From the official documentation, .isdigit():
"Return true if all characters in the string are digits and there is at least one character, false otherwise."
Meanwhile, the try/except block catches the ValueError raised by applying int to a non-numerical string.
They may look similar, but their behavior is really different:
def is_int(n):
    try:
        int(n)
        return True
    except ValueError:
        return False
EXAMPLES:
Positive integer:
n = "42"
print(is_int(n))    # --> True
print(n.isdigit())  # --> True
Positive float:
n = "3.14"
print(is_int(n))    # --> False
print(n.isdigit())  # --> False
Negative integer:
n = "-10"
print(is_int(n))    # --> True
print(n.isdigit())  # --> False
Unicode escape (\u00B2, superscript two):
n = "\u00B23455"
print(is_int(n))    # --> False
print(n.isdigit())  # --> True
These are only some examples, and you can probably already tell which one better suits your needs.
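To see the differences side by side, the cases above can be run in a single loop. This is a minimal sketch; is_int is redefined here so the snippet is self-contained, and the samples list and results dict are just illustrative names.

```python
def is_int(s):
    """True if int(s) succeeds, False if it raises ValueError."""
    try:
        int(s)
        return True
    except ValueError:
        return False


# Same inputs as the examples above: plain integer, float,
# negative integer, and a string starting with superscript two (U+00B2).
samples = ["42", "3.14", "-10", "\u00B23455"]

# Map each sample to (result of the int() check, result of str.isdigit()).
results = {s: (is_int(s), s.isdigit()) for s in samples}

for s, (via_int, via_isdigit) in results.items():
    print(f"{s!r}: int() ok = {via_int}, isdigit() = {via_isdigit}")
```

The two checks disagree on the last two inputs: the signed number passes int() but not .isdigit(), and the superscript string passes .isdigit() but not int().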
The discussion around which one should be used is exhausting and never-ending; you can have a look at this couple of interesting SO Q&As:
Upvotes: 1
Reputation: 338
Please try a list comprehension with .isdigit():
[item if not item.isdigit() else int(item) for item in mysent]
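One caveat worth checking before relying on this: .isdigit() is False for signed tokens, so something like '-10' stays a string. A quick sketch, with '-10' added to the list purely for illustration:

```python
# '-10' is an extra token added here only to show the caveat.
mysent = ['i', 'am', '10', 'today', '-10']

# Convert tokens made up purely of digits; leave everything else as-is.
converted = [item if not item.isdigit() else int(item) for item in mysent]

print(converted)  # ['i', 'am', 10, 'today', '-10']
```

If negative numbers can appear in your text, the try/except approach handles them without extra work.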
Upvotes: 3