Reputation: 93

How to identify whether any string in a list contains digits in Python

I want to check whether any string in a list contains a digit at any position and, if it does, remove the digits from that string using Python. My code is:

pattern = '\w+-\w+[-\w+]*|-';
pattern2 = '\d'
contents = ["babies","walked","boys","walking", "CD28", "IL-2", "honour"];
for token in contents:
    if token.endswith("ies"):
        f.write(string.replace(token,'ies','y',1))
    elif token.endswith('s'):
        f.write(token[0:-1])
    elif token.endswith("ed"):
        f.write(token[0:-2])
    elif token.endswith("ing"):
        f.write(token[0:-3])
    elif re.match(pattern,token):
        f.write(string.replace(token,'-',""))
    elif re.match(pattern2,token):
        f.write(token.translate(None,"0123456789"))
    else:
        f.write(token)
f.close()

The problem is in re.match(pattern2, token): it does not detect a digit in the token, although f.write(token.translate(None,"0123456789")) works correctly when I use it on its own.

Upvotes: 1

Answers (3)

Shaheen Gul

Reputation: 93

# Python 2 code: string.replace and str.translate(None, ...) do not exist in Python 3.
import re
import string

f = open("stemming.txt", 'w')
pattern = r'\w+-\w+[-\w+]*|-'
digits = re.compile(r'\d')
contents = ["babies", "walked", "boys", "walking", "CD28", "IL-2", "honour"]
for token in contents:
    if token.endswith("ies"):
        f.write(string.replace(token, 'ies', 'y', 1))
    elif token.endswith('s'):
        f.write(token[0:-1])
    elif token.endswith("ed"):
        f.write(token[0:-2])
    elif token.endswith("ing"):
        f.write(token[0:-3])
    elif re.match(pattern, token):
        f.write(string.replace(token, '-', ""))
    elif digits.search(token):
        # search() scans the whole token; match() only tests the start of it
        f.write(token.translate(None, "0123456789"))
    else:
        f.write(token)
f.close()
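
The change that matters in the code above is replacing re.match with a search. re.match only tries the pattern at the start of the string, so '\d' never matches "CD28", while re.search looks at every position. A minimal Python 3 sketch of the difference (the helper name has_digit is made up for illustration and is not part of the original code):

```python
import re

DIGIT = re.compile(r"\d")


def has_digit(token):
    """Return True if token contains a digit at any position (hypothetical helper)."""
    return DIGIT.search(token) is not None


contents = ["babies", "walked", "boys", "walking", "CD28", "IL-2", "honour"]

# re.match is anchored at position 0, so digits later in the string are missed.
matched_at_start = [t for t in contents if re.match(r"\d", t)]

# re.search scans the whole string.
with_digits = [t for t in contents if has_digit(t)]

print(matched_at_start)
print(with_digits)
```

With the sample list, the first list comes out empty and the second contains "CD28" and "IL-2".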

Upvotes: 0

Padraic Cunningham

Reputation: 180441

If you want to remove digits, use str.translate (Python 2 syntax):

contents = ["IL-2", "CD-28","IL2","25"];

print([s.translate(None,"0123456789") for s in contents])
['IL-', 'CD-', 'IL', '']

If you only want to remove the digits when the string is not made up entirely of digits:

print([s.translate(None,"0123456789") if not s.isdigit() else s for s in contents])
['IL-', 'CD-', 'IL', '25']

If the digits are always at the end, you can use rstrip:

print([s.rstrip("0123456789") for s in contents])
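
To illustrate the "always at the end" condition: rstrip only trims trailing characters, so digits elsewhere in the string survive. In this small sketch the extra entry "4-1BB" is added to the sample list purely for illustration:

```python
contents = ["IL-2", "CD-28", "IL2", "25", "4-1BB"]

# rstrip removes digits only from the right-hand end of each string.
trimmed = [s.rstrip("0123456789") for s in contents]

print(trimmed)  # ['IL-', 'CD-', 'IL', '', '4-1BB']
```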

For Python 3, you need to create a translation table using str.maketrans:

dig = "0123456789"
tbl = str.maketrans({k: "" for k in dig})

print([s.translate(tbl) for s in contents])
['IL-', 'CD-', 'IL', '']
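
As a rough Python 3 counterpart to the two Python 2 one-liners earlier in this answer, the sketch below uses the three-argument form of str.maketrans, whose third argument lists the characters to delete. The variable names stripped and mixed_only are chosen here for illustration:

```python
from string import digits

# Third argument: characters to delete during translation.
tbl = str.maketrans("", "", digits)

contents = ["IL-2", "CD-28", "IL2", "25"]

# Remove digits from every string.
stripped = [s.translate(tbl) for s in contents]

# Leave strings that consist only of digits untouched.
mixed_only = [s if s.isdigit() else s.translate(tbl) for s in contents]

print(stripped)    # ['IL-', 'CD-', 'IL', '']
print(mixed_only)  # ['IL-', 'CD-', 'IL', '25']
```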

Upvotes: 5

Kasravnd

Reputation: 107297

You can use re.sub within a list comprehension:

>>> contents = ["IL-2", "CD-28","IL2","25"]
>>> import re
>>> [re.sub(r'\d','',i) for i in contents]
['IL-', 'CD-', 'IL', '']

A better solution for this task, though, is the str.translate method (Python 2 syntax):

>>> from string import digits
>>> [i.translate(None,digits) for i in contents]
['IL-', 'CD-', 'IL', '']

And if you are using Python 3:

>>> trans_table = dict.fromkeys(map(ord,digits), None)
>>> [i.translate(trans_table) for i in contents]
['IL-', 'CD-', 'IL', '']

Upvotes: 6
