JodeCharger100

Reputation: 1059

Remove numbers from list but not those in a string

I have a list of strings as follows:

list_1 = ['what are you 3 guys doing there on 5th avenue', 'my password is 5x35omega44', 
          '2 days ago I saw it', 'every day is a blessing', 
          ' 345000 people have eaten here at the beach']

I want to remove standalone numbers such as 3, 2, and 345000, but not the digits in 5th or 5x35omega44. All the solutions I have found and tried also remove the numbers inside alphanumeric strings, but I want those to remain as they are. I want my list to look as follows:

list_1 = ['what are you guys doing there on 5th avenue', 'my password is 5x35omega44', 
          'days ago I saw it', 'every day is a blessing', 
          '  people have eaten here at the beach']

I am trying the following, but it drops every word that contains a digit, including 5th and 5x35omega44:

[' '.join(s for s in words.split() if not any(c.isdigit() for c in s)) for words in list_1]

Upvotes: 1

Views: 80

Answers (5)

JodeCharger100

Reputation: 1059

Combining the very helpful regex solutions provided into the list comprehension format that I wanted, I arrived at the following (it needs import re):

[' '.join([re.sub(r'\b(\d+)\b', '', item) for item in expression.split()]) for expression in list_1]
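A possible variant of the comprehension above, offered as a sketch rather than a correction: substituting inside each token leaves an empty string wherever a number stood, so the joined result contains doubled spaces. Dropping tokens that consist solely of digits avoids that. The name cleaned is an illustrative assumption, and this version also discards the leading whitespace of the last string, which may or may not be wanted.

```python
import re

list_1 = ['what are you 3 guys doing there on 5th avenue', 'my password is 5x35omega44',
          '2 days ago I saw it', 'every day is a blessing',
          ' 345000 people have eaten here at the beach']

# Keep a token unless it is made up entirely of digits.
# "cleaned" is a hypothetical name chosen for this sketch.
cleaned = [
    ' '.join(word for word in expression.split() if not re.fullmatch(r'\d+', word))
    for expression in list_1
]
print(cleaned)
```

Because the check runs on whole whitespace-separated tokens, a number followed directly by punctuation (for example "3,") would be kept.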

Upvotes: 0

mkk

Reputation: 128

It sounds like you should be using regex. This will match numbers separated by word boundaries:

\b(\d+)\b

Some Python code may look like this:

import re
for item in list_1:
    new_item = re.sub(r'\b(\d+)\b', ' ', item)
    print(new_item)

I am not sure what the best way to handle spaces would be for your project. You may want to put \s at the end of the expression, making it \b(\d+)\b\s, so that the whitespace after each number is removed along with it, or you may wish to handle this some other way.
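To make the whitespace question concrete, here is a minimal sketch of one way to handle it. The helper name strip_numbers, the optional \s? and the final strip() are my assumptions, not part of the suggestion above. Making the whitespace optional matters because a mandatory \s would skip a number that ends the string.

```python
import re

# Hypothetical helper: the name and the trailing strip() are assumptions.
def strip_numbers(text):
    # \s? also consumes one whitespace character after the number, if present.
    return re.sub(r'\b\d+\b\s?', '', text).strip()

print(strip_numbers('what are you 3 guys doing there on 5th avenue'))
print(strip_numbers(' 345000 people have eaten here at the beach'))
print(strip_numbers('room 101'))

# With a mandatory \s, a number at the very end of the string is left alone.
print(re.sub(r'\b\d+\b\s', '', 'room 101'))
```

The strip() call removes whatever leading or trailing whitespace is left over, so drop it if the original padding should survive.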

Upvotes: 1

MrNobody33

Reputation: 6483

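You can filter with str.isdigit() for a shorter way to do it. (Checking isinstance(word, int) does not work here, because every item returned by split() is a str, so nothing would be removed.) You could try something like this:

[' '.join([word for word in expression.split() if not word.isdigit()]) for expression in list_1]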

>>>['what are you guys doing there on 5th avenue', 'my password is 5x35omega44', 
   'days ago I saw it', 'every day is a blessing', 'people have eaten here at the beach']

Upvotes: 0

Ryszard Czech

Reputation: 18611

Use lookarounds to make sure the digits are not adjacent to letters, digits, or underscores:

import re
list_1 = ['what are you 3 guys doing there on 5th avenue', 'my password is 5x35omega44', 
          '2 days ago I saw it', 'every day is a blessing', 
          ' 345000 people have eaten here at the beach']
for s in list_1:
    print(re.sub(r'(?<!\w)\d+(?!\w)', '', s))

Output:

what are you  guys doing there on 5th avenue
my password is 5x35omega44
 days ago I saw it
every day is a blessing
  people have eaten here at the beach

Regex demo
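One reason to prefer lookarounds is that the set of "forbidden neighbours" can be widened. The sketch below is an assumption on my part, not something the question asks for: it adds the period to both lookarounds so that decimals such as 3.14 are left intact. The sample sentence and variable names are made up for illustration.

```python
import re

text = 'pi is about 3.14 and I have 3 cats'

# Pattern from the answer: digits that do not touch a word character.
plain = re.sub(r'(?<!\w)\d+(?!\w)', '', text)

# Assumed variant: also refuse digits that touch a period.
with_dot = re.sub(r'(?<![\w.])\d+(?![\w.])', '', text)

print(plain)     # both halves of 3.14 are removed, the period stays
print(with_dot)  # 3.14 survives, the standalone 3 is removed
```

The trade-off is that a number directly followed by a sentence-ending period (as in "I have 3.") is also kept by the variant.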

Upvotes: 2

Ezer K

Reputation: 3739

One approach would be to use try and except:

def is_intable(x):
    try:
        int(x)
        return True
    except ValueError:
        return False

[' '.join([word for word in sentence.split() if not is_intable(word)]) for sentence in list_1]
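It may help to see which tokens int() accepts, since that decides what gets dropped. The sketch below repeats is_intable so it runs on its own and compares it with str.isdigit(). The sample tokens are my own, and the '1_000' case assumes Python 3.6 or later, where underscores are allowed in numeric strings.

```python
def is_intable(x):
    try:
        int(x)
        return True
    except ValueError:
        return False

# Illustrative tokens, not taken from the question.
samples = ['3', '-3', '+7', '1_000', '5th', '3.5']

# Map each token to (is_intable result, str.isdigit result).
results = {token: (is_intable(token), token.isdigit()) for token in samples}
for token, (intable, digit_only) in results.items():
    print(repr(token), intable, digit_only)
```

Signed numbers and underscore-grouped numbers count as integers for int() but not for isdigit(), so the try/except version removes a slightly wider set of tokens.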

Upvotes: 1
