Rex

Reputation: 2147

How to remove elements of a list that contain a specific pattern in Python?

Given a list of strings

listA=['a', 'b', 'a@b=c', 'a @ b = c', 'a@ =b', 'a@=b' 'a=b@c', 'a@b' ]
                   ^         ^

we want the elements marked with "^" removed, giving

ListB=['a', 'b', 'a@ =b', 'a@=b' 'a=b@c', 'a@b']

Here we removed every element that contains '@' followed by some character and then '=' (optionally with spaces around that character), but kept the elements containing "@=" or '@ ='.

How can I do this kind of regex matching on a Python list?

EDIT:

I know that if we know the specific index of an element to delete, we can use numpy.delete(list, index) to delete it. But the indices are not known in this case.

Upvotes: 1

Views: 3088

Answers (3)

Robin James Kerrison

Reputation: 1757

Regex searches in Python can be done with the re module; specifically, re.search(r'@\w=', my_string) returns a match object rather than None if my_string contains an @ and an = separated by a single member of \w, i.e. a word character (alphanumerics and _).

I've expanded this to include cases where there's whitespace too, using \s.

import re

listA = ['a', 'b', 'a@b=c', 'a @ b = c', 'a@ =b', 'a@=b' 'a=b@c', 'a@b' ]
listB = [a for a in listA if not re.search(r'@\s*\w+\s*=', a)]

Update: the solution above now uses \w+ to match multiple word characters instead of just one.
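The same filter can be sketched as a small reusable function. This is only an illustration: the names PATTERN and remove_matching are made up for this example, and the list below assumes the missing comma between 'a@=b' and 'a=b@c' in the question is a typo, so the two are written as separate elements.

```python
import re

# Same expression as above, compiled once. The raw string keeps the
# backslashes intact for the regex engine.
PATTERN = re.compile(r'@\s*\w+\s*=')

def remove_matching(items, pattern=PATTERN):
    """Return the items in which `pattern` is not found anywhere."""
    return [item for item in items if not pattern.search(item)]

# Assumption: comma restored between 'a@=b' and 'a=b@c'.
listA = ['a', 'b', 'a@b=c', 'a @ b = c', 'a@ =b', 'a@=b', 'a=b@c', 'a@b']
listB = remove_matching(listA)
print(listB)
# ['a', 'b', 'a@ =b', 'a@=b', 'a=b@c', 'a@b']
```

Compiling the pattern once is optional here, since re caches recently used patterns, but it keeps the expression in one place if the filter is reused.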

Upvotes: 3

Cody Bouche

Reputation: 955

import re
listA = ['a', 'b', 'a@b=c', 'a @ b = c', 'a@ =b', 'a@=b' 'a=b@c', 'a@b' ]
print([x for x in listA if not re.search(r'@\s*[a-zA-Z]\s*=', x)])

Upvotes: 0

Sam

Reputation: 20486

Using the expression @\s*\S\s*= and re.search() we can filter this list down:

import re
# listA is the list from the question; `s` avoids shadowing the built-in str
listB = [s for s in listA if re.search(r'@\s*\S\s*=', s) is None]

print(listB)
# ['a', 'b', 'a@ =b', 'a@=ba=b@c', 'a@b']
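The three expressions used in the answers on this page agree on the question's list but not in general. A rough comparison, using made-up sample strings (the names patterns, samples and removed are only for illustration), might look like this:

```python
import re

# The three expressions from the answers, keyed by what sits between @ and =.
patterns = {
    'word_run': r'@\s*\w+\s*=',          # one or more word characters
    'one_letter': r'@\s*[a-zA-Z]\s*=',   # exactly one ASCII letter
    'one_non_space': r'@\s*\S\s*=',      # exactly one non-whitespace character
}

# Hypothetical inputs chosen to expose the differences.
samples = ['a@bc=d', 'a@1=b', 'a@-=b']

# For each pattern, collect the samples it would remove from a list.
removed = {
    name: [s for s in samples if re.search(p, s)]
    for name, p in patterns.items()
}

for name, hits in removed.items():
    print(name, hits)
# word_run ['a@bc=d', 'a@1=b']
# one_letter []
# one_non_space ['a@1=b', 'a@-=b']
```

Which one is right depends on what "some character" between '@' and '=' is meant to cover: several characters, letters only, or any single non-space symbol.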

Upvotes: 3
