Reputation: 59
I want to remove specific duplicates from a list. With Perl I would do the task with this code:
my @list = ( 'a1', 'a1', 'b1', 'b1' );
my %seen;
@list = grep( !/a\d/ || !$seen{ $_ }++, @list );
and the wanted result would be this:
@list = ( 'a1', 'b1', 'b1' );
How could I do this in Python 3 using a regular expression and a list comprehension? Thanks.
Upvotes: 3
Views: 161
Reputation: 659
import re
from functools import reduce # this import is not needed in python 2.*
l = ['a1', 'a1', 'b1', 'b1']
# Keep el unless it matches a\d and is already in the accumulator.
print(reduce(lambda acc, el: acc if re.match(r'a\d', el) and el in acc else acc + [el], l, []))
Sorry, this is a solution without list comprehensions. Are they strictly required?
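For comparison, a list comprehension that mirrors the Perl grep fairly directly might look like the sketch below. The helper name dedupe_matching and the seen set are illustrative and not part of the answer above. It uses re.search because Perl's /a\d/ is unanchored, and it relies on set.add returning None, so "not seen.add(x)" is always true and only serves to record the element.

```python
import re

def dedupe_matching(items, pattern=r"a\d"):
    """Keep all non-matching items; keep only the first copy of each matching item."""
    regex = re.compile(pattern)
    seen = set()  # matching items already emitted
    return [
        x for x in items
        # Non-matching items always pass, like !/a\d/ in the Perl grep.
        if not regex.search(x)
        # Matching items pass only the first time; seen.add(x) records them.
        or (x not in seen and not seen.add(x))
    ]

result = dedupe_matching(['a1', 'a1', 'b1', 'b1'])
print(result)  # ['a1', 'b1', 'b1']
```

Unlike the sorting-based approaches, this keeps every element in its original position. The side effect inside the comprehension is a matter of taste; a plain for loop does the same job more explicitly.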
Upvotes: 1
Reputation:
Here's another solution, using list(set(stuff)) to generate a list of unique things from stuff (since sets automatically deduplicate things):
In [1]: import re
In [2]: l = ["a1", "a1", "b1", "b1"]
In [3]: items_to_dedupe = [x for x in l if re.match(r"a\d", x)]
In [4]: leave_alone = [x for x in l if x not in items_to_dedupe]
In [5]: list(set(items_to_dedupe)) + leave_alone
Out[5]: ['a1', 'b1', 'b1']
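One thing to keep in mind: a set has no defined order, so with several distinct matching items the front of the result may come out in any order. A small variation, assuming Python 3.7+ where dict preserves insertion order, swaps set for dict.fromkeys. The longer sample list is made up for illustration, and the matching items are still moved ahead of the rest, as in the session above.

```python
import re

# Hypothetical input with two distinct matching items, interleaved.
l = ["a1", "a2", "b1", "a1", "b1", "a2"]

items_to_dedupe = [x for x in l if re.match(r"a\d", x)]
leave_alone = [x for x in l if not re.match(r"a\d", x)]

# dict.fromkeys keeps the first occurrence of each key, in insertion order.
deduped = list(dict.fromkeys(items_to_dedupe))

result = deduped + leave_alone
print(result)  # ['a1', 'a2', 'b1', 'b1']
```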
Upvotes: 0
Reputation: 107357
You can use itertools.chain and groupby:
>>> from itertools import chain, groupby
>>> l = ['a1', 'a1', 'b1', 'b1']
>>> list(chain(*[[i[0]] if 'a1' in i else i for i in [list(g) for _,g in groupby(sorted(l))]]))
['a1', 'b1', 'b1']
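The groupby line above hardcodes 'a1'. A possible generalisation, sketched here with the a\d pattern from the question and a made-up longer list, checks each group against the regex instead. Because groupby only merges adjacent equal items, the list is sorted first, so the original order is not preserved.

```python
import re
from itertools import chain, groupby

# Hypothetical input with two distinct items matching a\d.
l = ['a2', 'b1', 'a1', 'a2', 'b1', 'a1']

# After sorting, equal items are adjacent, so each group holds all copies of one value.
groups = (list(g) for _, g in groupby(sorted(l)))

result = list(chain.from_iterable(
    # Collapse a group to a single element only if its value matches the pattern.
    grp[:1] if re.match(r'a\d', grp[0]) else grp
    for grp in groups
))
print(result)  # ['a1', 'a2', 'b1', 'b1']
```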
And if you just want to use a regex, you can join the elements and then use re.sub, but note that this only works for this special case, where , is the delimiter:
>>> import re
>>> l = ['a1', 'a1', 'b1', 'b1']
>>> re.sub(r'(a1,)+','a1,',','.join(sorted(l))).split(',')
['a1', 'b1', 'b1']
Upvotes: 1