Reputation: 311
list A: ['abc.txt', '123.txt', 'apple.jpg']
list B: ['ab', '123']
I want to generate a new list C that contains only the elements of list A that do not match any entry of list B, using a wildcard (substring) match. The ideal output would be:
list C: ['apple.jpg']
Here is my code:
lista=['abc.txt', 'happy.txt', 'apple.jpg']
listb=['happy', 'ab']
listc=lista
for a in lista:
    for b in listb:
        print(a + ": " + b)
        if b in a:
            listc.remove(a)
print(listc)
The output of my code is:
abc.txt: happy
abc.txt: ab
apple.jpg: happy
apple.jpg: ab
['happy.txt', 'apple.jpg']
Does anyone know where it went wrong? And is there a better way to do it? Thanks.
Upvotes: 0
Views: 253
Reputation: 612
The problem is here:
listc = lista
You're copying the reference, not the content, so listc is lista. When you remove an element from listc, lista loses that element too. Because the loop is iterating over lista at that moment, the next element gets skipped.
If you want to copy the content of lista into listc, you need to use:
import copy
listc = copy.copy(lista)
You can find more information here: How to clone or copy a list?
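A minimal sketch of the difference between binding a second name and making a copy, using the lists from the question. The names alias and clone are only illustrative:

```python
lista = ['abc.txt', 'happy.txt', 'apple.jpg']

alias = lista          # second name for the same list object
clone = lista.copy()   # new list holding the same elements (shallow copy)

alias.remove('abc.txt')  # also changes lista, because both names share one list

print(alias is lista)  # True
print(lista)           # ['happy.txt', 'apple.jpg']
print(clone)           # ['abc.txt', 'happy.txt', 'apple.jpg']
```

For a flat list of strings, lista.copy(), list(lista), lista[:] and copy.copy(lista) should all behave the same way.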
Upvotes: 0
Reputation: 580
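By default, Python assignment does not copy a list; both names end up referring to the same object. You need to copy lista into listc, and the copy library can help you. A shallow copy is enough for a list of strings, but a deep copy works as well. Modify your code like this:
import copy
lista=['abc.txt', 'happy.txt', 'apple.jpg']
listb=['happy', 'ab']
listc=copy.deepcopy(lista)
for a in lista:
    for b in listb:
        if b in a:
            listc.remove(a)
            # stop after the first hit so the same element is not removed twice
            break
print(listc)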
Upvotes: 0
Reputation: 22776
You could use this list comprehension, which keeps only the elements of A that neither contain, nor are contained in, any element of B:
lista = ['abc.txt', '123.txt', 'apple.jpg']
listb = ['ab', '123']
listc = [a for a in lista if all(a not in b and b not in a for b in listb)]
print(listc) # => ['apple.jpg']
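If only one direction matters (an entry of B appearing inside an element of A, as in the question's `if b in a`), a shorter variant seems sufficient. This is a sketch under that assumption; the function name filter_out is hypothetical:

```python
def filter_out(names, fragments):
    """Return the names that contain none of the given fragments."""
    return [name for name in names
            if not any(frag in name for frag in fragments)]

lista = ['abc.txt', '123.txt', 'apple.jpg']
listb = ['ab', '123']
print(filter_out(lista, listb))  # ['apple.jpg']
```

Because it builds a new list instead of removing items in place, it does not touch lista at all.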
Upvotes: 0
Reputation: 57033
After the assignment listc=lista, both variables refer to the same list. As a result, you modify the list that you are iterating over, which causes the undesirable side effects. You should make a copy of the original list: listc=lista.copy().
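To see why an element gets skipped, here is a small sketch that records which items the loop actually visits while the list is being modified. The names items and visited are illustrative only:

```python
items = ['abc.txt', 'happy.txt', 'apple.jpg']
visited = []

for item in items:
    visited.append(item)
    if 'ab' in item:
        items.remove(item)  # the remaining elements shift one position left

print(visited)  # ['abc.txt', 'apple.jpg']
print(items)    # ['happy.txt', 'apple.jpg']
```

'happy.txt' moves into the slot the loop has already passed, so it is never compared against listb. That matches the output shown in the question.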
Here's a better, regex-based solution to your problem:
import re
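# Matches any entry of listb; re.escape keeps each entry literal
pattern = re.compile('|'.join(map(re.escape, listb)))
# equivalent to re.compile('happy|ab')
# search() looks anywhere in the string, like `b in a`; match() only checks the start
listc = [a for a in lista if not pattern.search(a)]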
# ['apple.jpg']
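One detail worth checking with a regex filter is whether the pattern is tested at the start of each string or anywhere inside it. The sketch below compares the two; the extra entry 'unhappy.txt' is a made-up example, not part of the original data:

```python
import re

lista = ['abc.txt', 'happy.txt', 'unhappy.txt', 'apple.jpg']
listb = ['happy', 'ab']

# re.escape guards against entries that contain regex metacharacters
pattern = re.compile('|'.join(re.escape(b) for b in listb))

with_match = [a for a in lista if not pattern.match(a)]    # tests the start only
with_search = [a for a in lista if not pattern.search(a)]  # tests any position

print(with_match)   # ['unhappy.txt', 'apple.jpg']
print(with_search)  # ['apple.jpg']
```

With search(), the result agrees with the substring test `b in a` used in the question.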
Upvotes: 1