CEamonn

Reputation: 925

Searching strings in lists of list for substring

I have a list made up of lists of strings:

my_list = [[u'Port    Name    Status'],
           [u'Int1    London    connected'],
           [u'Int2    Paris A    disconnected'],
           [u'Port3    Paris A Backup    disabled']]

I thought about splitting each string on spaces (" ") and creating tuples, as I have worked with them before, but because some of the names contain spaces themselves, this doesn't work.

I want to search each string for items in another list and, if one is found, remove that entry from my_list. My full code is:

my_list = [[u'Port    Name    Status'],
           [u'Int1    London    connected'],
           [u'Int2    Paris A    disconnected'],
           [u'Port3    Paris A Backup    disabled']]

remove_from_list = ['Port', 'connected']

for i in my_list:
    for j in remove_from_list:
        if j in i:
            my_list.pop(i)

Aside from not working, I also know my code isn't very Pythonic. Is there a better way to do this with a list comprehension?

I want only exact matches to be removed, so an item with disconnected stays but connected is removed.

Upvotes: 1

Views: 86

Answers (1)

Cleb

Reputation: 25997

This list comprehension should do what you want:

[li for li in my_list if not any(wi in li[0].split() for wi in remove_from_list)]

This yields

[[u'Int2    Paris A    disconnected'],
 [u'Port3    Paris A Backup    disabled']]

It uses the idea you mentioned: splitting the entry and then checking whether any of the words is included in the resulting list using any.
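As a runnable sketch of the above (the names filtered and substring_filtered are only illustrative, not part of the question), this contrasts whole-word membership on the split string with a plain substring test on the raw string, which is the check that would wrongly drop 'disconnected' and 'Port3':

```python
my_list = [[u'Port    Name    Status'],
           [u'Int1    London    connected'],
           [u'Int2    Paris A    disconnected'],
           [u'Port3    Paris A Backup    disabled']]

remove_from_list = ['Port', 'connected']

# Whole-word test: each row's string is split into words first, so
# 'connected' does not match 'disconnected' and 'Port' does not match 'Port3'.
filtered = [li for li in my_list
            if not any(wi in li[0].split() for wi in remove_from_list)]

# Substring test on the raw string, for contrast only: here 'connected'
# matches inside 'disconnected' and 'Port' matches inside 'Port3'.
substring_filtered = [li for li in my_list
                      if not any(wi in li[0] for wi in remove_from_list)]

print(filtered)
print(substring_filtered)
```

With the sample data, the substring variant removes every row, which is why the split is needed for exact matches.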

Some more explanation:

[li[0].split() for li in my_list]

yields

[[u'Port', u'Name', u'Status'],
 [u'Int1', u'London', u'connected'],
 [u'Int2', u'Paris', u'A', u'disconnected'],
 [u'Port3', u'Paris', u'A', u'Backup', u'disabled']]

This is a new list of lists built by a list comprehension: each string is split into a list of single words. Next we check whether any of the strings in remove_from_list appear in one of these sublists, which we can do with another list comprehension (shown here for the first sublist):

[wi in [u'Port', u'Name', u'Status'] for wi in remove_from_list]
[True, False]

This list comprehension returns a list of Boolean values stating, for each element of remove_from_list, whether it was found. A quick way to check whether this list contains at least one True is to use any:

any([True, False])
True

and

any([False, False])
False

Now we can combine these two list comprehensions into the expression from the beginning, which gives the desired outcome.
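For comparison with the loop in the question, the same logic can be spelled out with explicit loops. This is only a sketch; remove_rows and kept are hypothetical names. In the original loop, j in i tests whether j is an element of the one-item list i, so it never looks inside the string, and my_list.pop(i) would raise a TypeError if it were ever reached, because list.pop expects an integer index. Building a new list also avoids modifying my_list while iterating over it.

```python
def remove_rows(rows, words):
    """Return the rows whose single string contains none of `words` as a whole word.

    Hypothetical helper: an explicit-loop equivalent of the list comprehension.
    """
    kept = []
    for row in rows:
        tokens = row[0].split()          # split the row's only string into words
        found = False
        for word in words:
            if word in tokens:           # exact match against a whole word
                found = True
                break
        if not found:
            kept.append(row)             # build a new list instead of popping
    return kept


my_list = [[u'Port    Name    Status'],
           [u'Int1    London    connected'],
           [u'Int2    Paris A    disconnected'],
           [u'Port3    Paris A Backup    disabled']]

result = remove_rows(my_list, ['Port', 'connected'])
print(result)
```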

One addition: if you want to split based on a regular expression, you can do the following (note the raw string, so that \s is not treated as a string escape):

import re
[re.split(r'\s{2,}', li[0]) for li in my_list]

which gives

[[u'Port', u'Name', u'Status'],
 [u'Int1', u'London', u'connected'],
 [u'Int2', u'Paris A', u'disconnected'],
 [u'Port3', u'Paris A Backup', u'disabled']]

The difference from the first approach is that you now always end up with exactly three strings per sublist, because the split happens only at runs of two or more whitespace characters. I think that's what you had in mind originally. In your case it makes no difference, as remove_from_list contains only single words, but if the strings in remove_from_list contained whitespace, the first approach would fail and you should use the one with the regular expression.
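To illustrate that last point, here is a small sketch using a multi-word entry, 'Paris A', which is an assumed example and not taken from the question's remove_from_list. The names by_whitespace and by_regex are likewise only illustrative.

```python
import re

my_list = [[u'Port    Name    Status'],
           [u'Int1    London    connected'],
           [u'Int2    Paris A    disconnected'],
           [u'Port3    Paris A Backup    disabled']]

# Assumed multi-word entry, for illustration only.
remove_from_list = ['Paris A']

# split() breaks 'Paris A' into 'Paris' and 'A', so the entry never matches.
by_whitespace = [li for li in my_list
                 if not any(wi in li[0].split() for wi in remove_from_list)]

# Splitting on two or more whitespace characters keeps 'Paris A' as one field,
# so the 'Int2' row is removed while 'Paris A Backup' (a different field) stays.
by_regex = [li for li in my_list
            if not any(wi in re.split(r'\s{2,}', li[0]) for wi in remove_from_list)]

print(len(by_whitespace))
print(by_regex)
```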

The entire command is then

[li for li in my_list if not any(wi in re.split(r'\s{2,}', li[0]) for wi in remove_from_list)]

also yielding

[[u'Int2    Paris A    disconnected'],
 [u'Port3    Paris A Backup    disabled']]

Upvotes: 4
