Ralph Hvald
Ralph Hvald

Reputation: 11

Find all substrings in a string in python 3 with brute-force

I want to find all substrings 'A' to 'B' in L = ['C', 'A', 'B', 'A', 'A', 'X', 'B', 'Y', 'A'] with bruteforce, this is what i've done:

def find_substring(L):
    t = 0
    s = []
    for i in range(len(L) - 1):
        l = []
        if ord(L[i]) == 65:
            for j in range(i, len(L)):
                l.append(L[j])
                if ord(L[j]) == 66:
                    t = t + 1
                    s.append(l)
    return s, t

Now I want the output:

[['A','B'], ['A','B','A','A','X','B'], ['A','A','X','B'], ['A','X','B']]

But i get:

[['A','B','A','A','X','B','Y','A'],['A','B','A','A','X','B','Y','A'],['A','A','X','B','Y','A'],['A','X','B','Y','A']]

Can someone tell me what I'm doing wrong?

Upvotes: 0

Views: 786

Answers (6)

Vasilis G.
Vasilis G.

Reputation: 7844

Another slightly different approach would be this:

L = ['C', 'A', 'B', 'A', 'A', 'X', 'B', 'Y', 'A']

def find_substring(L):
    output = []
    # Start searching for A.
    for i in range(len(L)):
        # If you found one start searching all B's until you reach the end.
        if L[i]=='A':
            for j in range(i,len(L),1):
                # If you found a B, append the sublist from i index to j+1 index (positions of A and B respectively).
                if L[j]=='B':
                    output.append(L[i:j+1])
    return output

result = find_substring(L)
print(result)

Output:

[['A', 'B'], ['A', 'B', 'A', 'A', 'X', 'B'], ['A', 'A', 'X', 'B'], ['A', 'X', 'B']]

In case you need a list comprehension of the above:

def find_substring(L):
    output = [L[i:j+1] for i in range(len(L)) for j in range(i,len(L),1) if L[i]=='A' and L[j]=='B']
    return output

Upvotes: 0

Eventhisone
Eventhisone

Reputation: 86

So, you want all the substrings that start with 'A' and end with 'B'?

When you use @Joeidden's code you can change need the for i in range(len(L) - 1): to for i in range(len(L)): because only strings that end with 'B' will be appended to s.

def find_substring(L):
    s = []
    for i in range(len(L)):
        l = []
        if L[i] == 'A':
            for j in range(i, len(L)):
                l.append(L[j])
                if L[j] == 'B':
                    s.append(l[:])
return s

Upvotes: 0

Serge
Serge

Reputation: 3765

You can reset l to a copy of the string after l is appended l = l[:] right after the last append.

Upvotes: 0

Olivier Melançon
Olivier Melançon

Reputation: 22324

You would be better first finding all indices of 'A' and 'B', then iterating over those, avoiding brute force.

def find_substrings(lst)
    idx_A = [i for i, c in enumerate(lst) if c == 'A']
    idx_B = [i for i, c in enumerate(lst) if c == 'B']

    return [lst[i:j+1] for i in idx_A for j in idx_B if j > i]

Upvotes: 1

tripleee
tripleee

Reputation: 189749

When you append l to s, you are adding a reference to a list which you then continue to grow. You want to append a copy of the l list's contents at the time when you append, to keep it static.

           s.append(l[:])

This is a common FAQ; this question should probably be closed as a duplicate.

Upvotes: 1

Joe Iddon
Joe Iddon

Reputation: 20434

The problem is that the list s, holds references to the l lists.

So even though you are appending the correct l lists to s, they are changed after being appended as the future iterations of the j loop modify the l lists.

You can fix this by appending a copy of the l list: l[:].

Also, you can compare strings directly, no need to convert to ASCII.

def find_substring(L):
    s = []
    for i in range(len(L) - 1):
        l = []
        if L[i] == 'A':
            for j in range(i, len(L)):
                l.append(L[j])
                if L[j] == 'B':
                    s.append(l[:])
    return s

which now works:

>>> find_substring(['C', 'A', 'B', 'A', 'A', 'X', 'B', 'Y', 'A'])
[['A', 'B'], ['A', 'B', 'A', 'A', 'X', 'B'], ['A', 'A', 'X', 'B'], ['A', 'X', 'B']]

Upvotes: 4

Related Questions