user2071737

Reputation: 53

Find Pattern in Python List

I want to loop through a Python list and group items that match a pattern.

The list is, in fact, a list of dicts with various properties which can be divided into 3 types. I will call these As, Bs and Cs.

The pattern I am looking for is each A-type dict together with the nearest preceding C dict and the two B dicts that precede that C. Each A and B dict should exist in only one group.

Example:

Original List (data): [A1, B1, B2, B3, C1, A2, A3, B4, B5, B6, B7, C2, B8, C3, A4]

Desired Result: [[B2,B3,C1,A2], [B7,B8,C3,A4]]
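To make the example reproducible, here is a minimal sketch of how the list could be built. The 'Class' key matches the attempt further down; the 'Name' key is a hypothetical label added only so the items can be told apart.

```python
# Labels from the example above, in list order
names = "A1 B1 B2 B3 C1 A2 A3 B4 B5 B6 B7 C2 B8 C3 A4".split()

# Each item is a dict; 'Class' is the type (A, B or C).
# 'Name' is a hypothetical key used only for illustration.
data = [{'Class': name[0], 'Name': name} for name in names]

# The groups wanted from this list, written as labels
desired = [['B2', 'B3', 'C1', 'A2'], ['B7', 'B8', 'C3', 'A4']]
```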

Conditions:

As you can see from the example, an A should be ignored if there are no previous Bs and Cs (e.g. A1) or if another A comes before those Bs and Cs (e.g. A3). There may also be rogue Cs, which can be ignored (e.g. C2).

What I have tried:

# Extract indices for all A elements
As = [i for i, item in enumerate(data) if item['Class'] == "A"]

groups = []  # Completed [B, B, C, A] groups

# Loop through the A's
for a in As:

    # Ensure the A isn't too close to the start of the list to have sufficient prev elements
    if a > 2:

        e = [data[a]]

        # For each prev item, walking backwards to the start of the list
        for index in range(a - 1, -1, -1):

            # Get the item
            item = data[index]

            if len(e) > 3:
                break  # Exit once there are 4 items in the list
            elif len(e) > 1:
                searching = "B"  # After the C is found go to B's
            else:
                searching = "C"  # Start by searching for the C

            if item['Class'] == "A":  # If another A is found before the list is filled end the search
                break
            elif item['Class'] == searching:
                e.append(item)

        # Keep only complete groups, reversed back into list order
        if len(e) == 4:
            groups.append(e[::-1])

This works but feels like really terrible code! Any better solution suggestions would be appreciated.

Upvotes: 1

Views: 7571

Answers (1)

Sergei

Reputation: 470

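I'd use a regex in your case. This relies on every 'Class' value being a single character, so that string positions line up with list indices: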

import re

# Convert to Class string representation
# example 'ABBBCAABBBBCBCA' 
string_repr = ''.join([item['Class'] for item in data])

# Compile pattern we are looking for
pattern = re.compile(r'BC*B+C+B*A')

# Find the last position of each pattern match in string_repr
positions = [match.end() - 1 for match in pattern.finditer(string_repr)]

# Use the indices from positions on the data list. They match
your_As = [data[i] for i in positions]
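The snippet above returns only the matching As. If the full groups are wanted, one possible extension is to add capture groups and read their positions with match.start(). This is a sketch using a variation on the pattern, not the exact one above; find_groups and the 'Name' key are hypothetical names, and it still assumes each 'Class' value is a single character.

```python
import re

def find_groups(data):
    """Return [B, B, C, A] groups; assumes single-character 'Class' values."""
    string_repr = ''.join(item['Class'] for item in data)
    # Capture two Bs (Cs between them are skipped), the C nearest
    # to the A, and the A itself. Stray Bs after that C are skipped.
    pattern = re.compile(r'(B)C*(B)C*(C)B*(A)')
    groups = []
    for match in pattern.finditer(string_repr):
        # match.start(n) is the string index of capture group n,
        # which is also the index of that item in data
        groups.append([data[match.start(n)] for n in (1, 2, 3, 4)])
    return groups

# Example list from the question; 'Name' is a hypothetical label
names = "A1 B1 B2 B3 C1 A2 A3 B4 B5 B6 B7 C2 B8 C3 A4".split()
data = [{'Class': name[0], 'Name': name} for name in names]

result = [[item['Name'] for item in group] for group in find_groups(data)]
print(result)  # [['B2', 'B3', 'C1', 'A2'], ['B7', 'B8', 'C3', 'A4']]
```

Because a match can never contain a second A, matches cannot overlap, so each A and B ends up in at most one group.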

Upvotes: 1
