Reputation: 485

Checking for pattern in string Python

Suppose we have strings of the type:

test= '--a-kbb-:xx---xtx:-----x--:---g-x--:-----x--:------X-:XXn-tt-X:l--f--O-'

that is, they are always composed of 8 sections separated by :, so one could split the string into a list with each element corresponding to a section:

testsep = test.split(':')

giving

['--a-kbb-', 'xx---xtx', '-----x--', '---g-x--', '-----x--', '------X-', 'XXn-tt-X', 'l--f--O-']

Now I want to check whether the string test contains an x at the same position in 3 consecutive sections. For example, the test string above has at least one such case: counting from 1, sections 2, 3 and 4 all contain an x at position 6 (index 5 when counting from 0). Therefore, this test string matches the wanted pattern.

Is there a simple (maybe functional) way of checking for such patterns, given that the strings always follow the format above?

The naive approach would be to split the string, then loop through all sections and check whether consecutive sections have an x at each possible position (position 1, 2, ..., up to 8), but that wouldn't be very Pythonic.
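For reference, the naive approach described above might look like the following sketch. The function name naive_check and the parameters char and run are hypothetical, and the code assumes every section has the same length as the first one.

```python
def naive_check(test, char="x", run=3):
    """Return True if `char` sits at the same position in `run` consecutive sections."""
    sections = test.split(":")
    width = len(sections[0])  # assumption: all sections are equally long
    for pos in range(width):
        # try every window of `run` consecutive sections
        for start in range(len(sections) - run + 1):
            if all(sections[start + k][pos] == char for k in range(run)):
                return True
    return False

test = '--a-kbb-:xx---xtx:-----x--:---g-x--:-----x--:------X-:XXn-tt-X:l--f--O-'
print(naive_check(test))
```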

Upvotes: 1

Answers (3)

Don

Reputation: 17606

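Each section is 8 characters long and is followed by a ':' separator, so characters at the same position in consecutive sections are exactly 9 characters apart. Pick every 9th character starting from each offset and check if there are 3 consecutive 'x's: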

test= '--a-kbb-:xx---xtx:-----x--:---g-x--:-----x--:------X-:XXn-tt-X:l--f--O-'
for i in range(9):
    if 'xxx' in test[i::9]:
        print("Pattern matched at position %d" % i)
        break
else:
    print("Pattern not matched")

gives

Pattern matched at position 5

Short version:

>>> any('xxx' in test[i::9] for i in range(9))
True
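The slicing trick relies on every section being exactly 8 characters wide. If that cannot be guaranteed, one possible variation is to split the string and transpose the sections with zip. This is only a sketch under that assumption; the function name has_column_run and its parameters char, run and sep are made up for illustration.

```python
def has_column_run(s, char="x", run=3, sep=":"):
    """Return True if `char` occurs at the same position in `run` consecutive sections."""
    sections = s.split(sep)
    # zip(*sections) yields one tuple per position (column);
    # note that zip stops at the length of the shortest section
    columns = ("".join(col) for col in zip(*sections))
    return any(char * run in col for col in columns)

test = '--a-kbb-:xx---xtx:-----x--:---g-x--:-----x--:------X-:XXn-tt-X:l--f--O-'
print(has_column_run(test))
```

Because the sections are split first, the step size no longer depends on the section width or the separator.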

Upvotes: 2

Mikhail Vladimirov

Reputation: 13890

Is this pythonish enough?

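from functools import reduce  # needed in Python 3, where reduce is no longer a builtin
# the variable is named s rather than str to avoid shadowing the builtin type
s = '--a-kbb-:xx---xtx:-----x--:---g-x--:-----x--:------X-:XXn-tt-X:l--f--O-'
sections = s.split(':')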
reduce (lambda a, b: a | ('xxx' in b), [reduce(lambda c, d: c + d, map(lambda c: c[i], sections), '') for i in range(reduce (lambda e, f: max (e, len (f)), sections, 0))], False)

Explanation

reduce (lambda e, f: max (e, len (f)), sections, 0)

calculates the maximum section length;

for i in range(reduce (lambda e, f: max (e, len (f)), sections, 0))

iterates i from zero to maximum section length minus 1;

map(lambda c: c[i], sections)

produces the i-th characters of all sections (a list in Python 2, an iterator in Python 3);

reduce(lambda c, d: c + d, map(lambda c: c[i], sections), '')

calculates a string consisting of the i-th characters of all sections;

[reduce(lambda c, d: c + d, map(lambda c: c[i], sections), '') for i in range(reduce (lambda e, f: max (e, len (f)), sections, 0))]

calculates a list of strings, where the i-th string consists of the i-th characters of all sections;

and the final expression returns True if any of the strings in the list calculated in the previous step contains three consecutive 'x's.
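To make the steps in the explanation easier to follow, the one-liner can be unrolled into named intermediates. This is a sketch for Python 3; the names max_len, columns and matched are not part of the original expression and are introduced only for illustration.

```python
from functools import reduce

test = '--a-kbb-:xx---xtx:-----x--:---g-x--:-----x--:------X-:XXn-tt-X:l--f--O-'
sections = test.split(':')

# maximum section length
max_len = reduce(lambda e, f: max(e, len(f)), sections, 0)

# one string per position, built from the i-th character of every section
columns = [reduce(lambda c, d: c + d, map(lambda c: c[i], sections), '')
           for i in range(max_len)]

# True if any of those strings contains three consecutive 'x's
matched = reduce(lambda a, b: a | ('xxx' in b), columns, False)

print(max_len, columns[5], matched)
```

Note that c[i] raises an IndexError if any section is shorter than the longest one, so this sketch assumes equally long sections, as in the question.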

Upvotes: 2

Ajax1234

Reputation: 71451

A possibility is to use itertools.groupby with a class to group runs of strings that all have an x at the same position:

from itertools import groupby
class X:
  def __init__(self, _x):
    self.x = _x
  def __eq__(self, _val):
    return any(a == 'x' and b =='x' for a, b in zip(self.x, _val.x))

d = ['--a-kbb-', 'xx---xtx', '-----x--', '---g-x--', '-----x--', '------X-', 'XXn-tt-X', 'l--f--O-']
result = [[a, [i.x for i in b]] for a, b in groupby(list(map(X, d)))]
final_result = [b for _, b in result if any(all(h == 'x' for h in c) for c in zip(*b))]

Output:

[['xx---xtx', '-----x--', '---g-x--', '-----x--']]

However, the naive approach is much simpler, and the resulting solution is still quite Pythonic:

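def group(d):
  start = [d[0]]
  for i in d[1:]:
    # extend the run while some position holds 'x' in every section collected so far
    if any(all('x' == c for c in b) for b in zip(*(start+[i]))):
      start.append(i)
    else:
      if len(start) > 1:
        yield start
      start = [i]
  # also emit a run that reaches the end of the list
  if len(start) > 1:
    yield start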

print(list(group(d)))

Output:

[['xx---xtx', '-----x--', '---g-x--', '-----x--']]
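The grouped output shows which sections belong to a run, but not the position of the shared x, and a run of only two sections would also be reported. If the position and the first section of a run of at least three are wanted, a sketch along the following lines may help. The function name find_column_run and its parameters char and run are hypothetical.

```python
def find_column_run(sections, char="x", run=3):
    """Return (position, first_section_index) of the first run found, or None."""
    for pos, col in enumerate(zip(*sections)):
        # col holds the characters at `pos` across all sections
        start = "".join(col).find(char * run)
        if start != -1:
            return pos, start
    return None

d = ['--a-kbb-', 'xx---xtx', '-----x--', '---g-x--', '-----x--', '------X-', 'XXn-tt-X', 'l--f--O-']
print(find_column_run(d))
```

Both values are zero-based, so for the list d above the sketch reports position 5 starting at section index 1.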

Upvotes: 2
