Lola

Reputation: 97

How to count patterns in each row separately in a dataframe in Python

I am looking at all possible combinations of 0s and 1s in a sequence of length 4, so I have 2^4 = 16 rows in a dataframe or a list. (I don't mind the format, as long as each combination can be examined separately.) Within each combination I am searching for particular patterns, which may overlap.

patterns=["00","101","1100"] 

As a result, for the first combination, "0000", I'd like Python to tell me that it found 3 occurrences (I don't care which ones out of the three). I found functions like search(), but they only give the overall number of matches across all 16 combinations, not for each one separately. I also cannot get my data into the format they expect. I have tried str.count(), but it does not seem to work for me either, even after converting the dataframe into a string.
The best that I could come up with was:

import itertools
sequ=[x for x in itertools.product(states,repeat=n)] #generates all the possible seq-s of the variable
from re import finditer
patterns=["00","101","1100"]
for match in finditer(patterns, sequ):
    print(match.span())

However, this only approximately works for simple patterns, e.g. patterns=["00"].

Upvotes: 0

Views: 230

Answers (1)

ycx

Reputation: 3211

def main():
    n = int(input("Enter number of digits: "))
    for i in range(0, 1<<n):
        gray=i^(i>>1)
        print ("{:0{}b}".format(gray,n))

main()

#Input: 4
#Output:
#0000
#0001
#0011
#0010
#0110
#0111
#0101
#0100
#1100
#1101
#1111
#1110
#1010
#1011
#1001
#1000

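I think this is what you're looking for.
There's no need to use a dataframe for this: the combinations can be generated with bit manipulation. The expression i^(i>>1) converts each integer i into its Gray code, so the loop lists all 2^n combinations, with neighbouring ones differing in a single bit.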
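
As a quick sanity check on the bit trick, here is a minimal sketch (the helper name gray_codes is just for illustration) that confirms neighbouring codes differ in one bit, and that the result is the same set of combinations itertools.product would give, only in a different order:

```python
from itertools import product

def gray_codes(n):
    """Return the n-bit Gray code sequence as zero-padded binary strings."""
    return ["{:0{}b}".format(i ^ (i >> 1), n) for i in range(1 << n)]

codes = gray_codes(4)

# Neighbouring codes differ in exactly one bit position.
for a, b in zip(codes, codes[1:]):
    assert sum(x != y for x, y in zip(a, b)) == 1

# Same 16 combinations as itertools.product, only in a different order.
plain = ["".join(bits) for bits in product("01", repeat=4)]
assert sorted(codes) == plain
```

If the order of the rows matters to you, the itertools.product version gives plain counting order instead.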

EDIT:

def graylist(n):
    outlist = []
    for i in range(0, 1<<n):
        gray=i^(i>>1)
#        print ("{:0{}b}".format(gray,n))
        outlist.append('{:0{}b}'.format(gray,n))
    return outlist

alist = graylist(4)

def countingpattern(alist, string):

    count = 0
    for item in alist:
        for i in range(len(item)):
            if item[i:i+len(string)] == string:
                count += 1
    return count

print (countingpattern(alist, '00')) #12
print (countingpattern(alist, '101')) #4
print (countingpattern(alist, '1100')) #1
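
This is probably also why str.count() did not give the number you expected: it only counts non-overlapping matches, whereas the sliding comparison above checks every start position. A small sketch of the difference (count_overlapping is a name made up for this example):

```python
def count_overlapping(text, pattern):
    """Count occurrences of pattern in text, allowing matches to overlap."""
    width = len(pattern)
    # Compare the window starting at every position that can still fit the pattern.
    return sum(text[i:i + width] == pattern for i in range(len(text) - width + 1))

print("0000".count("00"))               # 2: non-overlapping matches only
print(count_overlapping("0000", "00"))  # 3: matches start at positions 0, 1 and 2
```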

To see the count for each combination separately, we can put the results in a dictionary keyed by the combination.

def countingpatterndict(alist, string):
    adict = {}
    for item in alist:
        count = 0
        for i in range(len(item)):
            if item[i:i+len(string)] == string:
                count += 1
        adict[item] = count
    return adict

print (countingpatterndict(alist, '00')) 
#'0000': 3, '0001': 2, '0011': 1, '0010': 1, ...
print (countingpatterndict(alist, '101'))
#'1110': 0, '1010': 1, '1011': 1, ...
print (countingpatterndict(alist, '111'))
#'1101': 0, '1111': 2, '1110': 1, ...
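
If you want one number per combination covering all of your patterns at once (as in your "0000" gives 3 example), something along these lines might work. It is only a sketch: count_patterns_per_row is a hypothetical helper, and instead of slicing it uses a regex lookahead, which is another common way to get overlapping matches.

```python
import re

def count_patterns_per_row(rows, patterns):
    """Map each row to the total number of overlapping matches of any pattern."""
    # A zero-width lookahead matches at every start position without
    # consuming characters, so overlapping occurrences are all reported.
    regexes = [re.compile("(?=" + re.escape(p) + ")") for p in patterns]
    return {row: sum(len(rx.findall(row)) for rx in regexes) for row in rows}

rows = ["{:04b}".format(i) for i in range(16)]
totals = count_patterns_per_row(rows, ["00", "101", "1100"])
print(totals["0000"])  # 3
print(totals["1100"])  # 2: one "00" plus one "1100"
```

Here rows is built in plain counting order for brevity; the list returned by graylist(4) should work the same way.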

Further edit:

def graylist(n):
    outlist = []
    for i in range(0, 1<<n):
        gray=i^(i>>1)
        outlist.append('{:0{}b}'.format(gray,n))
    return outlist

def countingpatterndict(alist, string):
    adict = {}
    for item in alist:
        count = 0
        for i in range(len(item)):
            if item[i:i+len(string)] == string:
                count += 1
        adict[item] = count
    return adict

import time
import pandas as pd

alist = graylist(20)
z1 = time.perf_counter()  # time.clock() was removed in Python 3.8
df = pd.DataFrame.from_dict(countingpatterndict(alist, '101'), orient='index')
z2 = time.perf_counter() - z1
print (z2) #5.716345938402242 seconds
print (df)
df.to_csv('result.csv')

Upvotes: 3
