Dinesh Kumar Yadav
Dinesh Kumar Yadav

Reputation: 25

algorithm- for counting pairwise mutual occurrences

I have a list of lists. L1= [[...][...][.....].....] If I take all the elements after flattening the list and extract unique values from it then i get a list L2. I have another list L3 which is some subset of L2.

I want to find the pair-wise mutual occurrences of the elements of L3 in L1. Relation is non-directed. i.e. a,b is same as b,a

eg- L1= [[a b c d][a b d g f][c d g][d g]....] L2=[a b c d g f] say L3 = [c d g]

I want to find pair wise mutual occurrences of L3 in L1. i.e these values. c,d:2 d,g:3 c,g:1

I'm getting O(n*n*m*p); where- p- no. of lists in L1, m - avg. no. of elements in each list of L1. n - no. of elements in L3.

Can I get an improved complexity?

Code for above in python is:

Here sig_tags is L3 and tags is L1.

x=[]
    for i in range(len(sig_tags)):
        for j in range(i+1,len(sig_tags)):
            count=0
            for k in tags:
                if (sig_tags[i] in k) and (sig_tags[j] in k):
                    count+=1
            if count>param:        
                x.append([sig_tags[i],sig_tags[j],count])
    return x

Upvotes: 0

Views: 58

Answers (1)

Sorin
Sorin

Reputation: 11968

Yes you can.

Give each element an id, then convert the list L1 into a list of bit vectors, where a bit is true if that list constains the corresponding letter. This is O(m*p), or O(M*p*log|Alphabet|) depending how you implement it.

Now to check if a pair belongs to a list you need to check if cerain 2 bits are true which is O(1). So all the checks are going to be O(n^2*p).

Overall the complexity is O(n^2*p + m*p).

You can skip assiging ids if you use a hash function. Be careful, sometimes the hash function computation is expensive.

Upvotes: 1

Related Questions