Gravity M

Reputation: 1517

How to find recurring string groups (order does not matter) in a dictionary of lists?

A fixed set of strings is repeated across the lists in the dictionary: Ruth, James, Sandy, Daniel, Sarah, Tommy, Alex, Rob, Teddy, Steve, Mark. Here n = 11.

My dictionary looks like the following (the inner key is the rank of the group):

DictionaryLists = {
                   1: {1 : ['Teddy', 'Daniel', 'Alex']},
                   2: {2 : ['Rob', 'Steve', 'Mark', 'Sandy']},
                   3: {5 : ['Ruth', 'Sarah', 'James']},
                   4: {1 : ['Teddy', 'Alex', 'Steve', 'Sandy', 'Daniel']},
                   5: {2 : ['Mark', 'Sarah', 'Rob']},
                   6: {1 : ['Teddy', 'Daniel', 'Alex']},
                   7: {2 : ['Mark', 'Sandy']}
                  }

Task 1: I want to find the frequency of a combination of string items (order does not matter). Say I pick the combination Teddy, Daniel, Alex. First, I want to see how often this group appears when rank = 1. Then I want to see which strings are in rank = 2 WHEN Teddy, Daniel, Alex are in rank = 1. For example, Rob is in rank 2 when those three names are in rank 1. The Teddy, Alex, Daniel group appears 3 times in rank 1, and Rob appears 2 times in rank 2. So the result should be:

Teddy,Daniel,Alex,3,Rob,2

It means Teddy, Daniel, Alex appear together three times in rank = 1, and Rob appears 2 times in rank = 2 when those three names are in rank = 1.

My try: I created a list with all the lists in the dictionary. TotalLists = [OneGroupRank1, OneGroupRank2, OneGroupRank3, TwoGroupRank1, TwoGroupRank2, ThreeGroupRank1, ThreeGroupRank2].

n = 3
# Take the first n strings from the first list as the group to look for
TotalListStrings = tuple(TotalLists[0][0:n])
NewDict = {TotalListStrings: 1}
# Count every other list that contains all n strings (order does not matter)
for IndividualList in TotalLists[1:]:
    if all(item in IndividualList for item in TotalListStrings):
        NewDict[TotalListStrings] += 1
print(NewDict)

This basically takes three strings from the 0th list and finds patterns. This produced the result of {(String1, String2, String3):9} --> 9 is the number of occurrences.

My questions are:

(1) How can I check patterns in the dictionary when there are conditions to check (checking key values and then proceeding) for each group?

(2) How can I create all combinations of three names from the full set (n=11) and check for patterns? Instead of relying on the lists, I simply want to generate the combinations of three names myself. For example, one would be Teddy, Daniel, Alex; another would be Teddy, Rob, Steve; etc.

(3) How can Task 1 mentioned above be accomplished?

Upvotes: 1

Views: 111

Answers (1)

jonrsharpe

Reputation: 122032

Assuming (what I think is) a more sensible input format:

data = {'A': {1: ['Teddy', 'Daniel', 'Alex'],
              2: ['Rob', 'Steve', 'Mark', 'Sandy'],
              3: ['Ruth', 'Sarah', 'James']},
        'B': {1: ['Teddy', 'Alex', 'Steve', 'Sandy', 'Daniel'],
              2: ['Mark', 'Sarah', 'Rob']},
        'C': {1: ['Teddy', 'Daniel', 'Alex'], 
              2: ['Mark', 'Sandy']}}
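
If your data is already in the format from the question, something like the following could reshape it. This is only a sketch: it assumes that a new group starts every time a rank of 1 appears, and the 'A', 'B', 'C' labels are arbitrary. Note that it keeps the ranks exactly as given, so the third list of the first group stays at rank 5 rather than 3.

```python
from string import ascii_uppercase

DictionaryLists = {
    1: {1: ['Teddy', 'Daniel', 'Alex']},
    2: {2: ['Rob', 'Steve', 'Mark', 'Sandy']},
    3: {5: ['Ruth', 'Sarah', 'James']},
    4: {1: ['Teddy', 'Alex', 'Steve', 'Sandy', 'Daniel']},
    5: {2: ['Mark', 'Sarah', 'Rob']},
    6: {1: ['Teddy', 'Daniel', 'Alex']},
    7: {2: ['Mark', 'Sandy']},
}

data = {}
labels = iter(ascii_uppercase)  # arbitrary group labels: 'A', 'B', 'C', ...
current = None
for key in sorted(DictionaryLists):
    for rank, members in DictionaryLists[key].items():
        # Assumption: a rank of 1 marks the start of a new group
        if rank == 1 or current is None:
            current = next(labels)
            data[current] = {}
        data[current][rank] = members
```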

You could then count the occurrences of a given group like this:

from collections import Counter

output = {1: 0, 2: Counter()}

names = frozenset(['Teddy', 'Daniel', 'Alex'])

for dataset in data.values():
    if names.issubset(dataset[1]):
        output[1] += 1
        output[2].update(dataset[2])

Which would give the output:

>>> output
{1: 3, 
 2: Counter({'Mark': 3, 
             'Rob': 2, 
             'Sandy': 2, 
             'Sarah': 1, 
             'Steve': 1})}

Because names is a frozenset, it is hashable and can also be used as a dictionary key, so you can run through all combinations and build the eventual output:

{names: {1: "count at rank 1", 2: "Counter for rank 2"}}
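
As a rough sketch of that loop, assuming the data format above and the eleven names from the question, itertools.combinations can generate every group of three. The names all_names and results are just illustrative, and I've used .get so a dataset that lacks a rank is skipped rather than raising a KeyError. Groups that never appear together at rank 1 are left out of the result.

```python
from collections import Counter
from itertools import combinations

data = {'A': {1: ['Teddy', 'Daniel', 'Alex'],
              2: ['Rob', 'Steve', 'Mark', 'Sandy'],
              3: ['Ruth', 'Sarah', 'James']},
        'B': {1: ['Teddy', 'Alex', 'Steve', 'Sandy', 'Daniel'],
              2: ['Mark', 'Sarah', 'Rob']},
        'C': {1: ['Teddy', 'Daniel', 'Alex'],
              2: ['Mark', 'Sandy']}}

all_names = ['Ruth', 'James', 'Sandy', 'Daniel', 'Sarah', 'Tommy',
             'Alex', 'Rob', 'Teddy', 'Steve', 'Mark']

results = {}
for combo in combinations(all_names, 3):
    names = frozenset(combo)  # order does not matter, and it is hashable
    output = {1: 0, 2: Counter()}
    for dataset in data.values():
        if names.issubset(dataset.get(1, [])):
            output[1] += 1
            output[2].update(dataset.get(2, []))
    if output[1]:  # keep only groups seen at least once at rank 1
        results[names] = output
```

Looking up results[frozenset(['Teddy', 'Daniel', 'Alex'])] then gives the same count and Counter as the single-group version.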

You can also extend this to higher ranks, where available.
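
One way that extension might look, again only as a sketch: keep one Counter per rank above the rank being matched. The function name rank_counts and its base_rank parameter are made up for illustration.

```python
from collections import Counter, defaultdict

data = {'A': {1: ['Teddy', 'Daniel', 'Alex'],
              2: ['Rob', 'Steve', 'Mark', 'Sandy'],
              3: ['Ruth', 'Sarah', 'James']},
        'B': {1: ['Teddy', 'Alex', 'Steve', 'Sandy', 'Daniel'],
              2: ['Mark', 'Sarah', 'Rob']},
        'C': {1: ['Teddy', 'Daniel', 'Alex'],
              2: ['Mark', 'Sandy']}}


def rank_counts(data, names, base_rank=1):
    """Count datasets with `names` at base_rank, plus who appears at each higher rank."""
    names = frozenset(names)
    matches = 0
    by_rank = defaultdict(Counter)
    for dataset in data.values():
        if names.issubset(dataset.get(base_rank, [])):
            matches += 1
            for rank, members in dataset.items():
                if rank > base_rank:  # only ranks after the matched one
                    by_rank[rank].update(members)
    return matches, dict(by_rank)


matches, by_rank = rank_counts(data, ['Teddy', 'Daniel', 'Alex'])
```

Here by_rank is keyed by rank, so ranks that only exist in some datasets (rank 3 in 'A' above) are simply counted where they are available.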

Upvotes: 2
