Y4RD13

Reputation: 984

How do I generate every sequence of consecutive words from a string?

I want to generate a particular kind of word sequence from a string. I tried permutations and combinations from itertools, but I couldn't figure out how to do it. Each sequence always continues with the next word in the string. It is difficult to explain, but easier to understand from an example of the expected output.

OUTPUT EXPECTATION

Original string: 
“people are good good people are great great people are awesome people are good good people are awesome”

Sequence of Words to Check: 
1 word sequence:
“people”, “are”, “good”, “great”, “awesome”.

2 word sequence:
“people are”, “are good”, “good good”, “good people”, “are great”, “great great”, “great people” ...

3 word sequence:
“people are good”, “are good good”, “good good people”, “good people are” ...

4 word sequence ... until 20 word sequence.

The code I wrote is simple. It gets the one-word sequences right, but not the longer ones.

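from itertools import permutations

string = "people are good good people are great great people are awesome people are good good people are awesome"

def sequences(lst):
    for count_seq in range(1, 21):
        # The original if/else branches were identical, so a single loop does the same work.
        # dict.fromkeys drops duplicate permutations while keeping their order.
        for i in dict.fromkeys(permutations(iterable=lst, r=count_seq)):
            x = ' '.join(i)
            print(x)

lst = string.split(' ')
sequences(lst=lst)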

Upvotes: 0

Views: 43

Answers (1)

Jonathan Guymont

Reputation: 497

Try this:

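x = "people are good good people are great great people are awesome people are good good people are awesome"

words = x.split()

# Stop one word before the end so every slice holds exactly two words.
two_gram = [' '.join(words[i:i+2]) for i in range(len(words) - 1)]
print(two_gram)

# Stop two words before the end so every slice holds exactly three words.
three_gram = [' '.join(words[i:i+3]) for i in range(len(words) - 2)]
print(three_gram)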
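
The same slicing idea should extend to every length from 1 to 20. Here is a minimal sketch, assuming each repeated sequence should be listed only once, in order of first appearance, as the expected output in the question suggests. The name `word_ngrams` is a hypothetical helper, not a library function.

```python
def word_ngrams(text, n):
    """Return the distinct runs of n consecutive words, in order of first appearance."""
    words = text.split()
    # Slide a window of n words over the list; the range stops early enough
    # that every window is full-length (it is empty when n > len(words)).
    grams = (' '.join(words[i:i + n]) for i in range(len(words) - n + 1))
    # dict.fromkeys drops repeats while keeping insertion order (Python 3.7+).
    return list(dict.fromkeys(grams))


x = "people are good good people are great great people are awesome people are good good people are awesome"

for n in range(1, 21):
    print(n, word_ngrams(x, n))
```

The sample string has 18 words, so lengths 19 and 20 give empty lists. If every occurrence is wanted rather than each distinct sequence, returning `list(grams)` instead should do it.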

Upvotes: 1
