Reputation: 984
I want to generate a particular kind of word sequence from a string. I tried to use permutations and combinations from itertools, but I couldn't figure out how to do it. Each sequence is made of consecutive words, always continuing with the next word in the string. It is difficult to explain, but easier to understand by looking at an example of the expected output.
OUTPUT EXPECTATION
Original string:
“people are good good people are great great people are awesome people are good good people are awesome”
Sequence of Words to Check:
1 word sequence:
“people”, “are”, “good”, “great”, “awesome”.
2 word sequence:
“people are”, “are good”, “good good”, “good people”, “are great”, “great great”, “great people” ...
3 word sequence:
“people are good”, “are good good”, “good good people”, “good people are” ...
4 word sequence ... until 20 word sequence.
The code I wrote is simple. It produces the 1 word sequences correctly, but not the rest.
from itertools import permutations

def sequences(lst):
    for count_seq in range(1, 21):
        if count_seq == 1:
            for i in dict.fromkeys(permutations(iterable=lst, r=count_seq)):
                x = ' '.join(list(i))
                print(x)
        else:
            for i in dict.fromkeys(permutations(iterable=lst, r=count_seq)):
                x = ' '.join(list(i))
                print(x)

lst = string.split(' ')
sequences(lst=lst)
Upvotes: 0
Views: 43
Reputation: 497
Try this:
x = "people are good good people are great great people are awesome people are good good people are awesome"
words = x.split()

# Slide a window of 2 words over the list; stop early so the last window is complete
two_gram = [' '.join(words[i:i+2]) for i in range(len(words) - 1)]
print(two_gram)

# Same idea with a window of 3 words
three_gram = [' '.join(words[i:i+3]) for i in range(len(words) - 2)]
print(three_gram)
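If you need every length from 1 to 20 and only want each sequence once, the same sliding-window idea can probably be wrapped in a small helper. This is just a sketch: the name `unique_ngrams` is made up for illustration, and I am assuming you want duplicates dropped while keeping the order of first appearance (which is what `dict.fromkeys` gives you, as in your own code).

```python
def unique_ngrams(words, n):
    """Return the distinct runs of n consecutive words, in first-seen order.

    `unique_ngrams` is a hypothetical helper name, not a library function.
    """
    # range(len(words) - n + 1) yields only the start positions of full windows;
    # it is empty when n is larger than the number of words.
    grams = (' '.join(words[i:i + n]) for i in range(len(words) - n + 1))
    # dict.fromkeys drops repeats and preserves insertion order
    return list(dict.fromkeys(grams))


text = ("people are good good people are great great people are awesome "
        "people are good good people are awesome")
words = text.split()

for n in range(1, 21):
    grams = unique_ngrams(words, n)
    if not grams:
        # The string has fewer than n words, so no longer sequences exist
        break
    print(f"{n} word sequence:", grams)
```

Unlike `permutations`, this never reorders the words, so it only produces sequences that actually occur in the string, and the work stays small even for long windows.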
Upvotes: 1