Reputation: 55
I'm currently trying to generate a list of words that rhyme with an input word according to the CMU Pronouncing Dictionary. I have managed to arrange all the words into a dictionary, with the words as keys and a list of strings representing each pronunciation as the values. However, since rhyming is based on the last vowel, I'm stuck on how to go about this for words that contain more than one vowel.
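def dotheyrhyme(filename, word):
    rhymes = {}
    with open(filename) as f:
        # skip the first 56 lines of the file
        text = f.readlines()[56:]
    for line in text:
        # split() with no argument splits on any run of whitespace
        # and drops the trailing newline
        splitline = line.split()
        if splitline:
            rhymes[splitline[0]] = splitline[1:]
    # the with block closes the file, so no f.close() is needed
    comparer = rhymes[word.upper()]
    return comparer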
I plan to use the comparer variable as a baseline, and I believe reversing it could also be a good way to go about it, but I'm lost (or overthinking) on how to check whether the last vowel and the sounds after it are the same, and append the word accordingly.
Example:
{'SECOND': ['S', 'EH1', 'K', 'AH0', 'N', 'D']}
would rhyme with
{'AND': ['AH0', 'N', 'D']}
but these two wouldn't rhyme:
{'YELLOW': ['Y', 'EH1', 'L', 'OW0']}
and
{'HELLO': ['HH', 'AH0', 'L', 'OW1']}
But I can't think of a method that copes with varying lengths and multiple vowels.
Thanks for your help!
Upvotes: 2
Views: 365
Reputation: 16660
You would have to start comparing from the end. There are special algorithms and data structures that can help in cases like yours, for example the Aho-Corasick algorithm.
But in the simple case, you would compare the sounds of the two words in reverse order and require the common ending to be at least some threshold long before calling the words a rhyme, e.g.:
def if_rhymes(word1, word2, threshold=2):
    # rhymes maps each word to its list of phonemes;
    # threshold=2 is only an illustrative default
    r1 = reversed(rhymes[word1])
    r2 = reversed(rhymes[word2])
    the_same = 0
    for sound1, sound2 in zip(r1, r2):
        if sound1 == sound2:
            the_same += 1
        else:
            break
    if the_same < threshold:
        return 'no rhyme'  # or False if you want
    else:
        return 'rhymes'  # or True
What the algorithm does: rhymes is the dictionary that you populated from the file (for clarity I recommend doing it outside the rhyme testing function). zip pairs up the sounds from the two reversed lists and stops at the end of the shorter one.
Upvotes: 0
Reputation: 403
Finding the last vowel requires you to have a set of vowels. After that you only have to iterate over the list backwards.
vowels = {...}  # some set of vowels
word = ['S', 'EH1', 'K', 'AH0', 'N', 'D']
for i in word[::-1]:
    if i in vowels:
        last_vowel = i
        break
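To tie this back to the question, here is a minimal sketch of one way to compare "the last vowel and everything after it" for two words. It assumes each pronunciation is a list of CMU-style phonemes and that vowels can be recognised by their trailing stress digit (0, 1 or 2) rather than by a hand-written vowel set. The helper names is_vowel, rhyming_tail and tails_match are made up for illustration, and the exact-match rule is only one possible definition of a rhyme.

```python
def is_vowel(phone):
    # Assumption: in CMU-style phonemes only vowels end in a stress digit,
    # e.g. 'EH1' or 'AH0'; consonants such as 'K' or 'HH' do not.
    return phone[-1].isdigit()


def rhyming_tail(phones):
    """Return the phonemes from the last vowel to the end of the word."""
    # Walk backwards so the first vowel found is the last one in the word.
    for i in range(len(phones) - 1, -1, -1):
        if is_vowel(phones[i]):
            return list(phones[i:])
    # No vowel found: fall back to the whole pronunciation.
    return list(phones)


def tails_match(phones1, phones2):
    # Two words count as rhyming here only if their tails are identical,
    # including the stress digit on the vowel.
    return rhyming_tail(phones1) == rhyming_tail(phones2)


second = ['S', 'EH1', 'K', 'AH0', 'N', 'D']
and_ = ['AH0', 'N', 'D']
yellow = ['Y', 'EH1', 'L', 'OW0']
hello = ['HH', 'AH0', 'L', 'OW1']

print(tails_match(second, and_))   # True: both end in AH0 N D
print(tails_match(yellow, hello))  # False: OW0 vs OW1
```

Because only the tail from the last vowel is compared, words of different lengths and words with several vowels are handled the same way. If the stress digit should not matter, strip it from the vowel before comparing.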
If you are open to other ideas, you can also look at this library, which finds the rhymes for you: https://pypi.org/project/pronouncing/
Upvotes: 1