Reputation: 35
Hello I'm creating a word rhymer for complete words but I'm stuck on the logic.
I imported the cmu dictionary text into two columns, word and arpabet. And this is what I have so far:
Code:
Select word, arpabet from cmudict
where word LIKE "%dove" and substring_index(arpabet, ' ', -2) in
(Select substring_index(arpabet, ' ', -2) from cmudict);
Output (1 row returned):
word arpabet
'DOVE', 'D AH1 V'
I'm looking for words whose arpabet matches the 'AH1 V' from within the last 2 spaced delimiters so it would give me a list of words that would sound like 'DOVE' while ignoring the stress number (ah1, ah2):
Output I'm looking for:
I've tried changing the LIKE '%ove' but that produces a list of correct and incorrect matches with a similar suffix from the word column since it ignores the corresponding arpabet.
Any help would be great.
Upvotes: 2
Views: 89
Reputation: 6161
If you're trying to get words that match the sound, then I believe you should match with the column arpabet
and not with word
.
SELECT word, arpabet FROM cmudict
WHERE REPLACE(REPLACE(REPLACE(substring_index(arpabet, ' ', -2),'0',''),'1',''),'2','') in
(SELECT REPLACE(REPLACE(REPLACE(substring_index(arpabet, ' ', -2),'0',''),'1',''),'2','')
FROM cmudict WHERE word = "dove");
Breaking it down:
REPLACE(REPLACE(REPLACE(
will eliminate the 0,1,2 from any of the sounds. Since you don't want to match with the numbers (stress). Looks kinda ugly, but it gets it done.substring_index(arpabet, ' ', -2)
stays the same, but now it's inside the REPLACEs.Upvotes: 1