Warren

Reputation: 803

Chunking English words into graphemes corresponding to distinct sounds

How can I convert an English input word into a sequence of graphemes? Is there a library or function that does this?

What I'm looking for is an algorithm/implementation that splits orthographic words into segments which map to phonemes. That is, the sequence of letters in a word should be broken in between distinct sounds.

To my mind, this would look something like the following:

physically --> ph-y-s-i-c-a-ll-y
psychology --> ps-y-ch-o-l-o-g-y
thrush --> th-r-u-sh
bought --> b-ough-t
chew --> ch-ew
palm --> p-al-m

Upvotes: 1

Views: 1639

Answers (1)

dmh

Reputation: 1059

A search for "split english words into graphemes" turns up, as its first result, a paper on mapping English orthography onto a phonemic representation using a machine learning approach. It appears to do the kind of thing you're looking for.
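
If a learned model is more than you need, a rough baseline might be a greedy longest-match split against a hand-written list of multi-letter graphemes. The sketch below is not the paper's method; the `MULTIGRAPHS` set and the `segment` function are hypothetical names made up for illustration, and the inventory is deliberately tiny.

```python
# Hypothetical, deliberately tiny inventory of multi-letter graphemes.
# A real inventory would be much larger and would still need context rules.
MULTIGRAPHS = {"ough", "ch", "sh", "th", "ph", "ps", "ll", "ew"}
MAX_LEN = max(len(g) for g in MULTIGRAPHS)


def segment(word):
    """Split a word into graphemes by greedy longest match, left to right."""
    word = word.lower()
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, down to two letters.
        for size in range(min(MAX_LEN, len(word) - i), 1, -1):
            chunk = word[i:i + size]
            if chunk in MULTIGRAPHS:
                pieces.append(chunk)
                i += size
                break
        else:
            # No multi-letter grapheme starts here: emit a single letter.
            pieces.append(word[i])
            i += 1
    return pieces


for w in ["physically", "psychology", "thrush", "bought", "chew"]:
    print(w, "-->", "-".join(segment(w)))
```

With this inventory the five words above come out as in the question. It cannot get both "palm" (p-al-m) and "physically" (a-ll) right, though: adding "al" to the set would turn "physically" into ph-y-s-i-c-al-l-y. Likewise "ps" is only a single grapheme word-initially. A context-free lookup has no way to resolve cases like these, which is presumably why the paper uses a learned model.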

Upvotes: 1
