AWK - count the number of syllables in a word

Question

I need to know whether words are monosyllabic or polysyllabic. My plan is to count the number of blocks of consecutive vowels in each word.

I tried this regex, but it does not work well for all words:

number_of_vowels=match($1,"[aouöüeiáóúőűéí]?[aouöüeiáóúőűéí]");

Input

könyvtaár
könyvter
hozzászóles
mű
cikk
ős

Desired output

könyvtaár    2    polysyllabic
könyvter    2    polysyllabic
hozzászóles    4    polysyllabic
mű    1    monosyllabic
cikk    1    monosyllabic
ős    1    monosyllabic

Now I'm using this regex:

a=match($1,"[aouöüeiáóúőűéí]+");

But for the word "hozzászóles" it gives me 2, not 4.

For more information, these are the consonants: "b c cs d dz dzs f g gy h j k l ly m n ny p q r s sz t ty v w x y z zs"

Ed Morton · Accepted Answer

If you want to use an awk function to count occurrences of a regexp (e.g. if it's part of a larger script) then you need to use split() or gsub(), not match():

$ awk '{a=split($0,t,/[aouöüeiáóúőűéí]+/); print $0, a-1, (a>2?"poly":"mono")"syllabic"}' file
könyvtaár 2 polysyllabic
könyvter 2 polysyllabic
hozzászóles 4 polysyllabic
mű 1 monosyllabic
cikk 1 monosyllabic
ős 1 monosyllabic
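
As a rough illustration of why match() reported 2 for "hozzászóles": match() returns the position where the first match starts, not a count, while split() returns the number of pieces the string was cut into, which is one more than the number of separators (hence the a-1 above). The sketch below assumes a POSIX shell; the function name compare_match_split is made up for this example.

```shell
# Hypothetical helper (name chosen for illustration only): for each input
# line, print what match() and split() return for the vowel-block regexp.
compare_match_split() {
    awk '{
        # match() gives the start position of the FIRST vowel block only
        pos = match($0, /[aouöüeiáóúőűéí]+/)
        # split() gives the number of pieces left between the vowel blocks
        n = split($0, parts, /[aouöüeiáóúőűéí]+/)
        print "match:", pos, "split:", n, "blocks:", n - 1
    }'
}

echo 'hozzászóles' | compare_match_split
```

For "hozzászóles" this should print "match: 2 split: 5 blocks: 4": the first vowel sits at position 2, and the four vowel blocks cut the word into the five pieces h, zz, sz, l and s.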

$ awk '{t=$0; a=gsub(/[aouöüeiáóúőűéí]+/,"",t); print $0, a, (a>1?"poly":"mono")"syllabic"}' file
könyvtaár 2 polysyllabic
könyvter 2 polysyllabic
hozzászóles 4 polysyllabic
mű 1 monosyllabic
cikk 1 monosyllabic
ős 1 monosyllabic

If you don't need a function to do it, then just use @anubhava's approach.
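
For the larger-script case mentioned above, one possible way to package the gsub() variant is a user-defined awk function. This is only a sketch: the names count_syllables and vowel_blocks are invented here, and it relies on awk passing scalar arguments by value, so gsub() changes only the function's local copy of the word.

```shell
# Hypothetical wrapper (names are illustrative, not from the answer above).
count_syllables() {
    awk '
    # Returns the number of vowel blocks in word. The argument is a scalar,
    # so it is passed by value and the caller still sees the original text.
    function vowel_blocks(word) {
        return gsub(/[aouöüeiáóúőűéí]+/, "", word)
    }
    {
        n = vowel_blocks($1)
        print $1, n, (n > 1 ? "poly" : "mono") "syllabic"
    }'
}

printf '%s\n' 'könyvtaár' 'hozzászóles' 'mű' 'cikk' | count_syllables
```

Note that, as in the one-liners above, a word containing no vowels at all would get a count of 0 and still be labelled monosyllabic.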
