python stemming words for local languages

Question

I'm having trouble stemming words in my local language with a rule-based algorithm. Can anyone familiar with Python help me?

In my language, some words are pluralized by repeating the first 2 or 3 characters (sounds).

For example

Diimaa (root word)  ==> Diddiimaa(plural word)
Adii (root word)   ==> Adadii(plural word)

So I want my program to strip "Did" from the first example and "Ad" from the second.

The following is my code; it does not return any result.

`def compput(mm):   
    vv=1
    for i in mm:
        if seevowel(i)==1:
            inxt=mm.index(i)+1
            if inxt 0:
                return stem
        elif ((i[0] == i[2] or i[0]== i[3]) and i[1] == i[4]):
            stem = i[3:]
            if compput(self) > 0:
                return stem
       else:
           return tkn
    print(stem)`

Misganu Fekadu · Accepted Answer

This answers my own question. The following rule-based code works correctly for the words I tested, which are assigned to jechoota:

jechoota = "diddiimaa adadii babaxxee babbareedaa gaggaarii guguddaa hahhamaa hahapphii"

token = jechoota.split()
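def stem(word):
    # First two letters repeated, e.g. adadii -> adii
    if len(word) > 3 and word[0] == word[2] and word[1] == word[3]:
        return word[2:]
    # First letter doubled before the repeat, e.g. diddiimaa -> diimaa
    if len(word) > 4 and word[0] == word[2] == word[3] and word[1] == word[4]:
        return word[3:]
    # No reduplicated prefix found: return the word unchanged
    return word
for i in token:
    print(stem(i))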
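
The same two rules can also be written as a single regular expression. This is only a sketch, assuming the repeated prefix always takes one of the two shapes handled above. The name strip_reduplication is hypothetical, and re.IGNORECASE is my addition so that capitalized input such as "Diddiimaa" also matches.

```python
import re

# (.)(.)   first two letters of the word
# \1??     optionally the first letter once more (lazy, so the
#          shorter 2-letter prefix is tried first, as in stem())
# (?=\1\2) what follows must start with the same two letters again
REDUP = re.compile(r'^(.)(.)\1??(?=\1\2)', re.IGNORECASE)

def strip_reduplication(word):
    """Drop a reduplicated 2- or 3-letter prefix, if there is one."""
    return REDUP.sub('', word, count=1)

words = "diddiimaa adadii babbareedaa guguddaa diimaa".split()
print([strip_reduplication(w) for w in words])
# ['diimaa', 'adii', 'bareedaa', 'guddaa', 'diimaa']
```

Words without a repeated prefix are returned unchanged, and a capitalized prefix is removed together with its capital letter ("Adadii" becomes "adii").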
