2964349

Reputation: 165

Removing the stemming words using Haskell

I am new to Haskell and functional programming.

My aim is to remove the stemming suffixes from the words of a given string.

E.g. the input is:          "he is fishing and catched two fish"
     the output should be:  "he is fish and catch two fish"

I tried to do this with the following code. It removes only the "ed" and does not remove the "ing".

removeStemming :: String -> String
removeStemming xs
  | "ing" `isSuffixOf` xs = take (length xs - 3) xs
  | "ed"  `isSuffixOf` xs = take (length xs - 2) xs
  | otherwise             = xs

Can anyone help me fix this error, please?

Upvotes: 0

Views: 144

Answers (3)

wit

Reputation: 1622

If we wish to remove every "ing" and "ed" in the string, we can change Zeta's answer like this:

removeStemming :: String -> String
removeStemming []                    = []
removeStemming ('i':'n':'g':' ':xs)  = ' ' : removeStemming xs  -- keep the space between words
removeStemming ('e':'d':' '    :xs)  = ' ' : removeStemming xs
removeStemming "ing"                 = []  -- suffix at the very end of the string
removeStemming "ed"                  = []
removeStemming (x:xs)                = x : removeStemming xs

I used separate cases for removeStemming "ing" and removeStemming ('i':'n':'g':' ':xs) so that only an "ing" at the end of a word is matched: "ringing" is reduced to "ring", not to "r".

Upvotes: 1

Zeta

Reputation: 105905

I already said in the original answer that it gets easier if you're tackling one word at a time:

Tackle one word at a time, this makes things much easier:

removeStemming :: String -> String
removeStemming []        = []
removeStemming (x:"ing") = [x]
removeStemming (x:"ed")  = [x] -- new, since "ed" wasn't part of the last question
removeStemming (x:xs)    = x : removeStemming xs

If you have a look at the definition of removeStemming, you'll notice that it removes only a suffix at the very end of its input. Therefore, removeStemming is meant for a single word.
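To make that concrete, here is a small sketch that repeats the definition above and evaluates it on one word and on two words. The names singleWord and twoWords are only illustrative and are not part of the original answer.

```haskell
-- The single-word definition from above, repeated so the sketch is self-contained.
removeStemming :: String -> String
removeStemming []        = []
removeStemming (x:"ing") = [x]
removeStemming (x:"ed")  = [x]
removeStemming (x:xs)    = x : removeStemming xs

-- Illustrative name: one word, so the suffix sits at the end of the input.
singleWord :: String
singleWord = removeStemming "fishing"          -- "fish"

-- Illustrative name: two words, so only the suffix of the last word is dropped.
twoWords :: String
twoWords = removeStemming "fishing catched"    -- "fishing catch"
```

The patterns (x:"ing") and (x:"ed") only match when nothing follows the suffix, which is why the "ing" in the middle of the second input survives.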

If you want to apply it to many words, you need to apply it to every single word:

removeAllStemmings :: String -> String
removeAllStemmings = unwords . map removeStemming . words

After this you can use removeAllStemmings "he is fishing and catched two fish".

Upvotes: 2

bheklilr

Reputation: 54068

The problem is that you are applying removeStemming to the entire string, when you want to apply it to each word. You can do

> unwords $ map removeStemming $ words "he is fishing and catched two fish"
"he is fish and catch two fish"

The words function splits the string on whitespace and returns a list of all the words, and unwords performs the opposite action, joining the words with single spaces (note: in general unwords . words is not equivalent to id, because the original whitespace is not preserved). You can map your removeStemming function as it is over the output of words text, then join the results back together with unwords.
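As a minimal sketch of this, the question's function can be kept unchanged and wrapped as follows. The names removeStemmingPerWord and roundTrip are hypothetical and chosen only for this example; the isSuffixOf import is assumed to be what the question's code relies on.

```haskell
import Data.List (isSuffixOf)

-- The single-word function from the question, unchanged.
removeStemming :: String -> String
removeStemming xs
  | "ing" `isSuffixOf` xs = take (length xs - 3) xs
  | "ed"  `isSuffixOf` xs = take (length xs - 2) xs
  | otherwise             = xs

-- Hypothetical helper: split on whitespace, process each word, join again.
removeStemmingPerWord :: String -> String
removeStemmingPerWord = unwords . map removeStemming . words

-- words discards the exact whitespace, so the round trip normalises it.
roundTrip :: String
roundTrip = unwords (words "  he   is\tfishing ")   -- "he is fishing"
```

The roundTrip value illustrates the note about unwords . words: leading, trailing, and repeated whitespace collapse to single spaces, so the output is not always identical to the input even when no suffix is removed.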

Upvotes: 3
