MV_81
MV_81

Reputation: 111

trying to build a lexical analysis program in Haskell

I'm starting this project and made very little progress. I'm just starting with haskell so I would like to use only functions implemented by me so that I put my brain into action.

I want to take two lists of strings, one that represents code and the other has reserved names for a given language. I would like to be able to filter said words from the first list.


resNames = ["function","while","for","const","let","this","true","false",";","=", "()", "var"]

findVal :: [String]->String->[String]
findVal z " " = z
findVal [] value = []
findVal (x:y) value = if x == value then findVal y value
                      else x:(findVal y value) 

filterResNames :: [String]->[String]->[String]
filterResNames z [] = z
filterResNames [] z = []
filterResNames (x:y) (u:v) = if x== u then findVal (x:y) (u) else x:(filterResNames y (u:v))```

This obviously doesn't work because the program stops as soon as it finds a match...

Upvotes: 2

Views: 135

Answers (2)

Yuri Kovalenko
Yuri Kovalenko

Reputation: 1365

It's better to learn the standard library by implementing it by hands, so you:

  1. Get your hands dirty while implementing it, and get some practice
  2. Learn all those useful tools from standard library, which you could use out-of-the-box when got familiar enough with them

So, you need to "filter" some list on some condition, and that's the famous filter standard function:

filter :: (a -> Bool) -> [a] -> [a]
filter _ [] = []
filter cond (x : xs) = let tail = filter cond xs
                       in if cond x
                          then x:tail
                          else tail

Here we leave only such x's, that have cond x equal True. For example, filter even [1..10] == [2,4,6,8,10].

We'll use it to filter out only reserved words. Now we need to construct the condition. So we need to check whether a word in in the reserved list. And that's standard elem function:

elem :: (Eq a) => a -> [a] -> Bool
elem _ [] = False
elem x (y:ys) = x == y || elem x ys

Here we need such called type constraint that says "you can use this only with types that support == check". String is the case, so you can use it like this:

elem "if" ["for", "if", "else", "function"] == True

And now we are ready to get the result with our tools:

filterResNames :: [String] -> [String] -> [String]
filterResNames toCheck resNames = filter (`elem` resNames) toCheck

Here we use some tricks:

  1. Every function of 2 arguments may be used as an operator like + or == if you surround it's name with ` (backticks) symbols: elem x lst is equivalent to x `elem` lst

  2. If you take the expression of form x someOperator y, enclose it in the parentheses (x someOperator y) and omit one of the operands (someOperator y), then you'll automatically get a function, that takes this "omitted" operand and "pastes" it into the expression. Some examples:

    • (+1) - take a value and increase it by 1
    • (== x) - take a value and check if it's equal to x
    • (2 ^) - given some n get the nth power of 2
    • (`elem` someList) - take a value and check if it's contained in someList

Upvotes: 1

Mau
Mau

Reputation: 14468

If i interpret your idea correctly, this could be a possible solution:

resNames = ["function","while","for","const","let","this","true","false",";","=", "()", "var"]

-- Determine whether an element is in a list
findVal :: String->[String]->Bool
findVal _ [] = False
findVal x (namesH:namesT) =
  if x == namesH
  then True
  else (findVal x namesT) 

-- Loop over input list and for each element, if not in reserved list, add to accumulator
filterResNamesAcc :: [String]->[String]->[String]->[String]
filterResNamesAcc _ [] acc = acc
filterResNamesAcc [] _ acc = acc
filterResNamesAcc (inH:inT) names acc =
  if (findVal inH names)
  then (filterResNamesAcc inT names acc)
  else (filterResNamesAcc inT names (inH:acc))

-- Invoke filterResNamesAcc with empty accumulator
filterResNames :: [String]->[String]->[String]
filterResNames inL names = reverse (filterResNamesAcc inL names [])

main = do
   print (filterResNames ["hello", "for", "the", "world"] resNames)

What we use here is a typical accumulator pattern used in functional languages. The "problem" with filterResNamesAcc is that using : to prepend to the accumulator causes the result to be reversed, hence the call to reverse.

Left 'to the reader':

  1. Implementing your own reverse or figuring out a more complex folding pattern that doesn't need it.
  2. Changing the signature of filterResNames to take a single string and do the tokenization.

Upvotes: 1

Related Questions