Haskell Parameter Pattern Matching

Question

I am trying to create a compiler using Haskell as part of my university coursework.

I want to create a method that matches any string like this:

int a = 5
int foo = 3

So this is the method I created:

readInstruction :: String -> String
readInstruction ( 'i' : 'n' : 't' : ' ' : varName : ' ' : '=' : ' ' : val : []) = 
    "Declare Int " ++ [varName] ++ " = " ++ [val]

However this only works for variable names of 1 letter. How should I do this?

Also, as a side note, I also noticed the following does not compile:

   readInstruction ( "int " ++ varName ++ " = " ++ val ) = 
        "Declare Int " ++ varName ++ " = " ++ val

Why?

Please note that I'm new to Haskell and only know the basics. I don't know any other library functions and would prefer not to use them (as I have been discouraged from using them for my coursework).

bheklilr · Accepted Answer

When you're pattern matching, you can only pattern match on constructors. For lists, your two constructors are : and [], whereas ++ is a function on lists. The compiler can't work backwards from a function application, but it can from a constructor application (a very special kind of function that even lives in its own namespace in Haskell).
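To make that concrete for the original single-string version, here is a minimal sketch that sticks to : and [] patterns and uses no library functions beyond ++. The helper splitAtSpace and the "Unrecognised instruction" fallback are my own assumptions for illustration, not something from the question:

```haskell
-- Split a string at the first space: "foo = 3" becomes ("foo", "= 3").
-- If there is no space, the whole string ends up in the first component.
-- (Hypothetical helper, written by hand instead of using a library function.)
splitAtSpace :: String -> (String, String)
splitAtSpace []         = ([], [])
splitAtSpace (' ' : cs) = ([], cs)
splitAtSpace (c : cs)   = (c : word, rest)
  where (word, rest) = splitAtSpace cs

readInstruction :: String -> String
-- The fixed prefix "int " is matched with constructors only.
readInstruction ('i' : 'n' : 't' : ' ' : afterInt) =
    case splitAtSpace afterInt of
        -- What follows the variable name must start with "= ".
        (varName, '=' : ' ' : val) -> "Declare Int " ++ varName ++ " = " ++ val
        _                          -> "Unrecognised instruction"
-- Assumed fallback so the function is total.
readInstruction _ = "Unrecognised instruction"
```

The fixed prefix can be matched directly because its length is known. The variable name has an unknown length, so it has to be peeled off by a recursive function rather than by a single pattern.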

A much better alternative is to tokenize your input first. Matching on whole tokens instead of individual characters avoids errors from incomplete patterns, and it will be much easier to process in the long run. Since you want to write a compiler, you should use a tokenizer anyway, as this is pretty much the accepted way to write parsers. You could instead have

-- A very simple tokenizer, only splits on whitespace
-- so `int x=1` won't be tokenized correctly
tokenize :: String -> [String]
tokenize = words

readInstructions :: [String] -> (String, [String])
readInstructions ("int" : varName : "=" : val : rest) = ("Declare Int " ++ varName ++ " = " ++ val, rest)
readInstructions otherPatterns = undefined
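As the comment on the tokenizer says, splitting on whitespace alone mishandles input like `int x=1`. If you want to go one step further without reaching for a library, a hand-written tokenizer could look roughly like the sketch below. The names tokenizeEq, takeWord and isBlank are hypothetical, and treating only '=' as a special character is an assumption; a real language would need more cases:

```haskell
-- Hypothetical alternative to `tokenize = words` that also treats '='
-- as a token of its own, so "int x=1" is split the same way as "int x = 1".
tokenizeEq :: String -> [String]
tokenizeEq [] = []
tokenizeEq (c : cs)
    | isBlank c = tokenizeEq cs              -- skip whitespace between tokens
    | c == '='  = "=" : tokenizeEq cs        -- '=' is always its own token
    | otherwise = (c : word) : tokenizeEq rest
  where
    (word, rest) = takeWord cs

-- Collect characters up to (but not including) the next blank or '='.
takeWord :: String -> (String, String)
takeWord [] = ([], [])
takeWord (c : cs)
    | isBlank c || c == '=' = ([], c : cs)
    | otherwise             = (c : word, rest)
  where
    (word, rest) = takeWord cs

-- Only spaces, tabs and newlines count as whitespace in this sketch.
isBlank :: Char -> Bool
isBlank c = c == ' ' || c == '\n' || c == '\t'
```

The token list it produces has the same shape as the output of words, so it can be fed to readInstructions unchanged.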

The reason why I return (String, [String]) is so that you could iteratively apply readInstructions and have it only consume the number of tokens it needs for each command. So you could do

main = do
    program <- readFile "myProgram.prog"
    let tokens = tokenize program
        (firstInstr,   tokens') = readInstructions tokens
        (secondInstr, tokens'') = readInstructions tokens'
    putStrLn firstInstr
    putStrLn secondInstr
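Rather than naming each intermediate token list by hand, the same idea can be wrapped in a recursive function that keeps going until the tokens run out. This is only a sketch: readInstructionMaybe and readAll are names I made up, and returning Nothing (instead of hitting undefined) for unknown input is an assumption about how you want to handle errors:

```haskell
-- Variant of readInstructions that reports failure with Nothing
-- rather than crashing on input it does not recognise.
readInstructionMaybe :: [String] -> Maybe (String, [String])
readInstructionMaybe ("int" : varName : "=" : val : rest) =
    Just ("Declare Int " ++ varName ++ " = " ++ val, rest)
readInstructionMaybe _ = Nothing

-- Keep reading instructions until the tokens are used up,
-- stopping with a message at the first one that fails to parse.
readAll :: [String] -> [String]
readAll [] = []
readAll tokens =
    case readInstructionMaybe tokens of
        Just (instr, rest) -> instr : readAll rest
        Nothing            -> ["Unrecognised instruction"]
```

With this in place, main only needs something like mapM_ putStrLn (readAll tokens) to print every instruction in the file.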

If you think this looks laborious, you'd be correct. There are much better, and quite elegant, ways of handling this sort of thing in Haskell. Once you've completed your assignment, I would encourage you to look at the Parsec library and the State monad. Parsec does a lot of the work for you in terms of writing a tokenizer and turning those tokens into something meaningful, and the idea behind the State monad (threading the remaining input through each step, as the tuples above do by hand) is what the library is really built on. Having a good understanding of the State monad will help you as a Haskell programmer in general, as it is used for many different problems.
