iceman

Reputation: 2130

How to return a polymorphic type in Haskell based on the results of string parsing?

TL;DR:
How can I write a function which is polymorphic in its return type? I'm working on an exercise where the task is to write a function which is capable of analyzing a String and, depending on its contents, generate either a Vector [Int], Vector [Char] or Vector [String].

Longer version:
Here are a few examples of how the intended function would behave:

Error-checking/exception handling is not a concern for this particular exercise. At this stage, all testing is done purely, i.e., this isn't in the realm of the IO monad.

What I have so far:

I have a function (and new datatype) which is capable of classifying a string. I also have functions (one for each Int, Char and String) which can convert the string into the necessary Vector.

My question: how can I combine these three conversion functions into a single function?

What I've tried:

My code:

A few comments

Code:

import qualified Prelude
import CorePrelude                   

import Data.Foldable (concat, elem, any)
import Control.Monad (mfilter)
import Text.Read (read)
import Data.Char (isAlpha, isSpace)

import Data.List.Split (split)
import Data.List.Split.Internals (Splitter(..), DelimPolicy(..), CondensePolicy(..), EndPolicy(..), Delimiter(..))

import Data.Vector (Vector)                 -- brings the Vector type name into scope unqualified
import qualified Data.Vector as V           

data VectorType = Number | Character | TextString deriving (Show)

mySplitter :: [Char] -> Splitter Char
mySplitter elts = Splitter { delimiter        = Delimiter [(`elem` elts)]
                           , delimPolicy      = Drop
                           , condensePolicy   = Condense
                           , initBlankPolicy  = DropBlank
                           , finalBlankPolicy = DropBlank }

mySplit :: [Char]-> [Char]-> [[Char]]
mySplit delims = split (mySplitter delims)           

classify :: String -> VectorType
classify xs
  | '\"' `elem` cs = TextString
  | hasAlpha cs = Character
  | otherwise = Number
  where
    cs = concat $ split (mySplitter "\n") xs
    hasAlpha = any isAlpha . mfilter (/=' ')

toRows :: [Char] -> [[Char]]
toRows = mySplit "\n"

toVectorChar ::    [Char] -> Vector [Char]
toVectorChar =   let toChar = concat . mySplit " \'" 
                 in V.fromList . fmap (toChar) . toRows

toVectorNumber  :: [Char] -> Vector [Int]
toVectorNumber = let toNumber = fmap (\x -> read x :: Int) . mySplit " "
                 in  V.fromList . fmap toNumber . toRows

toVectorString  :: [Char] -> Vector [[Char]]
toVectorString = let toString = mfilter (/= " ") . mySplit "\""
                 in  V.fromList . fmap toString . toRows

Upvotes: 9

Views: 948

Answers (2)

rampion

Reputation: 89043

Easy: use a sum type!

data ParsedVector = NumberVector (Vector [Int]) | CharacterVector (Vector [Char]) | TextStringVector (Vector [String]) deriving (Show)

parse :: [Char] -> ParsedVector
parse cs = case classify cs of
  Number     -> NumberVector $ toVectorNumber cs
  Character  -> CharacterVector $ toVectorChar cs
  TextString -> TextStringVector $ toVectorString cs
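
To use the result, the caller pattern-matches on the constructor. Here is a minimal sketch of that, not part of the original answer: `describe` is a made-up consumer, and a list alias stands in for `Data.Vector` so the snippet needs nothing beyond base. Only `fmap`, `sum` and `length` are used, so the same code should work with the real `Vector` as well.

```haskell
-- Assumption: a list alias stands in for Data.Vector's Vector so that this
-- sketch needs only base. With the vector package available, replace the
-- alias with: import Data.Vector (Vector)
type Vector = []

data ParsedVector
  = NumberVector (Vector [Int])
  | CharacterVector (Vector [Char])
  | TextStringVector (Vector [String])
  deriving (Show)

-- Hypothetical consumer: one equation per constructor, so each branch
-- knows the exact element type it is working with.
describe :: ParsedVector -> String
describe (NumberVector rows) =
  "numbers, total " ++ show (sum (fmap sum rows))
describe (CharacterVector rows) =
  "characters, " ++ show (sum (fmap length rows)) ++ " in all"
describe (TextStringVector rows) =
  "strings, " ++ show (sum (fmap length rows)) ++ " in all"
```

If a constructor is added later, compiling with -Wall flags every function that forgot to handle it.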

Upvotes: 9

leftaroundabout

Reputation: 120711

You can't.

Covariant polymorphism is not supported in Haskell, and wouldn't be useful if it were.


That's basically all there is to answer. Now as to why this is so.

It's no good "returning a polymorphic value" as OO languages like to do, because the only reason to return any value at all is to use it in other functions. Now, in OO languages you don't have functions but methods that come with the object, so it's quite easy to "return different types": each will have its suitable methods built in, and they can vary per instance. (Whether that's a good idea is another question.)

But in Haskell, the functions come from elsewhere. They don't know about implementation changes for a particular instance, so the only way such functions can safely be defined is to know every possible implementation. But if your return type is really polymorphic, that's not possible, because polymorphism is an "open" concept (it allows new implementation varieties to be added any time later).
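
A related point, sketched here rather than taken from the argument above: when a Haskell function is polymorphic in its return type, it is the caller who picks the type, at compile time. The standard `read` shows this. The same string yields an `Int` or a `Double` depending only on the annotation at the use site, never on what the string contains. The names `asInt` and `asDouble` are made up for the illustration.

```haskell
-- read :: Read a => String -> a
-- The result type is fixed by the caller's type annotation at compile time.

asInt :: Int
asInt = read "42"

asDouble :: Double
asDouble = read "42"

-- A signature like  parse :: String -> Vector [a]  would therefore promise
-- to produce whichever element type the caller asks for, which is the
-- opposite of choosing the type from the contents of the string.
```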

Instead, Haskell has a very convenient and totally safe mechanism of describing a closed set of "instances" – you've actually used it yourself already! ADTs.

data PolyVector = NumbersVector (Vector [Int])
                | CharsVector (Vector [Char])
                | StringsVector (Vector [String])

That's the return type you want. The function won't be polymorphic as such, it'll simply return a more versatile type.
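
For illustration, here is a rough end-to-end sketch of a function returning that type. Everything in it beyond the `PolyVector` declaration is an assumption: the input formats (space-separated numbers, single-quoted characters, double-quoted strings, one row per line) are guessed, `parsePoly` and `quoted` are made-up names, the parsing is a simplified stand-in for the question's `classify` and `toVector...` helpers, and a list alias replaces `Data.Vector` to keep it dependency-free.

```haskell
import Data.Char (isAlpha)

-- Assumption: list alias standing in for Data.Vector's Vector.
type Vector = []

data PolyVector
  = NumbersVector (Vector [Int])
  | CharsVector (Vector [Char])
  | StringsVector (Vector [String])
  deriving (Show, Eq)

-- The constructor is chosen at run time from the contents of the string;
-- the return type itself is always PolyVector.
parsePoly :: String -> PolyVector
parsePoly s
  | '"' `elem` s  = StringsVector (map quoted rows)
  | any isAlpha s = CharsVector (map (filter (`notElem` " '")) rows)
  | otherwise     = NumbersVector (map (map read . words) rows)
  where
    rows = filter (not . null) (lines s)

-- Collect the text found between successive pairs of double quotes.
quoted :: String -> [String]
quoted str = case dropWhile (/= '"') str of
  []         -> []
  (_ : rest) ->
    let (tok, after) = break (== '"') rest
    in  tok : quoted (drop 1 after)
```

Under those assumptions, `parsePoly "1 2\n3 4"` evaluates to `NumbersVector [[1,2],[3,4]]`.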


If you insist it should be polymorphic

Now... actually, Haskell does have a way to sort-of deal with "polymorphic returns", much as in OO when you declare that you return a subclass of a specified class. Well, you can't "return a class" at all in Haskell, you can only return types. But those can be made to express "any instance of...". It's called existential quantification.

{-# LANGUAGE GADTs, FlexibleInstances #-}  -- FlexibleInstances is needed for the String instance

data PolyVector' where
  PolyVector :: YourVElemClass e => Vector [e] -> PolyVector'

class YourVElemClass e where
  ...?
instance YourVElemClass Int
instance YourVElemClass Char
instance YourVElemClass String

I don't know if that looks intriguing to you. Truth is, it's much more complicated and rather harder to use; you can't just use any of the possible results directly but can only make use of the elements through methods of YourVElemClass. GADTs can in some applications be extremely useful, but these usually involve classes with very deep mathematical motivation. YourVElemClass doesn't seem to have such a motivation, so you'll be much better off with a simple ADT alternative than with existential quantification.
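
To make that restriction concrete, here is a small sketch with the class filled in by an invented method. `VElem`, `render` and `renderAll` are hypothetical names chosen for the example, and a list alias again stands in for `Data.Vector`. Once the vector is wrapped, all a consumer can do with the elements is call `render` on them.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE FlexibleInstances #-}

-- Assumption: list alias standing in for Data.Vector's Vector.
type Vector = []

-- Hypothetical class; render is an invented method for this sketch.
class VElem e where
  render :: e -> String

instance VElem Int    where render = show
instance VElem Char   where render c = [c]
instance VElem String where render s = s

-- The element type e is hidden inside the constructor.
data PolyVector' where
  PolyVector :: VElem e => Vector [e] -> PolyVector'

-- Works: it only touches the elements through the class method.
renderAll :: PolyVector' -> Vector [String]
renderAll (PolyVector rows) = fmap (fmap render) rows

-- A function such as  \(PolyVector rows) -> rows  is rejected by the
-- compiler, because the hidden type e would escape its scope.
```

So the caller never gets the `Int`s, `Char`s or `String`s back, which is why the plain ADT is usually the more practical choice here.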

There's a famous rant against existentials by Luke Palmer (note he uses another syntax, existential-specific, which I consider obsolete, as GADTs are strictly more general).

Upvotes: 16
