jkeuhlen
jkeuhlen

Reputation: 4517

Pick the most correct FromJSON instance with overlapping definitions

I have a kind of unusual use case for supporting multiple versions of a record that is communicated via JSON and has a large number of Maybe values.

data VersionedThing = V1 Thing1 | V2 Thing2 

data Thing1 = Thing { 
  name :: Maybe String,
  val1 :: Maybe String,
  val2 :: Maybe String,
}

data Thing2 = Thing { 
  name :: Maybe String,
  val3 :: Maybe String,
  val4 :: Maybe String,
} 

instance FromJSON Thing1 where 
  parseJSON (Object v) = Thing <$> v.: "name" <*> v.:? "val1" <*> v .:? "val2"

instance FromJSON Thing2 where 
  parseJSON (Object v) = Thing <$> v.: "name" <*> v.:? "val3" <*> v .:? "val4"

instance FromJSON (VersionedThing) where
  parseJSON v = (V1 <$> parseJSON v)
        `mplus` (V2 <$> parseJSON v) 

My problem is that because these records share a name field with no other requirements, JSON that represents a specific version will always be able to be parsed as another version.

For example decoding the JSON

{
  "name":"Foo",
  "val3":"Bar",
  "val4":"Baz"
}

Could yield the haskell values:

Thing1 (Just "Foo") Nothing Nothing 

or

Thing2 (Just "Foo") (Just "Bar") (Just "Baz)

Is there a way to write my FromJSON instance of VersionedThing in such a way that it always parses whichever is the "most correct" value, rather than simply using the first one to succeed?

Upvotes: 1

Views: 181

Answers (1)

Daniel Wagner
Daniel Wagner

Reputation: 152867

Here's my plan: while parsing, we'll track which keys we've looked at. Parsers that don't consume all the keys of the object will fail. Here's your preamble, fleshed out to be complete and compilable:

{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE OverloadedStrings #-}

import Control.Applicative
import Control.Monad.Writer
import Data.Aeson
import Data.Aeson.Types (Parser)
import Data.Functor.Compose
import Data.HashMap.Lazy (HashMap)
import Data.Text (Text)
import qualified Data.HashMap.Lazy as HM

data VersionedThing = V1 Thing1 | V2 Thing2 deriving (Eq, Ord, Read, Show)

data Thing1 = Thing1
    { name :: Maybe String
    , val1 :: Maybe String
    , val2 :: Maybe String
    } deriving (Eq, Ord, Read, Show)

data Thing2 = Thing2
    { name :: Maybe String
    , val3 :: Maybe String
    , val4 :: Maybe String
    } deriving (Eq, Ord, Read, Show)

Now we'll add a type for parsing and tracking at the same time, together with embeddings for "just parse without tracking" and "just track without parsing".

type ParseAndTrack = Compose Parser (Writer (HashMap Text ()))

parse :: Parser a -> ParseAndTrack a
track :: Text -> ParseAndTrack ()

parse p = Compose (pure <$> p)
track t = Compose . pure . tell $ HM.singleton t ()

We can use these two primitives to lift (.:) and (.:?) to tracked versions of themselves. We'll use the suffix & for things that parse and track.

(.:&) :: FromJSON a => Object -> Text -> ParseAndTrack a
o .:& t = track t *> parse (o .: t)

(.:?&) :: FromJSON a => Object -> Text -> ParseAndTrack (Maybe a)
o .:?& t = (Just <$> (o .:& t)) <|> pure Nothing

Finally, we'll give a top-level way to drop back down from "parse-and-track" mode to "parse-only" mode, failing if we haven't consumed all the keys available.

consumeAllOf :: Object -> ParseAndTrack a -> Parser a
consumeAllOf o p = do
    (result, keys) <- runWriter <$> getCompose p
    let unusedKeys = HM.difference o keys
    unless (null unusedKeys) . fail $
        "unrecognized keys " ++ show (HM.keys unusedKeys)
    return result

Now we can write your two parsers with the above additional tools, and things should pretty much just work. The only difference in the parsers for Thing1 and Thing2 is that we throw a consumeAllOf on front and use the tracking versions of .: and .:? in the middle.

instance FromJSON Thing1 where
    parseJSON (Object v) = consumeAllOf v $ Thing1 <$> v.:& "name" <*> v.:?& "val1" <*> v .:?& "val2"

instance FromJSON Thing2 where
    parseJSON (Object v) = consumeAllOf v $ Thing2 <$> v.:& "name" <*> v.:?& "val3" <*> v .:?& "val4"

instance FromJSON (VersionedThing) where
    parseJSON v = (V1 <$> parseJSON v)
          `mplus` (V2 <$> parseJSON v)

Try it out in ghci:

> decode "{\"name\": \"foo\", \"val1\": \"bar\"}" :: Maybe VersionedThing
Just (V1 (Thing1 {name = Just "foo", val1 = Just "bar", val2 = Nothing}))
> decode "{\"name\": \"foo\", \"val3\": \"bar\"}" :: Maybe VersionedThing
Just (V2 (Thing2 {name = Just "foo", val3 = Just "bar", val4 = Nothing}))

Upvotes: 1

Related Questions