carpemb

Reputation: 701

Parsing nested heterogeneous JSON arrays with Aeson

So, I've hit something of a roadblock with parsing the following JSON with the Haskell Aeson library.

So say I have the following:

"packetX_Name": [
  "container",
  [
    {
      "field1": "value1",
      "field2": "value2"
    },
    {
      "field1": "value3",
      "field2": "value4"
    },
    {
      "field1": "value5",
      "field2": "value6"
    }
  ]
],
"packetY_Name": [
  "container",
  [
    {
      "field1": "value7",
      "field2": "value8"
    },
    {
      "field1": "value9",
      "field2": "value10"
    }
  ]
],
etc...

And I would ideally like to parse this using data types like this:

data ExtractedPacket = ExtractedPacket
  { packetName   :: String
  , packetFields :: [ExtractedPacketField]
  } deriving (Show,Eq)

instance FromJSON ExtractedPacket where
  parseJSON = blah

data ExtractedPacketField = ExtractedPacketField
  { field1 :: String
  , field2 :: String
  } deriving (Show,Eq)

instance FromJSON ExtractedPacketField where
  parseJSON = blah

And get something like the following:

ExtractedPacket
  "packetX_Name"
  [ ExtractedPacketField "value1" "value2"
  , ExtractedPacketField "value3" "value4"
  , ExtractedPacketField "value5" "value6"
  ]

ExtractedPacket
  "packetY_Name"
  [ ExtractedPacketField "value7" "value8"
  , ExtractedPacketField "value9" "value10"
  ]

This JSON example describes network packets, and each packet has a different name (such as "packetX_Name") that can't be parsed the same way "field1" or "field2" can be, because it is different every time. Most of the Aeson examples out there are unhelpful in situations like this. I've noticed a function in the API docs called withArray that takes a String and a function of type (Array -> Parser a), but I'm at a loss as to what to pass for that function.

The part I'm really stuck on is parsing the heterogeneous array that starts with a String "container" and then has an array with all the objects in it. Thus far, I've been indexing straight to the array of objects, but the type system started to become a real labyrinth and I found it really hard to approach this in a way that isn't ugly and hackish. On top of this, Aeson doesn't produce very helpful error messages.

Any ideas on how to approach this?

Upvotes: 1

Views: 515

Answers (1)

hao

Reputation: 10238

In more complicated examples like these, it's good to keep in mind that underneath the Aeson Value type are simple data structures – Vector for arrays and HashMap for objects. A little more exotic than the lists and maps we're used to dealing with, but still data structures that have Foldable and Traversable instances. With that in mind, we can declare these instances:

{-# LANGUAGE OverloadedStrings #-}

import qualified Control.Lens as Lens
import qualified Data.Foldable as Foldable
import qualified Data.Text.Strict.Lens as Lens
import           Data.Aeson
import           Data.Aeson.Types

newtype ExtractedPackets =
  ExtractedPackets [ExtractedPacket] deriving (Show)

instance FromJSON ExtractedPackets where
  parseJSON (Object o) = do
    let subparsers =
          [ ExtractedPacket (Lens.view Lens.unpacked key) <$> parseJSON packets
          | (key, Array values) <- Lens.itoList o
          , packets@(Array _) <- Foldable.toList values]
    packets <- sequence subparsers
    return (ExtractedPackets packets)
  parseJSON invalid =
    typeMismatch "ExtractedPackets" invalid

instance FromJSON ExtractedPacketField where
  parseJSON (Object o) =
    ExtractedPacketField <$> o .: "field1" <*> o .: "field2"
  parseJSON invalid =
    typeMismatch "ExtractedPacketField" invalid

We have to newtype the list of packets because there's already a FromJSON instance for FromJSON a => FromJSON [a], and it doesn't do what we want: it is only equipped to deal with homogeneous lists.

Once we do that, we can get our hands on the hashmap inside the object and traverse its keys and values as tuples. Mapping over the tuples, we produce a [Parser ExtractedPacket], which we can sequence into a Parser [ExtractedPacket]. I'm using lens liberally here to do the boring stuff, like converting between packed and unpacked strings or breaking down the hashmap into key-and-value tuples. You can use the text and unordered-containers packages to achieve the same goals if you don't want to pull in lens.

It seems to work on the example provided:

λ> eitherDecode bytes :: Either String ExtractedPackets
Right (ExtractedPackets [ExtractedPacket {packetName = "packetX_Name",
packetFields = [ExtractedPacketField {field1 = "value1", field2 =
"value2"},ExtractedPacketField {field1 = "value3", field2 =
"value4"},ExtractedPacketField {field1 = "value5", field2 =
"value6"}]},ExtractedPacket {packetName = "packetY_Name", packetFields
= [ExtractedPacketField {field1 = "value7", field2 =
"value8"},ExtractedPacketField {field1 = "value9", field2 =
"value10"}]}])
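
For anyone who would rather not pull in lens, here is a rough sketch of the same idea using only text and unordered-containers. It is not a drop-in copy of the instances above. parsePacket is a helper name I made up, and it leans on Aeson's tuple instance to decode each two-element ["container", [...]] array, so it is stricter than the list comprehension (a value that is not exactly a tag followed by a list of field objects fails the parse instead of being skipped). It also assumes the tag can simply be ignored, and it sorts the packets by name because hash map iteration order is unspecified.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import           Data.Aeson
import           Data.Aeson.Types    (Parser)
import qualified Data.HashMap.Strict as HashMap
import           Data.List           (sortOn)
import           Data.Text           (Text)
import qualified Data.Text           as Text

data ExtractedPacketField = ExtractedPacketField
  { field1 :: String
  , field2 :: String
  } deriving (Show, Eq)

data ExtractedPacket = ExtractedPacket
  { packetName   :: String
  , packetFields :: [ExtractedPacketField]
  } deriving (Show, Eq)

newtype ExtractedPackets =
  ExtractedPackets [ExtractedPacket] deriving (Show, Eq)

instance FromJSON ExtractedPacketField where
  parseJSON = withObject "ExtractedPacketField" $ \o ->
    ExtractedPacketField <$> o .: "field1" <*> o .: "field2"

-- Hypothetical helper: decode one ["container", [ {...}, ... ]] value.
-- The FromJSON instance for pairs accepts exactly a two-element array,
-- which is what lets us give the two positions different types.
parsePacket :: Text -> Value -> Parser ExtractedPacket
parsePacket name value = do
  (_tag, fields) <- parseJSON value :: Parser (Text, [ExtractedPacketField])
  pure (ExtractedPacket (Text.unpack name) fields)

instance FromJSON ExtractedPackets where
  parseJSON value = do
    -- Decode the outer object as a map from packet name to raw Value.
    table <- parseJSON value :: Parser (HashMap.HashMap Text Value)
    -- Sort by key so the resulting list has a predictable order.
    let entries = sortOn fst (HashMap.toList table)
    ExtractedPackets <$> traverse (uncurry parsePacket) entries
```

Calling eitherDecode on the bytes at type Either String ExtractedPackets, as in the GHCi session above, should work the same way with these instances. If the "container" tag can vary and matters to you, parsePacket is the place to inspect it.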

Lastly, I often find typeMismatch and eitherDecode tremendously helpful for debugging Aeson instances.
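
As a small illustration of that last point, here is a sketch of how the two fit together. decodeField is just a name I picked for the example. The exact wording of the error string depends on the Aeson version, so I have not reproduced any output here.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson
import Data.Aeson.Types     (typeMismatch)
import Data.ByteString.Lazy (ByteString)

data ExtractedPacketField = ExtractedPacketField
  { field1 :: String
  , field2 :: String
  } deriving (Show, Eq)

instance FromJSON ExtractedPacketField where
  parseJSON (Object o) =
    ExtractedPacketField <$> o .: "field1" <*> o .: "field2"
  -- Anything that is not an object is reported with the name we pass in,
  -- so the message says which instance rejected the value.
  parseJSON invalid =
    typeMismatch "ExtractedPacketField" invalid

-- Hypothetical helper: eitherDecode keeps the parser's error message in
-- the Left case, where decode would only give back Nothing.
decodeField :: ByteString -> Either String ExtractedPacketField
decodeField = eitherDecode
```

Feeding decodeField something like "[1,2]" gives a Left carrying the message, which is usually enough to tell which instance, and which kind of value, tripped the parser.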

Upvotes: 2
