Nate Glenn

Reputation: 6744

Multi-histogram plot with Vega-Lite

I would like to create a single visual showing multiple histograms. I have simple arrays of values, like so:

"data": {"values": {"foo": [0,0,0,1,1,1,2,2,2], "baz": [2,2,2,3,3,3,4,4,4]}}

I want to use different color bars to show the spread of values for "foo" and "baz". I am able to make a single histogram for "foo" like so:

{
  "data": {"values": {"foo": [0,0,0,1,1,1,2,2,2]}},
  "mark": "bar",
  "transform": [{"flatten": ["foo"]}],
  "encoding": {
    "x": {"field": "foo", "type": "quantitative"},
    "y": {"field": "foo", "type": "quantitative", "aggregate": "count"}
  }
}

However, I cannot find the correct way to flatten out the arrays. This doesn't work:

{
  "data": {"values": {"foo": [0,0,0,1,1,1,2,2,2], "baz": [0,0,0,1,1,1,2,2,2]}},
  "mark": "bar",
  "transform": [{"flatten": ["foo", "baz"]}],
  "encoding": {
    "x": {"field": "foo", "type": "quantitative"},
    "y": {"field": "foo", "type": "quantitative", "aggregate": "count"}
  },
  "layer": [{
    "mark": "bar",
    "encoding": {
      "y": {"field": "baz", "type": "quantitative", "aggregate": "count"}
    }
  }]
}

https://vega.github.io/editor/#/url/vega-lite/N4IgJghgLhIFygG4QDYFcCmBneoBmA9gfANoAMANJZQIwV10BMFzjAuhSAEYQBep1KvWFMWLNgF8JnALYQATgGt43BSE5R5EAHZZC8maXwpoUDNtIhCxTj36SOIcwGMCYAJbaA5rhAAPXzx3DBQwFWt1ECgATwAHDBUARzQdKHcYNMQE6RBowODQ8KJImPiklO00jPcsyIgvL3kML2gEuBBXNEqQKU4TaIx5IxA5JRUeIc4XN08fBFz8kLD2uxK4tpBk1PToGoTOesbm1pVO7qkJSSA

Inspecting data_0, there are columns for foo and its counts, but nothing for baz.

This doesn't work, either:

{
  "data": {
    "values": {
      "foo": [0, 0, 0, 1, 1, 1, 2, 2, 2],
      "baz": [0, 0, 0, 1, 1, 1, 2, 2, 2]
    }
  },
  "mark": "bar",
  "transform": [{"flatten": ["foo"]},{"flatten": ["baz"]}],
  "encoding": {
    "x": {"field": "foo", "type": "quantitative"},
    "y": {"field": "foo", "type": "quantitative", "aggregate": "count"}
  },
  "layer": [
    {
      "mark": "bar",
      "encoding": {
        "y": {"field": "baz", "type": "quantitative", "aggregate": "count"}
      }
    }
  ]
}

https://vega.github.io/editor/#/url/vega-lite/N4IgJghgLhIFygG4QDYFcCmBneoBmA9gfANoAMANJZQIwV10BMFzjAuhSAEYQBep1KvWFMWLNgF8JnALYQATgGt43BSE5R5EAHZZC8maXwpoUDNtIhCxSRWOnzlnv0kcQ5gMYEwAS20BzXBAADyC8HwwUMBVrdRAoAE8ABwwVAEc0HSgfGGzEVOkQBLCIqJiiOMSU9MztbNyffLiIf395DH9oVLgQLzQ6kClOEwSMeSMQOSUVHnHOT28-QIQiksjonudK5O6QDKyc6EbUzha2jq6VPoGpCUkgA

That still only gives columns for foo and its count, but now the count is 27 for each bucket!

How can I accomplish a multi-histogram graphic starting with array data?

Upvotes: 4

Views: 581

Answers (1)

jakevdp

Reputation: 86310

You can do this using a flatten transform followed by a fold transform, and then use a color encoding to separate the two datasets. For example:

{
  "data": {
    "values": {
      "foo": [0, 0, 1, 1, 1, 1, 2, 2, 2],
      "baz": [4, 4, 5, 5, 6, 6, 6, 6, 7]
    }
  },
  "transform": [{"flatten": ["foo", "baz"]}, {"fold": ["foo", "baz"]}],
  "mark": "bar",
  "encoding": {
    "x": {"field": "value", "type": "quantitative"},
    "y": {
      "field": "value",
      "type": "quantitative",
      "aggregate": "count",
      "stack": null
    },
    "color": {"field": "key", "type": "nominal"}
  }
}


As an aside, your layer approach also works if each layer carries its own x and y encodings, so that the top-level foo encoding is not inherited by the baz layer. It is just a bit more verbose than the approach based on fold:

{
  "data": {
    "values": {
      "foo": [0, 0, 1, 1, 1, 1, 2, 2, 2],
      "baz": [4, 4, 5, 5, 6, 6, 6, 6, 7]
    }
  },
  "transform": [{"flatten": ["foo", "baz"]}],
  "layer": [
    {
      "mark": {"type": "bar", "color": "orange"},
      "encoding": {
        "x": {"field": "foo", "type": "quantitative"},
        "y": {"field": "foo", "type": "quantitative", "aggregate": "count"}
      }
    },
    {
      "mark": "bar",
      "encoding": {
        "x": {"field": "baz", "type": "quantitative"},
        "y": {"field": "baz", "type": "quantitative", "aggregate": "count"}
      }
    }
  ]
}
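
One caveat with the fold-based spec: because "stack" is null, bars from the two arrays are drawn on top of each other wherever they share an x value. The example data above avoids this, but the arrays in the question do overlap (both contain 2). A minimal sketch of one way to keep both series visible, assuming semi-transparent bars are acceptable; the opacity value of 0.6 is an arbitrary choice, not something from the original spec:

{
  "data": {
    "values": {
      "foo": [0, 0, 0, 1, 1, 1, 2, 2, 2],
      "baz": [2, 2, 2, 3, 3, 3, 4, 4, 4]
    }
  },
  "transform": [{"flatten": ["foo", "baz"]}, {"fold": ["foo", "baz"]}],
  "mark": {"type": "bar", "opacity": 0.6},
  "encoding": {
    "x": {"field": "value", "type": "quantitative"},
    "y": {
      "field": "value",
      "type": "quantitative",
      "aggregate": "count",
      "stack": null
    },
    "color": {"field": "key", "type": "nominal"}
  }
}

Here the bars at x = 2 have the same height for both keys, so without transparency one would completely hide the other. Removing "stack": null would instead stack the two counts, which is another option if overlapping bars are not wanted.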

Upvotes: 1
