K.J.J.K

Reputation: 439

Return summary statistics from multiple arrays

I have an array of objects. I would like to build a key/value structure (a map?) that summarizes two or more fields of those objects, instead of just one. The code below is the standard nesting from d3.js. However, instead of using Species as the key, I want Sepal_Length, Sepal_Width, Petal_Length, and Petal_Width to be the keys, each with an array [q1, q2, q3, iqr, min, max] as its value.

sumDat = {
  var sumstat = d3.nest() 
    .key(function(d) { return d.Species;})
    .rollup(function(d) {
      var q1 = d3.quantile(d.map(g => g.Sepal_Length).sort(d3.ascending),.25)
      var median = d3.quantile(d.map(g => g.Sepal_Length).sort(d3.ascending),.5)
      var q3 = d3.quantile(d.map(g => g.Sepal_Length).sort(d3.ascending),.75)
      var iqr = q3 - q1
      var min = q1 - 1.5 * iqr
      var max = q3 + 1.5 * iqr
      return({q1: q1, median: median, q3: q3, interQuantileRange: iqr, min: min, max: max})
    })
    .entries(dd1)
  return sumstat
}

Linking my observable notebook. Scroll down to the sumDat cell.

Upvotes: 0

Views: 330

Answers (1)

Coola

Reputation: 3162

I am assuming you want the following output:

newsumData = Array(4) [
  0: Object {
      key: "Sepal_Length"
      values: 
      Array(6) [5.1, 5.8, 6.4, 1.3000000000000007, 3.1499999999999986, 8.350000000000001]
     }
  1: Object {key: "Sepal_Width", values: Array(6)}
  2: Object {key: "Petal_Length", values: Array(6)}
  3: Object {key: "Petal_Width", values: Array(6)}
]

To do this,

  1. Create a separate function that calculates the statistics for a given key and returns them in your desired format:
function calculateValues(d, key){
  // Extract and sort the column once, then reuse it for all three quantiles
  var sorted = d.map(g => g[key]).sort(d3.ascending)
  var q1 = d3.quantile(sorted, .25)
  var median = d3.quantile(sorted, .5)
  var q3 = d3.quantile(sorted, .75)
  var iqr = q3 - q1
  // Whisker bounds at 1.5 * IQR beyond the quartiles, not the data extremes
  var min = q1 - 1.5 * iqr
  var max = q3 + 1.5 * iqr
  return {key: key, values: [q1,median,q3,iqr,min,max]};
}
  2. Create an array of the keys you would like to summarize. This can be either entered manually or deduced from the dataset. In this case, let's type it manually:
var dataKeys = ["Sepal_Length", "Sepal_Width", "Petal_Length", "Petal_Width"]
  3. Finally, calculate your final data structure by mapping over the dataKeys array:
newsumData = dataKeys.map(m => calculateValues(dd1, m));
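
If you would rather deduce the keys from the dataset and end up with a plain key/value object (closer to the "map" the question asks for), something along these lines might work. This is only a sketch under a few assumptions: quantileSorted is a hand-written stand-in for d3.quantile (linear interpolation on a sorted array) so the snippet runs without d3, numericKeys, summarize and summarizeAll are hypothetical helper names, and sampleRows is made-up data standing in for dd1.

```javascript
// Stand-in for d3.quantile on an already-sorted numeric array
// (linear interpolation between the two nearest ranks).
function quantileSorted(sorted, p) {
  const n = sorted.length;
  if (n === 0) return undefined;
  if (p <= 0 || n < 2) return sorted[0];
  if (p >= 1) return sorted[n - 1];
  const i = (n - 1) * p;
  const i0 = Math.floor(i);
  return sorted[i0] + (sorted[i0 + 1] - sorted[i0]) * (i - i0);
}

// Same statistics as calculateValues: [q1, median, q3, iqr, min, max].
function summarize(rows, key) {
  const sorted = rows.map(r => +r[key]).sort((a, b) => a - b);
  const q1 = quantileSorted(sorted, 0.25);
  const median = quantileSorted(sorted, 0.5);
  const q3 = quantileSorted(sorted, 0.75);
  const iqr = q3 - q1;
  return [q1, median, q3, iqr, q1 - 1.5 * iqr, q3 + 1.5 * iqr];
}

// Deduce the keys: keep every column whose value in the first row is numeric.
// Assumption: the first row is representative of the whole dataset.
function numericKeys(rows) {
  return Object.keys(rows[0]).filter(
    k => rows[0][k] !== "" && !isNaN(+rows[0][k])
  );
}

// Build a plain object: { columnName: [q1, median, q3, iqr, min, max], ... }
function summarizeAll(rows) {
  return Object.fromEntries(
    numericKeys(rows).map(k => [k, summarize(rows, k)])
  );
}

// Made-up rows standing in for dd1 (not real iris measurements).
const sampleRows = [
  { Sepal_Length: 1, Sepal_Width: 10, Species: "a" },
  { Sepal_Length: 2, Sepal_Width: 20, Species: "a" },
  { Sepal_Length: 3, Sepal_Width: 30, Species: "b" },
  { Sepal_Length: 4, Sepal_Width: 40, Species: "b" },
  { Sepal_Length: 5, Sepal_Width: 50, Species: "b" },
];

const summary = summarizeAll(sampleRows);
console.log(summary);
```

With the notebook's data you would call summarizeAll(dd1) instead. Species is skipped automatically because its value is not numeric; if you need the {key, values} array shape shown above, use Object.entries(summary).map(([key, values]) => ({key, values})).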

Observable Notebook

Upvotes: 3
