j_tum

Reputation: 67

How to prepare data for d3 stacked barchart using rollup

I am new to JavaScript and D3.js and am currently trying to build an Angular app with a stacked bar chart that shows the number of test results ('OK', 'NOK' and 'Aborted') per day: one stacked bar per day, with the dates on the x-axis and the counts on the y-axis.

My data.csv looks like this:

Date;Result
20-05-2021 17:54:02;Aborted
20-05-2021 17:55:24;OK
21-05-2021 21:48:45;NOK
22-05-2021 17:55:24;OK
22-05-2021 17:54:02;Aborted
22-05-2021 17:55:24;OK

Since I need to count the results per day, I first parse the date with timeParse and then use timeDay.floor to drop the time of day:

let jsonObj = await d3.dsv(";", "/assets/data.csv", function (d) {

  let time = timeParse("%d-%m-%Y %-H:%M:%S")(d['Date'])
  let date = timeDay.floor(new Date(time));

  return {
    Date: date,
    Result: d.Result
  };
})

If I understand correctly, this gives me an array with < date | string > for every test result.

To count the test results per day, I then use rollup:

let data_count = rollup(jsonObj, v => v.length, d => d.Date, d => d.Result)

Now I have a nested Map: the outer keys are the dates (one per day), and each value is an inner Map from test result to its count for that day.

I found several examples of how to continue that don't use the rollup method (e.g. here) and tried to adapt one to my Map:

let processed_data = data_count.map( d => {
  let y0 = 0;
  let total = 0;
  return {
    date: d.key,
    //call second mapping function on every test-result
    values: (d.valules).map( d => {
      let return_object = {
        result: d.key,
        count: d.values,
        y0: y0,
        y1: y0 + d.values
      };
    //calculate the updated y0 for each new test-result on a given date
    y0 = y0 + d.values;
    //add the total for a given test-result to the sum total for that test-result
    total = total + d.values;
    return return_object;  
    }),
    total: total
  };
});

but I am getting an error:

Property 'map' does not exist on type 'InternMap<Date, InternMap<string, number>>'.ts(2339)

I gather that the array map function can't be used on a Map. I also tried rewriting this part as separate functions that avoid map, but that doesn't work either. Maybe I have a syntax error somewhere, but I get:

TypeError: Cannot read property 'key' of undefined

I probably need to use the get() method to read the values of a Map, but I'm not sure how to implement it.

Now I am stuck as to how I should continue to prepare the data for the stacked bar chart. In this example from bl.ocks.org the CSV looks different. I was thinking about somehow manipulating my data to fit that example's shape:

Date;OK;NOK;Aborted
20-05-2021 17:54:02;1;0;1
21-05-2021 21:48:45;0;1;0
22-05-2021 17:55:24;2;0;1

But I have no idea how to go about it. Any help as to how to prepare my data would be welcome. Maybe I should not use the rollup method, but I feel like it's a perfect fit for my needs.

Upvotes: 2

Views: 773

Answers (1)

Robin Mackenzie

Reputation: 19319

If we run with the idea that you ultimately want to leverage some existing code examples and therefore need data shaped like this:

Date;OK;NOK;Aborted
20-05-2021 17:54:02;1;0;1
21-05-2021 21:48:45;0;1;0
22-05-2021 17:55:24;2;0;1

There are several things to consider:

  1. You are converting your data from a sparse to a dense representation, in that you need to create a zero data point for e.g. NOK on 20-05-2021 because that data point did not exist in the original data.

  2. You need the distinct Result values as column headers in the transformed data. You can get them with: const columns = [...new Set(data.map(d => d.Result))];

  3. You found that you can't use Array.prototype.map on a Map object, so you just need to consider the other options for iterating over a Map (two are presented below), e.g. Map.prototype.entries() or Map.prototype.forEach.
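
Points (2) and (3) can be tried out without d3. The snippet below is only a minimal sketch in plain JavaScript: the rows array and the hand-built nested Map are made-up stand-ins for the parsed CSV and for the nested Map that d3.rollup returns, and the dates are plain strings for brevity.

```javascript
// Hypothetical stand-in for the parsed CSV rows (dates already reduced to the day).
const rows = [
  { Date: "2021-05-20", Result: "Aborted" },
  { Date: "2021-05-20", Result: "OK" },
  { Date: "2021-05-21", Result: "NOK" },
];

// Point (2): distinct Result values, in order of first appearance.
const columns = [...new Set(rows.map(d => d.Result))];

// Hypothetical stand-in for the nested Map that d3.rollup would return.
const nested = new Map([
  ["2021-05-20", new Map([["Aborted", 1], ["OK", 1]])],
  ["2021-05-21", new Map([["NOK", 1]])],
]);

// Point (3), option A: a Map is iterable, so Array.from yields [key, value] pairs,
// and the resulting array does have a map method.
const viaEntries = Array.from(nested).map(([date, inner]) => `${date}: ${inner.size}`);

// Point (3), option B: Map.prototype.forEach passes (value, key), in that order.
const viaForEach = [];
nested.forEach((inner, date) => viaForEach.push(`${date}: ${inner.size}`));

console.log(columns, viaEntries, viaForEach);
```

Both options visit the entries in insertion order, so they produce the same list here.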

To achieve this re-shaped data with d3.rollup:

  • Wrap the d3.rollup call in Array.from(...), which turns the outer Map into an array of its [key, value] entries (the same pairs Map.prototype.entries() yields) that you can pass to reduce.

  • In the reduce callback you can then destructure each [key, value] pair of the outer Map, where value is itself a Map (nested by d3.rollup).

  • Then iterate the columns (the distinct Result values) to decide whether to take the value from the inner Map (the count of that Result for that day) or insert a 0 because that Result did not occur on that day (per point (1)).

    In the example, this line:

    resultColumns.map(col => row[col] = innerMap.has(col) ? innerMap.get(col) : 0);

    This means: for each column header, if the inner Map has that header as a key, take the value stored under it; otherwise use zero.
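
To see that lookup in isolation, here is a small d3-free sketch. The innerMap and resultColumns values are invented for illustration and simply mimic one day's inner Map and the distinct Result values.

```javascript
// Hypothetical inner Map for one day: only the Results that occurred are present.
const innerMap = new Map([["Aborted", 1], ["OK", 1]]);
// Hypothetical column headers (the distinct Result values).
const resultColumns = ["Aborted", "OK", "NOK"];

const row = { Date: "2021-05-20" };
for (const col of resultColumns) {
  // take the count if that Result occurred on this day, otherwise fill in 0
  row[col] = innerMap.has(col) ? innerMap.get(col) : 0;
}

console.log(row);
```

NOK is missing from the inner Map, so it is the only column that gets the filled-in 0.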

Working example:

// your data setup
const csv = `Date;Result
20-05-2021 17:54:02;Aborted
20-05-2021 17:55:24;OK
21-05-2021 21:48:45;NOK
22-05-2021 17:55:24;OK
22-05-2021 17:54:02;Aborted
22-05-2021 17:55:24;OK`;

// your data processing
const data = d3.dsvFormat(";").parse(csv, d => {
  const time = d3.timeParse("%d-%m-%Y %-H:%M:%S")(d.Date);
  const date = d3.timeDay.floor(new Date(time));
  return {
    Date: date,
    Result: d.Result
  }
});

// distinct Results for column headers per point (2)
const resultColumns = [...new Set(data.map(d => d.Result))];

// cast nested Maps to array of objects
const data_wide = Array.from( // get the Map.entries()
  d3.rollup(
    data,
    v => v.length,
    d => d.Date,
    d => d.Result
  )
).reduce((accumulator, [dateKey, innerMap]) => {
  // create a 'row' with a Date property
  let row = {Date: dateKey}
  // add further properties to the 'row' based on existence of keys in the inner Map per point (1)
  resultColumns.map(col => row[col] = innerMap.has(col) ? innerMap.get(col) : 0);
  // store and return the accumulated result
  accumulator.push(row);
  return accumulator;
}, []);

console.log(data_wide);
<script src="https://cdnjs.cloudflare.com/ajax/libs/d3/6.7.0/d3.min.js"></script>

If you prefer a more procedural (and maybe more readable) route to the same outcome, you can skip Array.from(...) and iterate the outer Map with its own forEach method. Note that Map.prototype.forEach differs from Array.prototype.forEach: its callback receives (value, key) rather than (element, index). Inside the callback you perform the same check to decide whether zero data points need to be created:

// your data setup
const csv = `Date;Result
20-05-2021 17:54:02;Aborted
20-05-2021 17:55:24;OK
21-05-2021 21:48:45;NOK
22-05-2021 17:55:24;OK
22-05-2021 17:54:02;Aborted
22-05-2021 17:55:24;OK`;

// your data processing
const data = d3.dsvFormat(";").parse(csv, d => {
  const time = d3.timeParse("%d-%m-%Y %-H:%M:%S")(d.Date);
  const date = d3.timeDay.floor(new Date(time));
  return {
    Date: date,
    Result: d.Result
  }
});

// distinct Results for column headers per point (2)
const resultColumns = [...new Set(data.map(d => d.Result))];

// rollup the data
const rolled = d3.rollup(
  data,
  v => v.length,
  d => d.Date,
  d => d.Result
);

// create an output array
const data_wide = [];

// populate the output array
rolled.forEach((innerMap, dateKey) => {
  // create a 'row' with the date property
  const row = {Date: dateKey}
  // add further properties to the 'row' based on existence of keys in the inner Map per point (1)
  resultColumns.map(col => row[col] = innerMap.has(col) ? innerMap.get(col) : 0);
  // store the result
  data_wide.push(row);
});

console.log(data_wide);
<script src="https://cdnjs.cloudflare.com/ajax/libs/d3/6.7.0/d3.min.js"></script>
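
The original attempt in the question computed y0/y1 offsets and a total per day. As a rough sketch only (plain JavaScript, not part of the d3 API), the same values could be derived from rows shaped like data_wide. The wide array and keys list below are hypothetical copies of that shape, with dates written as strings for brevity. In d3 itself, d3.stack is the usual tool for this step.

```javascript
// Hypothetical wide rows, shaped like data_wide (dates shown as strings for brevity).
const wide = [
  { Date: "2021-05-20", Aborted: 1, OK: 1, NOK: 0 },
  { Date: "2021-05-21", Aborted: 0, OK: 0, NOK: 1 },
  { Date: "2021-05-22", Aborted: 1, OK: 2, NOK: 0 },
];
// Hypothetical stacking order (the distinct Result values).
const keys = ["Aborted", "OK", "NOK"];

const stacked = wide.map(row => {
  let y0 = 0; // running baseline for this day's bar
  const values = keys.map(key => {
    const segment = { result: key, count: row[key], y0, y1: y0 + row[key] };
    y0 = segment.y1; // the next segment starts where this one ends
    return segment;
  });
  // after the last segment, the baseline equals the day's total
  return { date: row.Date, values, total: y0 };
});

console.log(JSON.stringify(stacked[2]));
```

Each segment's y0 and y1 give the bottom and top of one rectangle in the bar, and total can be used to size the y-axis domain.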

Upvotes: 2
