Anon

Reputation: 85

Summing based on multiple rows

I have a massive CSV that looks like this:

name      year     count
Sam       2012       3   
Mike      2012       4
Jeff      2013       5
.
.
.
Sam       2012       8  
Sam       2013       8       
Jeff      2013       9

How do I use d3 to only sum the count if both name and year are the same? So, the output should be

name year sum
Jeff 2013  14
Sam  2012  11
Sam  2013  8
Mike 2012  4

I've tried this so far

var test = d3.nest()
    .key(function(d) { return d.name })
    .key(function(d) { return d.year })
    .rollup(function(v) { total: d3.sum(v, function(d) { return d.count }) })
    .object(data);

but this outputs the total as undefined.

Upvotes: 2

Views: 170

Answers (1)

Gerardo Furtado

Reputation: 102174

First of all, the output described in your question is not the actual output you're looking for. It's the structure you wish your CSV had, so that you could then get the output you want: d3.csv, like d3.csvParse (which it uses internally), gives you an array of objects. The same applies to d3.tsv (the data in your question looks like a TSV; fortunately it doesn't matter, the solution is the same for a CSV or a TSV).

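That being said, do not use d3.nest, which will be deprecated soon anyway. (The total comes out as undefined because the braces in your rollup callback are parsed as a function body containing a label named total, not as an object literal, so the callback returns nothing.) You also cannot use a row function for this: a row function is called once for each row in the CSV, independently of the others, so it cannot combine values across rows. So, the simplest alternative is a pure JavaScript solution that creates your new data structure.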

For instance, using reduce:

const csv = `name,year,count
Sam,2012,3
Mike,2012,4
Jeff,2013,5
Sam,2012,8
Sam,2013,8
Jeff,2013,9`;

const data = d3.csvParse(csv, d3.autoType);

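const newData = data.reduce(function(acc, curr) {
  // Look for an entry with the same name and year already in the accumulator
  const foundObject = acc.find(function(d) {
    return d.name === curr.name && d.year === curr.year;
  });
  if (foundObject) {
    foundObject.count += curr.count;
  } else {
    // Push a copy so the rows in `data` are not changed by the running sums
    acc.push(Object.assign({}, curr));
  }
  return acc;
}, []);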

console.log(newData);
<script src="https://d3js.org/d3.v5.min.js"></script>
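
Since you mention the CSV is massive, note that the find inside the reduce rescans the accumulator for every row. A possible variation, still in plain JavaScript, is to key a Map on the name and year pair so each row is handled in a single lookup. This is only a sketch: sumByNameAndYear is a made-up helper name (not part of D3), the rows are hard-coded in the shape d3.csvParse with d3.autoType would produce, and the result uses a sum property to match the table in the question.

```javascript
// Sample rows, assumed to have the shape d3.csvParse(csv, d3.autoType) returns.
const rows = [
  { name: "Sam", year: 2012, count: 3 },
  { name: "Mike", year: 2012, count: 4 },
  { name: "Jeff", year: 2013, count: 5 },
  { name: "Sam", year: 2012, count: 8 },
  { name: "Sam", year: 2013, count: 8 },
  { name: "Jeff", year: 2013, count: 9 }
];

// Hypothetical helper (not a D3 function): sums count per (name, year) pair.
function sumByNameAndYear(rows) {
  const totals = new Map();
  for (const row of rows) {
    // Serialising the pair gives an unambiguous composite key.
    const key = JSON.stringify([row.name, row.year]);
    const entry = totals.get(key);
    if (entry) {
      entry.sum += row.count;
    } else {
      // New objects are created, so the input rows are left untouched.
      totals.set(key, { name: row.name, year: row.year, sum: row.count });
    }
  }
  // A Map iterates in insertion order, so groups appear in first-seen order.
  return Array.from(totals.values());
}

const summed = sumByNameAndYear(rows);
console.log(summed);
```

The groups come out in the order they first appear in the file; sort the resulting array afterwards if you need a different order.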

Upvotes: 1
