Reputation: 1709
Let's say I have a dataset in the following format:
var smallTestData = [
{"YEAR": "2009", "MONTH": "1", "CUSTOMER": "Customer1", "REVENUE": "1938.49488391425"},
{"YEAR": "2009", "MONTH": "1", "CUSTOMER": "Customer2", "REVENUE": "75.9142774343491"},
{"YEAR": "2009", "MONTH": "1", "CUSTOMER": "Customer2", "REVENUE": "99.3456067931875"} ...];
Now, if I wanted to draw a D3 or C3 chart in which to display the cummulated REVENUE per YEAR and MONTH for each CUSTOMER, I reckon I would have to end up with something like this:
[{"yearMonth":"2009 1","revenueCustomer1":158989,"revenueCustomer2":68181},
{"yearMonth":"2009 2","revenueCustomer1":171217,"revenueCustomer2":204975},
{"yearMonth":"2009 3","revenueCustomer1":38477,"revenueCustomer2":46605} ...];
Sure, this doesn't look very elegant, but that's nothing. The worst part is the aggregation of the measures (ex. REVENUE) based on multiple dimensions (ex. YEAR, MONTH, CUSTOMER), which is a pain to do for JSON data.
I have tried writing my own aggregation functions to cover this problem, but couldn't find any satisfactory solution except manually adding the values together. Can anyone guide me in the right direction? How would you go about aggregating your data to fit the kind of chart I described? Is there any readily available solution?
And since we're on the topic, how would you go about joining two or more datasets based on primary keys?
Thank you!
Upvotes: 1
Views: 4728
Reputation: 25157
var smallTestData = [
{"YEAR": "2009", "MONTH": "1", "CUSTOMER": "Customer1", "REVENUE": "1938.49488391425"},
{"YEAR": "2009", "MONTH": "1", "CUSTOMER": "Customer2", "REVENUE": "75.9142774343491"},
{"YEAR": "2009", "MONTH": "1", "CUSTOMER": "Customer2", "REVENUE": "99.3456067931875"},
{"YEAR": "2009", "MONTH": "2", "CUSTOMER": "Customer1", "REVENUE": "1938.49488391425"},
{"YEAR": "2009", "MONTH": "2", "CUSTOMER": "Customer2", "REVENUE": "75.9142774343491"},
{"YEAR": "2009", "MONTH": "2", "CUSTOMER": "Customer2", "REVENUE": "99.3456067931875"},
{"YEAR": "2008", "MONTH": "1", "CUSTOMER": "Customer1", "REVENUE": "1938.49488391425"},
{"YEAR": "2008", "MONTH": "1", "CUSTOMER": "Customer1", "REVENUE": "75.9142774343491"},
{"YEAR": "2008", "MONTH": "1", "CUSTOMER": "Customer2", "REVENUE": "99.3456067931875"},
{"YEAR": "2008", "MONTH": "2", "CUSTOMER": "Customer1", "REVENUE": "1938.49488391425"},
{"YEAR": "2008", "MONTH": "2", "CUSTOMER": "Customer1", "REVENUE": "75.9142774343491"},
{"YEAR": "2008", "MONTH": "2", "CUSTOMER": "Customer2", "REVENUE": "99.3456067931875"},
{"YEAR": "2007", "MONTH": "1", "CUSTOMER": "Customer1", "REVENUE": "1938.49488391425"},
{"YEAR": "2007", "MONTH": "1", "CUSTOMER": "Customer2", "REVENUE": "75.9142774343491"},
{"YEAR": "2007", "MONTH": "1", "CUSTOMER": "Customer2", "REVENUE": "99.3456067931875"}
];
var nested = d3.nest()
.key(function(d) { return d.CUSTOMER; }) // nest first by customer
.key(function(d) { return d.YEAR; }) // then-by year
.key(function(d) { return d.MONTH; }) // then-by month
.rollup(function(values) {
return d3.sum(values, function(d) { return d.REVENUE; });
})
.map(smallTestData)
console.log(nested);
<script src="https://cdnjs.cloudflare.com/ajax/libs/d3/3.4.11/d3.min.js"></script>
Run this snippet, open the console window and see the logged result. If instead of .map(nested)
you use .entries(nested)
you get the same thing but represented as a series of nested arrays, which may make it more convenient to bind to d3 selections (using their .data()
method).
Varying the order of the .key()
functions controls the order of nesting.
Upvotes: 3
Reputation: 1976
This seems like the perfect time to bust out d3.nest()
. Have a look at https://github.com/mbostock/d3/wiki/Arrays#-nest to learn more.
For your data and what you're looking to do, you can do the following:
var nestedData = d3.nest().key(function(d) { return d.YEAR + " " + d.MONTH; })
.key(function(d) { return d.CUSTOMER; })
.rollup(function(leaves) {
return d3.sum(leaves, function(d) {
return +d.REVENUE;
});
})
.entries(smallTestData);
This will return an object similar to:
[
{
key: "2009 1",
values: [
{
key: "Customer1"
values: 1938.49488391425
},
{
key: "Customer2"
values: 175.2598842275366
}
},
{ ... }
]
The way d3.nest
works is the key
functions define the object properties that you wish to aggregate by, and the rollup
function is used to summarise all the data that matches the key
functions. Each key
function will create a new depth of nesting. If you don't have the rollup
function, then the values
for each key will just be an array of all values in your data that match the values defined in the key
functions.
Upvotes: 6