Reputation: 55678
I'm using Mike Bostock's crossfilter library to filter and sort large datasets. My problem: Given a dataset with multiple dimensions, how can I sort on more than one dimension at a time?
Test dataset:
[
{ cat: "A", val:1 },
{ cat: "B", val:2 },
{ cat: "A", val:11 },
{ cat: "B", val:5 },
{ cat: "A", val:3 },
{ cat: "B", val:2 },
{ cat: "A", val:11 },
{ cat: "B", val:100 }
]
Example of desired output, sorting by cat, then val (ascending):
[
{ cat: "A", val:1 },
{ cat: "A", val:3 },
{ cat: "A", val:11 },
{ cat: "A", val:11 },
{ cat: "B", val:2 },
{ cat: "B", val:2 },
{ cat: "B", val:5 },
{ cat: "B", val:100 }
]
The approach I've used thus far is to use string concatenation on the desired dimensions:
var combos = cf.dimension(function(d) { return d.cat + '|' + d.val; });
This works fine with multiple string-based dimensions, but it won't work with numeric dimensions, because string comparison isn't a natural sort ('4' > '11'). I think I could make this work by zero-padding the numbers, but that could get expensive for a large dataset, so I'd prefer to avoid it. Is there another way that might work here, using crossfilter?
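To make the problem concrete, here is a small sketch in plain JavaScript (no crossfilter involved; the sample keys are made up for illustration) showing how concatenated keys sort compared with the underlying numbers:

```javascript
// Illustration only: concatenated string keys compare character by
// character, so "A|11" sorts before "A|3" and "A|4".
var keys = ["A|4", "A|11", "A|3"];
var asStrings = keys.slice().sort();

// The same values compared numerically give the natural order.
var nums = [4, 11, 3];
var asNumbers = nums.slice().sort(function(a, b) { return a - b; });

console.log(asStrings); // ["A|11", "A|3", "A|4"]
console.log(asNumbers); // [3, 4, 11]
```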
Bonus points for any solution that allows different dimensions to have different sort directions (ascending/descending).
Clarification: Yes, I may need to switch to a native Array.sort implementation. But the whole point of using crossfilter is that it's very, very fast, especially for large datasets, and it caches sort order in a way that makes repeated sorts even faster. So I'm really looking for a crossfilter-based answer here.
Upvotes: 4
Views: 4052
Reputation: 4668
I haven't tested for performance, but you could give d3.nest a go. Example code:
var nested = d3.nest()
.key(function(d) { return d.cat; })
.sortKeys(d3.ascending)
.sortValues(compareValues)
.entries(data);
See the whole fiddle here: http://jsfiddle.net/RFontana/bZX7Q/
And let me know what result you get if you run some jsperf :)
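The snippet above passes a compareValues function that isn't shown here (it presumably lives in the fiddle). A minimal sketch of what such a comparator might look like, assuming the leaf records should be ordered by ascending numeric val; the sample array is only for illustration:

```javascript
// Hypothetical comparator for .sortValues(): orders two records by
// ascending numeric `val`, returning 0 when they tie.
function compareValues(a, b) {
  return a.val < b.val ? -1 : a.val > b.val ? 1 : 0;
}

// Quick check with a plain array sort (d3 is not needed for this part).
var sample = [
  { cat: "A", val: 11 },
  { cat: "A", val: 1 },
  { cat: "A", val: 3 }
];
var sortedSample = sample.slice().sort(compareValues);
```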
Upvotes: 0
Reputation: 55678
Here's what I ended up doing:
I convert the measure to a positive, comparable decimal before turning it into a string, using crossfilter to get the min/max:
var vals = cf.dimension(function(d) { return d.val }),
    min = vals.bottom(1)[0].val,
    offset = min < 0 ? Math.abs(min) : 0,
    max = vals.top(1)[0].val + offset,
    valAccessor = function(d) {
        // offset ensures positive numbers, fraction ensures sort order
        return ((d.val + offset) / max).toFixed(8);
    },
    combos = cf.dimension(function(d) {
        return d.cat + '|' + valAccessor(d);
    });
See working fiddle: http://jsfiddle.net/nrabinowitz/cQXNK/9/
This has the advantage of handling negative numbers properly, which isn't possible with zero-padding, as far as I can tell. It seems to be just as fast. The downside is that it requires creating a new dimension on the numeric column, but I usually need that dimension anyway.
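As a rough check of the idea, here is the same key construction applied to a small made-up dataset with a negative value. This sketch computes min and max with plain array operations as a stand-in for vals.bottom(1) and vals.top(1), and sorts with Array.sort rather than crossfilter, purely for illustration. It assumes max ends up greater than zero, and values closer together than about 1e-8 of the range would collapse to the same key:

```javascript
// Made-up sample data, including a negative value.
var data = [
  { cat: "A", val: -5 },
  { cat: "B", val: 2 },
  { cat: "A", val: 11 },
  { cat: "A", val: 3 },
  { cat: "B", val: 100 }
];

// Stand-ins for vals.bottom(1)[0].val and vals.top(1)[0].val.
var values = data.map(function(d) { return d.val; });
var min = Math.min.apply(null, values);
var offset = min < 0 ? Math.abs(min) : 0;
var max = Math.max.apply(null, values) + offset;

// Same key as in the answer: category, then a fixed-width fraction in [0, 1].
function comboKey(d) {
  return d.cat + '|' + ((d.val + offset) / max).toFixed(8);
}

// Sort by the string key only, as a combined dimension would.
var ordered = data.slice().sort(function(a, b) {
  var ka = comboKey(a), kb = comboKey(b);
  return ka < kb ? -1 : ka > kb ? 1 : 0;
});
```

With this data the order comes out as A/-5, A/3, A/11, B/2, B/100.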
Upvotes: 1
Reputation: 2870
Using Array.prototype.sort, you can do this:
function sortByPriority(a, b) {
    // property names to sort by, in priority order
    var p = sortByPriority.properties;
    // left-pad with zeros so numbers compare correctly as strings
    function pad (str, max) {
        str = String(str);
        return str.length < max ? pad("0" + str, max) : str;
    }
    // no properties set: fall back to a plain numeric comparison
    if (!p) {
        return a - b;
    }
    var ar = '', br = '';
    for (var i = 0, max = p.length; i < max; i++) {
        ar += pad(a[p[i]], 10);
        br += pad(b[p[i]], 10);
    }
    return ar == br ? 0 : ar > br ? 1 : -1;
}
How to use, sorting by cat and then val:
sortByPriority.properties = ['cat', 'val'];
myArray.sort(sortByPriority);
Result:
If you want val to take priority, do:
sortByPriority.properties = ['val', 'cat'];
myArray.sort(sortByPriority);
Result:
This isn't especially efficient code, but you can improve on it.
UPDATE:
You can use the pad function to get the same results with crossfilter; see the jsfiddle.
var combos = cf.dimension(function(d) {
return pad(d.cat, 10) + '|' + pad(d.val, 10);
});
You can also set the pad size to the length of the longest value in your collection, which ensures the padded keys always sort correctly.
See that optimization: http://jsfiddle.net/gartz/cQXNK/7/
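A minimal sketch of that idea, not taken from the fiddle: derive the pad width from the longest val in the data, then build the combined keys. It uses a made-up sample, assumes non-negative integer values, and sorts the keys with a plain array sort just to show the resulting order:

```javascript
// Made-up sample data; assumes non-negative integers in `val`.
var data = [
  { cat: "A", val: 1 },
  { cat: "B", val: 2 },
  { cat: "A", val: 11 },
  { cat: "B", val: 100 }
];

// Pad width = number of characters in the longest val.
var width = data.reduce(function(w, d) {
  return Math.max(w, String(d.val).length);
}, 0);

// Iterative version of the pad helper: left-pad with zeros up to `max` chars.
function pad(str, max) {
  str = String(str);
  while (str.length < max) str = "0" + str;
  return str;
}

// Keys as a combined dimension would produce them, sorted as strings.
var paddedKeys = data.map(function(d) {
  return d.cat + '|' + pad(d.val, width);
}).sort();
```

Here width is 3, so the keys sort as A|001, A|011, B|002, B|100.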
Upvotes: 1
Reputation: 4808
I know it's not using the crossfilter library, but why not use the sort function to do this?
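// cf is assumed here to be the plain data array (Array.prototype.sort),
// not a crossfilter instance
var combos = cf.sort(function(a,b) {
    // compare by cat first, then by val; return 0 for true ties
    if(a.cat == b.cat) return a.val < b.val ? -1 : a.val > b.val ? 1 : 0;
    return a.cat < b.cat ? -1 : 1;
});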
See http://jsfiddle.net/cQXNK/5/
To give each dimension its own sort direction, swap -1 and 1 in that dimension's comparison.
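One way to sketch per-dimension directions without hand-editing the signs is a small comparator builder. The compareBy helper and its { key, dir } spec format are hypothetical names made up for this example, not part of crossfilter or any library:

```javascript
// Hypothetical helper: builds a comparator from a list of { key, dir }
// specs, where dir is 1 for ascending and -1 for descending.
function compareBy(specs) {
  return function(a, b) {
    for (var i = 0; i < specs.length; i++) {
      var k = specs[i].key, dir = specs[i].dir;
      if (a[k] < b[k]) return -dir;
      if (a[k] > b[k]) return dir;
    }
    return 0; // equal on every key
  };
}

// The test dataset from the question.
var data = [
  { cat: "A", val: 1 },
  { cat: "B", val: 2 },
  { cat: "A", val: 11 },
  { cat: "B", val: 5 },
  { cat: "A", val: 3 },
  { cat: "B", val: 2 },
  { cat: "A", val: 11 },
  { cat: "B", val: 100 }
];

// cat ascending, val descending.
var sorted = data.slice().sort(compareBy([
  { key: 'cat', dir: 1 },
  { key: 'val', dir: -1 }
]));
```

For the dataset above this gives the A rows as 11, 11, 3, 1, followed by the B rows as 100, 5, 2, 2.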
Upvotes: 2