I'm having trouble grouping samples based on a computation on the value of one field. The samples have the following form:
{
  "dataset": "DATASET2",
  "sampleid": "ID2653",
  "variables": {
    // several variables, key: val
  },
  "bmus": {
    "x": 3,
    "y": 7
  }
}
Each sample belongs to a cell in a static 9x7 grid, and the bmus field gives the sample's location in the grid. Circle-shaped filters can be moved around on top of the grid, and after moving them I want to group together the samples that are located inside the circles. The grouping would have the form
{
  "key": {
    "circles": ["circle1", "circle2"] // or [], ['circle1'], ['circle2']
  },
  "value": 282
}
and after that a custom reduce function would be applied to compute values based on the values of some variables.
My current setup for creating the dimension and grouping is the following:
$scope.dimension = crossfilterInst.dimension( function(d) {
  return {
    bmu: d.bmus,
    valueOf: function() {
      var ret = _.isUndefined(d.bmus) ? String(constants.nanValue) + "|" + String(constants.nanValue) : d.bmus.x + "|" + d.bmus.y;
      // for NaN's the result is "-100|-100", otherwise in the form of "5|4"
      return ret;
    }
  };
});
var group = $scope.dimension.group( function(d) {
  return {
    circles: function() {
      return FilterService.inWhatCircles(d.bmu);
    },
    valueOf: function() {
      return String(this.circles());
    }
  };
});
At first, the groupings returned by group.all() are correct. Later, the circle filters are moved on the grid. This changes the number of selected samples, because a filterFunction is applied to another dimension of crossfilterInst, and it also changes the return value of FilterService.inWhatCircles(). From then on, the groups returned from group.all() have increasingly incorrect sample counts.
Initially I thought that the return value of FilterService.inWhatCircles() was incorrect in some cases. After a long debugging session I noticed that if I recreate the dimension (dispose of the old one and rerun the code that creates $scope.dimension) and then use the same code to group the samples, the resulting groups are correct.
Looking at the Crossfilter API, I suspect crossfilter does some caching that screws up my grouping, or that this requirement is not satisfied:
Like the value function, groupValue must return a naturally-ordered value; furthermore, this order must be consistent with the dimension's value function!
Long story short: how can I compute FilterService.inWhatCircles() based on the bmus field of the samples and group them together repeatedly when the function return value can change, without having to recreate the dimension every time to get the correct groups?
Reputation: 20150
You guessed it: crossfilter indexes your data in a way that does not allow dimension or group key functions to compute anything dynamically. The keys must already be present in the data and cannot change; they are read only once.
The fact that your group keys are not consistent with your dimension keys is the least of your problems.
Instead, you may want to use a fake group, an object with a .all() method that loops over all data points and counts the number that match each inWhatCircles value, e.g. by using a map. Then it should return the same kind of array of {key,value} pairs that group.all returns.
This will be more efficient than creating a new dimension each time, with all of its indices, and you won't lose much of the benefit of crossfilter, because it is not optimized for this sort of dynamic calculation anyway.
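A minimal sketch of such a fake group, for illustration only. The names makeCircleGroup, makeInWhatCircles and getRecords are hypothetical, and makeInWhatCircles is only a stand-in for FilterService.inWhatCircles(), whose real implementation is not shown in the question; the circle shape ({name, x, y, r}) is likewise an assumption.

```javascript
// Hypothetical stand-in for FilterService.inWhatCircles(): returns the names
// of all circles whose area contains the given grid cell. Circles are assumed
// to be plain objects of the form {name, x, y, r}.
function makeInWhatCircles(circles) {
  return function (bmu) {
    if (!bmu) { return []; } // samples without a bmus field are in no circle
    return circles
      .filter(function (c) {
        var dx = bmu.x - c.x, dy = bmu.y - c.y;
        return dx * dx + dy * dy <= c.r * c.r;
      })
      .map(function (c) { return c.name; });
  };
}

// Fake group: nothing is indexed or cached. Every call to all() walks the
// current records and recomputes the circle membership from scratch.
function makeCircleGroup(getRecords, inWhatCircles) {
  return {
    all: function () {
      var counts = {}; // map from joined circle names to a {key, value} pair
      getRecords().forEach(function (d) {
        var circles = inWhatCircles(d.bmus);
        var id = circles.join(",");
        if (!counts[id]) {
          counts[id] = { key: { circles: circles }, value: 0 };
        }
        counts[id].value += 1;
      });
      // Return the pairs in a stable order, sorted by the joined key string.
      return Object.keys(counts).sort().map(function (id) {
        return counts[id];
      });
    }
  };
}
```

With crossfilter, getRecords could be something like function() { return someDimension.top(Infinity); }, so that only the records passing the current filters are counted. A custom reduce over the variables would replace the simple value += 1 step.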