Reputation: 2089
I have hourly data spanning more than 20 years, and I would like some hints on how to display such a large amount of data in the browser.
I would like to display the data as time series, since all of the data sets share the same format (a value at a certain time) while representing different kinds of information. I looked at d3.js and managed to plot all my data (20 years or more), then used brushing to zoom in, based on this very good example.
But the browser cannot handle that much data and becomes extremely slow.
On the server side I use servlets to send the data in JSON format.
Thanks for any advice, hints and examples of best practices for visualizing large datasets.
Upvotes: 2
Views: 2144
Reputation: 2364
An issue with libraries such as d3.js is that they rely on SVG, creating and maintaining a DOM object for every data point. This can obviously lead to a DOM explosion, depending on your dataset size. You could sample the data before sending it to the browser and rendering it, but you could lose granularity and accuracy; maybe you need those non-outlier points to identify trends. It really depends on the size of your dataset.
Assuming you have a dataset of roughly 175,200 points (one for every hour in 20 years), I would suggest a library called ZingChart (http://www.zingchart.com). It has many different styling options, but more importantly it offers different rendering modes (SVG or canvas) that can handle the amount of data you are trying to visualize. In particular, take note of the zoom feature, which can show every single point, along with the ability to add custom tags to each node.
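To illustrate the "sample the data before sending it" idea, here is a minimal sketch in Java (to match the asker's servlet backend). The DataPoint class and the fixed decimation step are assumptions, and naive decimation like this is exactly where the loss of outliers mentioned above can bite:

import java.util.ArrayList;
import java.util.List;

public class Downsampler {

    /** Simple time/value pair; a placeholder for whatever record type you actually use. */
    public static class DataPoint {
        public final long timestamp;   // epoch millis
        public final double value;

        public DataPoint(long timestamp, double value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /**
     * Naive decimation: keep every n-th point so that at most maxPoints
     * are sent to the browser. Granularity and accuracy are lost.
     */
    public static List<DataPoint> decimate(List<DataPoint> all, int maxPoints) {
        if (all.size() <= maxPoints) {
            return all;
        }
        int step = (int) Math.ceil(all.size() / (double) maxPoints);
        List<DataPoint> sampled = new ArrayList<>();
        for (int i = 0; i < all.size(); i += step) {
            sampled.add(all.get(i));
        }
        return sampled;
    }
}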
Upvotes: 3
Reputation: 2438
Don't bring all the data to the client side.
Instead, you could implement a server-side method that looks like this:
getData(startDate, endDate, maxSteps)
This method always returns at most maxSteps records; which records it returns is entirely up to you and your data. I would suggest one of the following approaches:
The following steps are common to both methods: select the records between startDate and endDate; if there are no more than maxSteps of them, return all of them. Otherwise, using the subset of records determined by startDate and endDate, continue with the following steps.
Method 1: pick exact records from your data (can be expensive to determine the right ones). Compute maxSteps evenly spaced points in time between startDate and endDate and return the record closest to each point:
point = startDate;
stepTimeSpan = (endDate - startDate) / (maxSteps - 1); // will fail if maxSteps = 1
for (i = 0; i < maxSteps; i++)
{
    records.Add(getClosestTo(point));
    point = point + stepTimeSpan;
}
return records;
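A concrete version of Method 1 might look like the Java sketch below. The timestamp-to-value TreeMap is an assumption to keep the example self-contained; in practice you would probably run an equivalent "closest record" query against your database instead of working in memory:

import java.util.NavigableMap;
import java.util.TreeMap;

public class ClosestPointSampler {

    /**
     * data: timestamp (epoch millis) -> value, sorted by time.
     * Returns at most maxSteps entries: the record closest in time to each of
     * maxSteps evenly spaced sample points between startDate and endDate.
     */
    public static NavigableMap<Long, Double> getData(NavigableMap<Long, Double> data,
                                                     long startDate, long endDate,
                                                     int maxSteps) {
        NavigableMap<Long, Double> subset =
                new TreeMap<>(data.subMap(startDate, true, endDate, true));
        if (subset.size() <= maxSteps) {
            return subset;                      // common step: few enough records, return all
        }
        NavigableMap<Long, Double> result = new TreeMap<>();
        long stepTimeSpan = (endDate - startDate) / (maxSteps - 1);  // requires maxSteps > 1
        long point = startDate;
        for (int i = 0; i < maxSteps; i++) {
            Long floor = subset.floorKey(point);      // closest key at or before the sample point
            Long ceiling = subset.ceilingKey(point);  // closest key at or after the sample point
            Long closest;
            if (floor == null) {
                closest = ceiling;
            } else if (ceiling == null) {
                closest = floor;
            } else {
                closest = (point - floor <= ceiling - point) ? floor : ceiling;
            }
            result.put(closest, subset.get(closest));
            point += stepTimeSpan;
        }
        return result;
    }
}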
Method 2: return records resulting from aggregations. Split the interval into maxSteps buckets of records (by date), then obtain one record from each bucket as the result of an aggregation:
bucketStart = startDate;
bucketTimeSpan = (endDate - startDate) / maxSteps;
for (i = 0; i < maxSteps; i++)
{
    bucket = getRecordsBetween(bucketStart, bucketStart + bucketTimeSpan);
    records.Add( new Record( AvgDate(bucket), AvgValue(bucket) ) );
    bucketStart = bucketStart + bucketTimeSpan;
}
return records;
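A concrete version of Method 2 could look like the following Java sketch. Again, the in-memory NavigableMap and the plain averaging are assumptions; in a real application this would most likely be an aggregation query (grouping by time bucket) in the database:

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;

public class BucketAggregator {

    /** One aggregated point: the average timestamp and average value of a bucket. */
    public static class Record {
        public final long avgDate;
        public final double avgValue;

        public Record(long avgDate, double avgValue) {
            this.avgDate = avgDate;
            this.avgValue = avgValue;
        }
    }

    /**
     * Splits [startDate, endDate) into maxSteps buckets and averages each one.
     * Empty buckets are skipped, so the result has at most maxSteps records.
     */
    public static List<Record> getData(NavigableMap<Long, Double> data,
                                       long startDate, long endDate, int maxSteps) {
        List<Record> records = new ArrayList<>();
        long bucketTimeSpan = (endDate - startDate) / maxSteps;
        long bucketStart = startDate;
        for (int i = 0; i < maxSteps; i++) {
            long bucketEnd = bucketStart + bucketTimeSpan;
            // all records falling into this bucket
            Map<Long, Double> bucket = data.subMap(bucketStart, true, bucketEnd, false);
            if (!bucket.isEmpty()) {
                long dateSum = 0;
                double valueSum = 0;
                for (Map.Entry<Long, Double> entry : bucket.entrySet()) {
                    dateSum += entry.getKey();
                    valueSum += entry.getValue();
                }
                records.add(new Record(dateSum / bucket.size(), valueSum / bucket.size()));
            }
            bucketStart = bucketEnd;
        }
        return records;
    }
}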
Call this method from the client side each time the user changes the interval (using the small chart at the bottom in your example).
Play with the maxSteps value until you find the right balance between performance and detail.
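Since the question mentions servlets on the server side, exposing getData to the brushing chart could look roughly like the sketch below. The parameter names, the hand-rolled JSON output, and the hypothetical DataSource.load() call are all assumptions:

import java.io.IOException;
import java.util.List;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

/**
 * Hypothetical servlet exposing getData(startDate, endDate, maxSteps).
 * The chart would request e.g. /data?start=631152000000&end=1262304000000&maxSteps=500
 * whenever the selected interval changes.
 */
public class DataServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        long start = Long.parseLong(req.getParameter("start"));
        long end = Long.parseLong(req.getParameter("end"));
        int maxSteps = Integer.parseInt(req.getParameter("maxSteps"));

        // BucketAggregator.getData is the Method 2 sketch above; DataSource.load() is assumed
        // to return the timestamp -> value map from wherever your data lives.
        List<BucketAggregator.Record> records =
                BucketAggregator.getData(DataSource.load(), start, end, maxSteps);

        // Build the JSON array by hand to keep the sketch dependency-free.
        StringBuilder json = new StringBuilder("[");
        for (int i = 0; i < records.size(); i++) {
            BucketAggregator.Record r = records.get(i);
            if (i > 0) json.append(',');
            json.append("{\"time\":").append(r.avgDate)
                .append(",\"value\":").append(r.avgValue).append('}');
        }
        json.append(']');

        resp.setContentType("application/json");
        resp.getWriter().write(json.toString());
    }
}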
Upvotes: 6