Phuoc Do

Reputation: 1344

How to handle a large dataset in d3.js

I have a dataset of 11 MB. It's slow to load every time the document loads.

d3.csv("https://s3.amazonaws.com/vidaio/QHP_Individual_Medical_Landscape.csv", function(data) {
  // drawing code...
});

I know that crossfilter can be used to slice-and-dice the data once it's loaded in the browser. But before that, the dataset is still big to download. I only use an aggregation of the data. It seems like I should pre-process the data on the server before sending it to the client, maybe using crossfilter on the server side. Any suggestions on how to handle/process a large dataset for d3?

Upvotes: 1

Views: 5266

Answers (3)

Ketan

Reputation: 5891

How about server-side (gzip) compression? It should be only a few KB after compressing, and the browser will decompress it in the background.
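
For example, a minimal sketch of serving the file yourself with on-the-fly compression (this assumes a Node.js server with the express and compression npm packages; the data/ path is hypothetical):

var express = require('express');
var compression = require('compression');

var app = express();
app.use(compression());          // gzip each response before it goes out
app.use(express.static('data')); // serve the CSV from ./data with compression

app.listen(3000);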

Upvotes: 0

JRideout

Reputation: 1635

  • Try simplifying the data (also suggested in the comment from Stephen Thomas)
  • Try pre-parsing the data into JSON. This will likely result in a larger file (more network time) but less parsing overhead (lower client CPU). If your problem is the parsing, this could save time
  • Break the data up by some kind of sharding key, such as year. Limit the initial load to that shard and then load the other data files on demand as needed
  • Break up the data by time, but show everything in the UI. Load the charts with the default view (such as the most recent timeframe), then asynchronously add the additional files as they arrive (or when they all arrive); see the sketch after this list
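
A minimal sketch of the sharding idea, keeping the callback style from the question (the per-year file names and the redraw() helper are hypothetical):

d3.csv("data/plans_2014.csv", function(current) {
  redraw(current); // render the most recent shard right away

  // fetch the older shard in the background, then update the charts
  d3.csv("data/plans_2013.csv", function(older) {
    redraw(current.concat(older));
  });
});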

Upvotes: 0

Stephen Thomas

Reputation: 14053

Is your data dynamic? If it's not, then you can certainly aggregate it and store the result on your server. The aggregation would only be required once. Even if the data is dynamic, if the changes are infrequent then you could benefit from aggregating only when the data changes and caching that result.

If you have highly dynamic data such that you'll have to aggregate it fresh with every page load, then doing it on the server vs. the client could depend on how many simultaneous users you expect. A lot of simultaneous users might bring your server to its knees. OTOH, if you have a small number of users, then your server probably (possibly?) has more horsepower than your users' browsers, in which case it will be able to perform the aggregation faster than the browser.

Also keep in mind the bandwidth cost of sending 11 MB to your users. Might not be a big deal ... unless they're loading the page a lot and doing it on mobile devices.
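
As a rough sketch, a one-off Node.js script could do the aggregation and write out a small JSON file for the page to load instead of the raw CSV (this uses the d3 v3 parser API, which also runs under Node; the "State" grouping column is only a guess at the CSV's schema):

var fs = require('fs');
var d3 = require('d3'); // d3's CSV parser also works under Node (d3.csvParse in newer versions)

var rows = d3.csv.parse(fs.readFileSync('QHP_Individual_Medical_Landscape.csv', 'utf8'));

// count plans per state (swap in whatever aggregation the charts actually need)
var counts = {};
rows.forEach(function(row) {
  counts[row.State] = (counts[row.State] || 0) + 1;
});

fs.writeFileSync('aggregate.json', JSON.stringify(counts));

The page would then fetch aggregate.json (a few KB) with d3.json instead of the 11 MB CSV.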

Upvotes: 1
