George L

Reputation: 1738

Building a histogram faster

I am working with a large dataset and need to build a histogram of it. My current method, walking through the entire list and recording each value's frequency in a second array, feels slow. Any suggestions on how to speed the process up?

Upvotes: 0

Views: 962

Answers (1)

BraveNewCurrency

Reputation: 13065

Given that a histogram is a graph containing the counts of all items in each bin, you can't make an exact one without visiting every item.

However, you can:

  1. Create the histogram as you collect the data. Then it takes no time to generate.

  2. Break up the data into N parts, and count each part in parallel. When every part is done counting, just sum the results for each bin. (You can also combine this with #1.)

  3. Sample the data. In theory, by looking at a random fraction of your data, you should be able to estimate the distribution of the rest of it. The Math.
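
A minimal Python sketch of the three options above, in case it helps. Everything named here is an assumption made for illustration: fixed-width bins of `BIN_WIDTH`, the helper names (`bin_of`, `count_chunk`, `RunningHistogram`, `histogram_in_parts`, `sampled_histogram`), and the `map_fn` parameter. Adapt the binning to whatever your data needs.

```python
import random
from collections import Counter

BIN_WIDTH = 10  # hypothetical bin width; pick one that suits your data


def bin_of(value):
    """Map a value to its bin index (assumes fixed-width bins)."""
    return int(value // BIN_WIDTH)


def count_chunk(chunk):
    """Count the items of one chunk into bins."""
    return Counter(bin_of(v) for v in chunk)


class RunningHistogram:
    """Option 1: update the counts as each value is collected."""

    def __init__(self):
        self.counts = Counter()

    def add(self, value):
        self.counts[bin_of(value)] += 1


def histogram_in_parts(data, n_parts=4, map_fn=map):
    """Option 2: count n_parts chunks separately, then sum per bin.

    map_fn defaults to the built-in map (sequential); a parallel
    map with the same calling convention can be passed instead.
    """
    size = max(1, -(-len(data) // n_parts))  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    total = Counter()
    for partial in map_fn(count_chunk, chunks):
        total.update(partial)  # Counter.update adds counts bin by bin
    return total


def sampled_histogram(data, fraction=0.1, seed=0):
    """Option 3: count a random sample and scale the counts up.

    The result is an estimate, not the exact histogram.
    """
    if not data:
        return {}
    k = min(len(data), max(1, int(len(data) * fraction)))
    sample = random.Random(seed).sample(data, k)
    scale = len(data) / k
    return {b: c * scale for b, c in count_chunk(sample).items()}
```

To actually run the parts in parallel, you could pass the `map` method of a `concurrent.futures.ProcessPoolExecutor` as `map_fn` (inside an `if __name__ == "__main__":` guard). Whether that beats a single pass depends on how large the data is compared with the cost of shipping chunks to the worker processes, so it is worth measuring.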

Upvotes: 3
