Reputation: 2543
I'm trying to display some data points using Google graphs, but unfortunately the URL I can use is limited to about 2000 characters, which translates to a limit of roughly 200 data points per graph. I have about 800 data points and growing, so I need to cut them down to 200 for the graph. Right now I just drop X = (800/200) − 1 points, keep one, and repeat, to get down to 200.
However, most data points are located at the beginning of the array, since the points' positions on the graph spread out somewhat exponentially (with an exponent of about 1.2). Also, the most important points are the most recent ones (at the end of the array). So I need a way to reduce the array that keeps most of the points near the end and removes most (but not all) of the points near the beginning.
This would run each time the graph is made, so it has to be deterministic (i.e. no randomness involved). If someone could point me in the right direction I'd very much appreciate it.
Upvotes: 2
Views: 623
Reputation: 65854
How about this? Not having PHP to hand, I've used Python, but I hope it's clear. Ask if not.
Let ℓ be the number of values you have to start with, and n be the number you want to cut it down to. Then the idea is to find the largest exponent x such that n^x is less than ℓ. Then we can pick the items with indexes that are the closest integers to
ℓ − (n − 1)^x − 1, ℓ − (n − 2)^x − 1, ..., ℓ − 1^x − 1, ℓ − 0^x − 1
which are spaced out nicely with a bias towards the end of the list.
    import math

    def select_with_bias(s, n):
        """Select n values from the list s if possible, with bias to later values."""
        l = len(s)
        if l <= n:
            return s[:]  # List is short: return copy of whole list.
        if n < 2:
            # If n is 1, last item only; if n is 0, empty list.
            # (s[-n:] would return the whole list when n is 0.)
            return s[l - n:]
        x = math.log(l - 1, n)  # Shorthand for log(l - 1) / log(n)
        result = []
        for i in range(n - 1, -1, -1):  # Loop from n-1 down to 0.
            result.append(s[l - int(i ** x) - 1])
        return result
(For Python experts: this isn't the most idiomatic way to do it in Python, but I wanted to make it as clear as I can to a programmer who doesn't know Python.)
For example:
    >>> select_with_bias(range(100), 10)
    [19, 36, 51, 64, 75, 84, 91, 96, 98, 99]
    >>> select_with_bias(range(100), 20)
    [8, 15, 22, 29, 36, 42, 48, 54, 60, 65, 70, 75, 80, 84, 88, 91, 94, 97, 98, 99]
It's easy to try out variations on this approach: the idea is to choose a curve of the right shape and scale it to fit the length of the list, so you can try out different curves. I chose a polynomial curve, but if that doesn't work out for you, you could pick a different one, for example an exponential.
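To make "try out different curves" a bit more concrete, here is a rough sketch of one way to plug in an arbitrary curve. It is not the function above: the names select_with_curve and exponential_curve are made up for illustration, and the scaling differs slightly (the curve is assumed to map 0 to 0 and 1 to 1, and its value is taken as the distance back from the end of the list, as a fraction of the list length). Because a steep curve can round two neighbouring points to the same index, the sketch also nudges the offsets so they stay distinct.

```python
def select_with_curve(s, n, curve):
    """Select n items from s, spaced according to curve (hypothetical helper).

    curve should map [0, 1] to [0, 1] with curve(0) == 0 and curve(1) == 1.
    Its value is the distance back from the end of s, as a fraction.
    """
    l = len(s)
    if l <= n:
        return list(s)           # List is short: return a copy of it all.
    if n < 2:
        return list(s[l - n:])   # Last item only if n is 1; empty if n is 0.
    offsets = []
    prev = -1
    for i in range(n):
        raw = int(curve(i / (n - 1)) * (l - 1))
        # Keep offsets strictly increasing, and leave enough room
        # before the start of the list for the points still to come.
        off = min(max(raw, prev + 1), (l - 1) - (n - 1 - i))
        offsets.append(off)
        prev = off
    # Largest offset first, so the result is in the original order.
    return [s[l - 1 - off] for off in reversed(offsets)]


def exponential_curve(base):
    """Return a curve that grows like base**t, rescaled to run from 0 to 1."""
    return lambda t: (base ** t - 1) / (base - 1)
```

Calling select_with_curve(data, 200, exponential_curve(20.0)) would then thin data to 200 points with the exponential shape; the base of 20 is an arbitrary starting value, and a larger base pushes more of the selected points towards the end of the list. Passing lambda t: t ** 2 instead gives a polynomial shape similar in spirit to the one above.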
Upvotes: 4