Reputation: 13444
I'm looking for an algorithm that places tick marks on an axis, given a range to display, a width to display it in, and a function to measure a string width for a tick mark.
For example, given that I need to display between 1e-6 and 5e-6 and a width to display in pixels, the algorithm would determine that I should put tickmarks (for example) at 1e-6, 2e-6, 3e-6, 4e-6, and 5e-6. Given a smaller width, it might decide that the optimal placement is only at the even positions, i.e. 2e-6 and 4e-6 (since putting more tickmarks would cause them to overlap).
A smart algorithm would give preference to tickmarks at multiples of 10, 5, and 2. Also, a smart algorithm would be symmetric around zero.
Upvotes: 31
Views: 20884
Reputation: 606
Funnily enough, just over a week ago I came here looking for an answer to the same question, but went away again and decided to come up with my own algorithm. I am here to share, in case it is of any use.
I wrote the code in Python to try and bust out a solution as quickly as possible, but it can easily be ported to any other language.
The function below calculates the appropriate interval (which I have allowed to be either 10**n
, 2*10**n
, 4*10**n
or 5*10**n
) for a given range of data, and then calculates the locations at which to place the ticks (based on which numbers within the range are divisble by the interval). I have not used the modulo %
operator, since it does not work properly with floating-point numbers due to floating-point arithmetic rounding errors.
Code:
import math
def get_tick_positions(data: list):
if len(data) == 0:
return []
retpoints = []
data_range = max(data) - min(data)
lower_bound = min(data) - data_range/10
upper_bound = max(data) + data_range/10
view_range = upper_bound - lower_bound
num = lower_bound
n = math.floor(math.log10(view_range) - 1)
interval = 10**n
num_ticks = 1
while num <= upper_bound:
num += interval
num_ticks += 1
if num_ticks > 10:
if interval == 10 ** n:
interval = 2 * 10 ** n
elif interval == 2 * 10 ** n:
interval = 4 * 10 ** n
elif interval == 4 * 10 ** n:
interval = 5 * 10 ** n
else:
n += 1
interval = 10 ** n
num = lower_bound
num_ticks = 1
if view_range >= 10:
copy_interval = interval
else:
if interval == 10 ** n:
copy_interval = 1
elif interval == 2 * 10 ** n:
copy_interval = 2
elif interval == 4 * 10 ** n:
copy_interval = 4
else:
copy_interval = 5
first_val = 0
prev_val = 0
times = 0
temp_log = math.log10(interval)
if math.isclose(lower_bound, 0):
first_val = 0
elif lower_bound < 0:
if upper_bound < -2*interval:
if n < 0:
copy_ub = round(upper_bound*10**(abs(temp_log) + 1))
times = copy_ub // round(interval*10**(abs(temp_log) + 1)) + 2
else:
times = upper_bound // round(interval) + 2
while first_val >= lower_bound:
prev_val = first_val
first_val = times * copy_interval
if n < 0:
first_val *= (10**n)
times -= 1
first_val = prev_val
times += 3
else:
if lower_bound > 2*interval:
if n < 0:
copy_ub = round(lower_bound*10**(abs(temp_log) + 1))
times = copy_ub // round(interval*10**(abs(temp_log) + 1)) - 2
else:
times = lower_bound // round(interval) - 2
while first_val < lower_bound:
first_val = times*copy_interval
if n < 0:
first_val *= (10**n)
times += 1
if n < 0:
retpoints.append(first_val)
else:
retpoints.append(round(first_val))
val = first_val
times = 1
while val <= upper_bound:
val = first_val + times * interval
if n < 0:
retpoints.append(val)
else:
retpoints.append(round(val))
times += 1
retpoints.pop()
return retpoints
When passing in the following three data-points to the function
points = [-0.00493, -0.0003892, -0.00003292]
... the output I get (as a list) is as follows:
[-0.005, -0.004, -0.003, -0.002, -0.001, 0.0]
When passing this:
points = [1.399, 38.23823, 8309.33, 112990.12]
... I get:
[0, 20000, 40000, 60000, 80000, 100000, 120000]
When passing this:
points = [-54, -32, -19, -17, -13, -11, -8, -4, 12, 15, 68]
... I get:
[-60, -40, -20, 0, 20, 40, 60, 80]
... which all seem to be a decent choice of positions for placing ticks.
The function is written to allow 5-10 ticks, but that could easily be changed if you so please.
Whether the list of data supplied contains ordered or unordered data it does not matter, since it is only the minimum and maximum data points within the list that matter.
Upvotes: 2
Reputation: 44603
I've been using the jQuery flot graph library. It's open source and does axis/tick generation quite well. I'd suggest looking at it's code and pinching some ideas from there.
Upvotes: 1
Reputation: 7880
As I didn't like any of the solutions I've found so far, I implemented my own. It's in C# but it can be easily translated into any other language.
It basically chooses from a list of possible steps the smallest one that displays all values, without leaving any value exactly in the edge, lets you easily select which possible steps you want to use (without having to edit ugly if-else if
blocks), and supports any range of values. I used a C# Tuple
to return three values just for a quick and simple demonstration.
private static Tuple<decimal, decimal, decimal> GetScaleDetails(decimal min, decimal max)
{
// Minimal increment to avoid round extreme values to be on the edge of the chart
decimal epsilon = (max - min) / 1e6m;
max += epsilon;
min -= epsilon;
decimal range = max - min;
// Target number of values to be displayed on the Y axis (it may be less)
int stepCount = 20;
// First approximation
decimal roughStep = range / (stepCount - 1);
// Set best step for the range
decimal[] goodNormalizedSteps = { 1, 1.5m, 2, 2.5m, 5, 7.5m, 10 }; // keep the 10 at the end
// Or use these if you prefer: { 1, 2, 5, 10 };
// Normalize rough step to find the normalized one that fits best
decimal stepPower = (decimal)Math.Pow(10, -Math.Floor(Math.Log10((double)Math.Abs(roughStep))));
var normalizedStep = roughStep * stepPower;
var goodNormalizedStep = goodNormalizedSteps.First(n => n >= normalizedStep);
decimal step = goodNormalizedStep / stepPower;
// Determine the scale limits based on the chosen step.
decimal scaleMax = Math.Ceiling(max / step) * step;
decimal scaleMin = Math.Floor(min / step) * step;
return new Tuple<decimal, decimal, decimal>(scaleMin, scaleMax, step);
}
static void Main()
{
// Dummy code to show a usage example.
var minimumValue = data.Min();
var maximumValue = data.Max();
var results = GetScaleDetails(minimumValue, maximumValue);
chart.YAxis.MinValue = results.Item1;
chart.YAxis.MaxValue = results.Item2;
chart.YAxis.Step = results.Item3;
}
Upvotes: 17
Reputation: 6177
This simple algorithm yields an interval that is multiple of 1, 2, or 5 times a power of 10. And the axis range gets divided in at least 5 intervals. The code sample is in java language:
protected double calculateInterval(double range) {
double x = Math.pow(10.0, Math.floor(Math.log10(range)));
if (range / x >= 5)
return x;
else if (range / (x / 2.0) >= 5)
return x / 2.0;
else
return x / 5.0;
}
This is an alternative, for minimum 10 intervals:
protected double calculateInterval(double range) {
double x = Math.pow(10.0, Math.floor(Math.log10(range)));
if (range / (x / 2.0) >= 10)
return x / 2.0;
else if (range / (x / 5.0) >= 10)
return x / 5.0;
else
return x / 10.0;
}
Upvotes: 1
Reputation: 5246
Take the longest of the segments about zero (or the whole graph, if zero is not in the range) - for example, if you have something on the range [-5, 1], take [-5,0].
Figure out approximately how long this segment will be, in ticks. This is just dividing the length by the width of a tick. So suppose the method says that we can put 11 ticks in from -5 to 0. This is our upper bound. For the shorter side, we'll just mirror the result on the longer side.
Now try to put in as many (up to 11) ticks in, such that the marker for each tick in the form i*10*10^n, i*5*10^n, i*2*10^n, where n is an integer, and i is the index of the tick. Now it's an optimization problem - we want to maximize the number of ticks we can put in, while at the same time minimizing the distance between the last tick and the end of the result. So assign a score for getting as many ticks as we can, less than our upper bound, and assign a score to getting the last tick close to n - you'll have to experiment here.
In the above example, try n = 1. We get 1 tick (at i=0). n = 2 gives us 1 tick, and we're further from the lower bound, so we know that we have to go the other way. n = 0 gives us 6 ticks, at each integer point point. n = -1 gives us 12 ticks (0, -0.5, ..., -5.0). n = -2 gives us 24 ticks, and so on. The scoring algorithm will give them each a score - higher means a better method.
Do this again for the i * 5 * 10^n, and i*2*10^n, and take the one with the best score.
(as an example scoring algorithm, say that the score is the distance to the last tick times the maximum number of ticks minus the number needed. This will likely be bad, but it'll serve as a decent starting point).
Upvotes: 3