Reputation: 6222
The Problem:
I have a list of data with time and value (time = long milliSec and double value). I now need to calculate several averages within different time ranges.
I get up to 50 values per second but sometimes only a few values and need to hold the last 10 seconds, so 500 values.
What I want is: calculate average of values where time >= begin and time <= end.
I can ensure no time is double, so it could be used as a key.
Currently I use an array to store the values and have a position marker which gets reset to 0 once 500 is reached, so old entries are recyled. I can change that easily.
I was not sure what the fastest approach would be, e.g. manual search of array or using a list, hashMap, Collection (with comparator?) or else. I could not find a (java) list-like function where I hava a build-in search either "key >= x" or "value >=x".
Performance is more important than nice or easy coding.
Would be nice to be pointed in the right direction.
I compute the average of the last 10 values each time a new value comes in, that is 30-50 calculations per second alone and is the most important data. I need to distinguish small errors in measurement from actual changes. I additional calculate the average of each 1/10th of a second (this may be dropped) and finally the average of a second and the average of last 10 seconds. That are additional 12 average calculation per second. Reducing the number of calculations is not really an option.
As this is a bit abstract, here is an example of what the data looks like (where avg is calculated of last 10 values, but that is not the program logic).
value Avg timeReading timeReadingISO
1024,6668701172 - 1385408750828 2013-11-25 19:45:50
1024,6668701172 - 1385408751350 2013-11-25 19:45:51
1024,6668701172 - 1385408751859 2013-11-25 19:45:51
1024,6683349609 - 1385408752373 2013-11-25 19:45:52
1024,6683349609 - 1385408752878 2013-11-25 19:45:52
1024,6689453125 - 1385408753385 2013-11-25 19:45:53
1024,6689453125 - 1385408753895 2013-11-25 19:45:53
1024,6721191406 - 1385408754406 2013-11-25 19:45:54
1024,6721191406 - 1385408754912 2013-11-25 19:45:54
1024,6774902344 - 1385408755432 2013-11-25 19:45:55
1024,6774902344 1024,67 1385408755994 2013-11-25 19:45:55
1024,6774902344 1024,67 1385408756502 2013-11-25 19:45:56
1024,6837158203 1024,67 1385408757012 2013-11-25 19:45:57
1024,6837158203 1024,67 1385408757520 2013-11-25 19:45:57
1024,689453125 1024,68 1385408758028 2013-11-25 19:45:58
1024,689453125 1024,68 1385408758536 2013-11-25 19:45:58
1024,6938476563 1024,68 1385408759055 2013-11-25 19:45:59
1024,6938476563 1024,68 1385408759560 2013-11-25 19:45:59
1024,6990966797 1024,68 1385408760075 2013-11-25 19:46:00
1024,6990966797 1024,69 1385408760579 2013-11-25 19:46:00
1024,7038574219 1024,69 1385408761086 2013-11-25 19:46:01
1024,7038574219 1024,69 1385408761596 2013-11-25 19:46:01
1024,7111816406 1024,69 1385408762103 2013-11-25 19:46:02
1024,7111816406 1024,70 1385408762606 2013-11-25 19:46:02
1024,7111816406 1024,70 1385408763112 2013-11-25 19:46:03
1024,7111816406 1024,70 1385408763622 2013-11-25 19:46:03
1024,7172851563 1024,70 1385408764128 2013-11-25 19:46:04
1024,7172851563 1024,71 1385408764637 2013-11-25 19:46:04
1024,7208251953 1024,71 1385408765149 2013-11-25 19:46:05
1026,5457763672 - 1385474621756 2013-11-26 14:03:41
1026,6057128906 - 1385474621790 2013-11-26 14:03:41
1026,6257324219 - 1385474621823 2013-11-26 14:03:41
1026,6057128906 - 1385474621858 2013-11-26 14:03:41
1026,6257324219 - 1385474621890 2013-11-26 14:03:41
1026,6257324219 - 1385474621921 2013-11-26 14:03:41
1026,6057128906 - 1385474621956 2013-11-26 14:03:41
1026,5457763672 - 1385474621988 2013-11-26 14:03:41
1026,6557617188 - 1385474622022 2013-11-26 14:03:42
1026,6657714844 - 1385474622057 2013-11-26 14:03:42
1026,6257324219 1026,61 1385474622090 2013-11-26 14:03:42
1026,6057128906 1026,62 1385474622123 2013-11-26 14:03:42
1026,6657714844 1026,62 1385474622159 2013-11-26 14:03:42
1026,6557617188 1026,62 1385474622193 2013-11-26 14:03:42
1026,6557617188 1026,63 1385474622227 2013-11-26 14:03:42
1026,6257324219 1026,63 1385474622260 2013-11-26 14:03:42
1026,6257324219 1026,63 1385474622298 2013-11-26 14:03:42
1026,6557617188 1026,63 1385474622330 2013-11-26 14:03:42
1026,6257324219 1026,64 1385474622365 2013-11-26 14:03:42
1026,6257324219 1026,64 1385474622401 2013-11-26 14:03:42
1026,6257324219 1026,64 1385474622431 2013-11-26 14:03:42
1026,5758056641 1026,64 1385474622466 2013-11-26 14:03:42
1026,6057128906 1026,63 1385474622501 2013-11-26 14:03:42
1026,5457763672 1026,63 1385474622533 2013-11-26 14:03:42
1026,5457763672 1026,62 1385474622565 2013-11-26 14:03:42
1026,6057128906 1026,61 1385474622599 2013-11-26 14:03:42
1026,6057128906 1026,60 1385474622631 2013-11-26 14:03:42
1026,5758056641 1026,60 1385474622665 2013-11-26 14:03:42
1026,5457763672 1026,59 1385474622702 2013-11-26 14:03:42
1026,6057128906 1026,59 1385474622734 2013-11-26 14:03:42
1026,6557617188 1026,58 1385474622766 2013-11-26 14:03:42
1026,5758056641 1026,59 1385474622800 2013-11-26 14:03:42
1026,6057128906 1026,59 1385474622836 2013-11-26 14:03:42
1026,6057128906 1026,59 1385474622868 2013-11-26 14:03:42
1026,5158691406 1026,59 1385474622901 2013-11-26 14:03:42
1026,5457763672 1026,59 1385474622935 2013-11-26 14:03:42
1026,6856689453 1026,58 1385474622966 2013-11-26 14:03:42
Upvotes: 5
Views: 1600
Reputation: 6017
Caching groups of averages in Maciej's answer is an efficient approach. A simple approach for your current lists would be Java's SortedSet, which is an interface so you'll end up using a TreeSet.
Create a Comparable
object to store your time and value, or create a Comparator
for the SortedSet. Make sure you're sorted based on the time (not the value).
public class Holder implements Comparable
{
private double time, value;
public Holder (double t, double v)
{
this.time = t;
this.value = v;
}
public double getValue()
{ return this.value; }
public double getTime()
{ return this.time; }
//You may want a better comparator.
public int compareTo( Holder h )
{
return this.getTime < h.getTime() ? -1 : 1;
}
}
Just add your values as normal for a Collection, they will be automatically sorted based on time. When you want an average of the last 10s, find the current time and call sortedSet.tailSet( new CustomObject( currentTime - 10000 ) )
. Now iterate over the returned collection to compute your average.
SortedSet<Holder> slice = allHolders.tailset( new Holder( currentTime - 10000 ) );
double sum = 0.0;
for( Holder h : slice )
{
sum += h.getValue();
}
double result = sum / slice.size();
You can find groups of time with .subSet()
if you feel the averaging call has latency.
Upvotes: 0
Reputation: 12130
First of all, when calculating average you should create a copy of the structure (or use one that is threadsafe and traversing it during addition or removal would not cause pain) unless you do everything in one thread.
I guess that elements in your collection are aready sorted since you sequentially receive updates (if not look for equivalent of sorted Lists).
My approach would be to select the smallest interval of your average measurement. Let's say 10 values. Then you can create 50 collections (of size 10) where every one of them were of class which delivers you method to calculate average. Then you can select which average you want to count. Just count average of collections averages sum. What is more - once calculated average for given collection won't change so you can cache it
Note that you don't have to transfer value from one collection to another since your minimal interval is already handled. If new 10 elements come to the buffer you just can just reassign references.
/* initializing */
MySlicedCollection buffer = new MySlicedCollection();
MySlicedCollection[] mscArray = new MySlicedCollection[50];
/* when every 10 values came in */
for(int i = mscArray.length-1; i > 0 ; --i) {
mscArray[i] = mscArray[i-1];
}
mscArray[0] = buffer;
buffer = new MySlicedCollection();
/* avg of all collection */
for(MySlicedCollection msc : mscArray) {
sum += msc.getAverage();
}
sum /= 50;
You should also think of counting averages using previous results. If you have to count avg for 1sec and 2sec then you can just add remaining average to already counted avg for one second and divide it by 2.
/* avg for one second */
for(int i = 0; i < 5; ++i) {
sumOneSec += mscArray[i].getAverage();
}
sumOneSec /= 5;
/* avg for two seconds */
for(int i = 5; i < 10; ++i) {
sumTwoSec += mscArray[i].getAverage();
}
sumTwoSec = ((sumTwoSec/5) + sumOneSec) / 2;
But remember what someone said: "First measure then act" - maybe your performance is sufficent?
avg = (avg * 50 - oldestValue + newValue)/50;
unfortunately this will introduce a little mistake to your computation due to floating point variables finite representation, but since you're using double values I think you do not need such precision. Similar solution can provided to another averages but this will require more thinking :)
Upvotes: 1