ahmet
ahmet

Reputation: 1125

Find missing date ranges in timeline using java

I have a custom java sync that fetch data by date range thoght SOAP service running on tomcat. Ex:

getDataByDateRange(startDate,endDate)
getDataByDateRange('2016-01-01 10:00:00.00000','2016-01-01 11:00:00.00000')

I want to write a control program to check if any range has been missed by any kind of runtime or server error.

How can I find the missing date ranges?

Thanks.

Visually Example:

TimeLine       : [------------------------------------------------------------------]
Processed Dates: [----1---][---2----]---[-3-][--4---]---[----5---][---6--]-----------
Missing Dates  : -------------------[-1-]-----------[-2-]----------------[-----3----]

TimeLine: 
1: '2016-01-01 10:00:00.00000','2016-02-01 09:00:00.00000'

Processed Dates: 
1: '2016-01-01 10:00:00.00000','2016-01-01 11:00:00.00000'
2: '2016-01-01 11:00:00.00000','2016-01-01 12:00:00.00000'
3: '2016-01-01 13:00:00.00000','2016-01-01 13:30:00.00000'
4: '2016-01-01 13:30:00.00000','2016-01-01 14:30:00.00000'
5: '2016-01-01 15:30:00.00000','2016-01-01 16:30:00.00000'
6: '2016-01-01 16:30:00.00000','2016-01-01 17:00:00.00000'

Missing Dates:
1: '2016-01-01 12:00:00.00000','2016-01-01 13:00:00.00000'
2: '2016-01-01 14:30:00.00000','2016-01-01 15:30:00.00000'
3: '2016-01-01 17:00:00.00000','2016-01-02 09:00:00.00000'

Upvotes: 3

Views: 2372

Answers (3)

Meno Hochschild
Meno Hochschild

Reputation: 44061

According to your comment I post my previous comment as answer. This solution uses my library Time4J (including the range-module):

// prepare parser
ChronoFormatter<PlainTimestamp> f = 
  ChronoFormatter.ofTimestampPattern( // five decimal digits
    "uuuu-MM-dd HH:mm:ss.SSSSS", PatternType.CLDR, Locale.ROOT);

// parse input to intervals - here the overall time window
TimestampInterval timeline = 
  TimestampInterval.between(
    f.parse("2016-01-01 10:00:00.00000"), 
    f.parse("2016-02-01 09:00:00.00000"));

// for more flexibility - consider a for-each-loop
TimestampInterval i1 =
    TimestampInterval.between(f.parse("2016-01-01 10:00:00.00000"), f.parse("2016-01-01 11:00:00.00000"));
TimestampInterval i2 =
    TimestampInterval.between(f.parse("2016-01-01 11:00:00.00000"), f.parse("2016-01-01 12:00:00.00000"));
TimestampInterval i3 =
    TimestampInterval.between(f.parse("2016-01-01 13:00:00.00000"), f.parse("2016-01-01 13:30:00.00000"));
TimestampInterval i4 =
    TimestampInterval.between(f.parse("2016-01-01 13:30:00.00000"), f.parse("2016-01-01 14:30:00.00000"));
TimestampInterval i5 =
    TimestampInterval.between(f.parse("2016-01-01 15:30:00.00000"), f.parse("2016-01-01 16:30:00.00000"));
TimestampInterval i6 =
    TimestampInterval.between(f.parse("2016-01-01 16:30:00.00000"), f.parse("2016-01-01 17:00:00.00000"));

// apply interval arithmetic
IntervalCollection<PlainTimestamp> icoll =
    IntervalCollection.onTimestampAxis().plus(Arrays.asList(i1, i2, i3, i4, i5, i6));
List<ChronoInterval<PlainTimestamp>> missed = icoll.withComplement(timeline).getIntervals();

// result
System.out.println(missed); 
// [[2016-01-01T12/2016-01-01T13), [2016-01-01T14:30/2016-01-01T15:30), [2016-01-01T17/2016-02-01T09)]

The core of the whole interval arithmetic is just done by the code fragment icoll.withComplement(timeline). The rest is only about creation of intervals. By applying a for-each-loop you can surely minimize again the count of lines in presented code.

The output is based on the canonical description of the intervals implicitly using toString(), for example: [2016-01-01T12/2016-01-01T13) The square bracket denotes a closed boundary while the round bracket to the right end denotes an open boundary. So we have here the standard case of half-open timestamp intervals (without timezone). While other interval types are possible I have chosen that type because it corresponds to the type of your input strings.

If you plan to combine this solution with Joda-Time in other parts of your app then keep in mind that a) there is not yet any special bridge between both libraries available and b) the conversion looses microsecond precision (Joda-Time only supports milliseconds) and c) Time4J has much more power than Joda-Time (for almost everything). Anyway, you can do this as conversion (important if you don't want to do the effort of bigger rewriting of your app):

ChronoInterval<PlainTimestamp> missed0 = missed.get(0);
PlainTimestamp tsp = missed0.getStart().getTemporal();
LocalDateTime ldt = // joda-equivalent
    new LocalDateTime(
        tsp.getYear(), tsp.getMonth(), tsp.getDayOfMonth(), 
        tsp.getHour(), tsp.getMinute(), tsp.getSecond(), tsp.get(PlainTime.MILLI_OF_SECOND));
System.out.println(ldt); // 2016-01-01T10:00:00.000

About a Joda-only solution:

Joda-Time does only support instant intervals, not timestamp intervals without timezone. However, you could simulate that missing interval type by hardwiring the timezone to UTC (using fixed offset).

Another problem is missing support for five decimal digits. You can circumvent it by this hack:

DateTime start = 
  DateTimeFormat.forPattern("yyyy-MM-dd HH:mm:ss.SSS")
    .withZoneUTC()
    .parseDateTime("2016-01-01 16:30:00.00000".substring(0, 23));
System.out.println(start); // 2016-01-01T16:30:00.000Z
DateTime end = ...;
Interval interval = new Interval(start, end);

The other more critical element of a solution is almost missing - interval arithmetic. You have to sort the intervals first by start instant (and then by end instant). After sorting, you can iterate over all intervals such that you find the gaps. The best thing Joda-Time can do for you here is giving you methods like isBefore(anotherInstant) etc. which you can use in your own solution. But it gets pretty much bloated.

Upvotes: 3

OldCurmudgeon
OldCurmudgeon

Reputation: 65811

I posted my IntervalTree a while ago - it seems to work well with this kind of problem.

See the minimise method for what you are looking for.

/**
 * Title: IntervlTree
 *
 * Description: Implements a static Interval Tree. i.e. adding and removal are not possible.
 *
 * This implementation uses longs to bound the intervals but could just as easily use doubles or any other linear value.
 *
 * @author OldCurmudgeon
 * @version 1.0
 * @param <T> - The Intervals to work with.
 */
public class IntervalTree<T extends IntervalTree.Interval> {
    // My intervals.

    private final List<T> intervals;
    // My center value. All my intervals contain this center.
    private final long center;
    // My interval range.
    private final long lBound;
    private final long uBound;
    // My left tree. All intervals that end below my center.
    private final IntervalTree<T> left;
    // My right tree. All intervals that start above my center.
    private final IntervalTree<T> right;

    public IntervalTree(List<T> intervals) {
        if (intervals == null) {
            throw new NullPointerException();
        }

        // Initially, my root contains all intervals.
        this.intervals = intervals;

        // Find my center.
        center = findCenter();

        /*
         * Builds lefts out of all intervals that end below my center.
         * Builds rights out of all intervals that start above my center.
         * What remains contains all the intervals that contain my center.
         */
        // Lefts contains all intervals that end below my center point.
        final List<T> lefts = new ArrayList<>();
        // Rights contains all intervals that start above my center point.
        final List<T> rights = new ArrayList<>();

        long uB = Long.MIN_VALUE;
        long lB = Long.MAX_VALUE;
        for (T i : intervals) {
            long start = i.getStart();
            long end = i.getEnd();
            if (end < center) {
                lefts.add(i);
            } else if (start > center) {
                rights.add(i);
            } else {
                // One of mine.
                lB = Math.min(lB, start);
                uB = Math.max(uB, end);
            }
        }

        // Remove all those not mine.
        intervals.removeAll(lefts);
        intervals.removeAll(rights);
        uBound = uB;
        lBound = lB;

        // Build the subtrees.
        left = lefts.size() > 0 ? new IntervalTree<>(lefts) : null;
        right = rights.size() > 0 ? new IntervalTree<>(rights) : null;

        // Build my ascending and descending arrays.
        /**
         * @todo Build my ascending and descending arrays.
         */
    }

    /*
     * Returns a list of all intervals containing the point.
     */
    List<T> query(long point) {
        // Check my range.
        if (point >= lBound) {
            if (point <= uBound) {
                // Gather all intersecting ones.
                List<T> found = intervals
                        .stream()
                        .filter((i) -> (i.getStart() <= point && point <= i.getEnd()))
                        .collect(Collectors.toList());

                // Gather others.
                if (point < center && left != null) {
                    found.addAll(left.query(point));
                }
                if (point > center && right != null) {
                    found.addAll(right.query(point));
                }

                return found;
            } else {
                // To right.
                return right != null ? right.query(point) : Collections.<T>emptyList();
            }
        } else {
            // To left.
            return left != null ? left.query(point) : Collections.<T>emptyList();
        }

    }

    /**
     * Blends the two lists together.
     *
     * If the ends touch then make them one.
     *
     * @param a
     * @param b
     * @return
     */
    static List<Interval> blend(List<Interval> a, List<Interval> b) {
        // Either empty - return the other.
        if (a.isEmpty()) {
            return b;
        }
        if (b.isEmpty()) {
            return a;
        }
        // Where does a end and b start.
        Interval aEnd = a.get(a.size() - 1);
        Interval bStart = b.get(0);
        ArrayList<Interval> blended = new ArrayList<>();
        // Do they meet/cross?
        if (aEnd.getEnd() >= bStart.getStart() - 1) {
            // Yes! merge them.
            // Remove the last.
            blended.addAll(a.subList(0, a.size() - 1));
            // Add a combined one.
            blended.add(new SimpleInterval(aEnd.getStart(), bStart.getEnd()));
            // Add all but the first.
            blended.addAll(b.subList(1, b.size()));
        } else {
            // Just join them.
            blended.addAll(a);
            blended.addAll(b);
        }
        return blended;
    }

    static List<Interval> blend(List<Interval> a, List<Interval> b, List<Interval>... more) {
        List<Interval> blended = blend(a, b);
        for (List<Interval> l : more) {
            blended = blend(blended, l);
        }
        return blended;
    }

    List<Interval> minimise() {
        // Calculate min of left and right.
        List<Interval> minLeft = left != null ? left.minimise() : Collections.EMPTY_LIST;
        List<Interval> minRight = right != null ? right.minimise() : Collections.EMPTY_LIST;
        // My contribution.
        long meLeft = minLeft.isEmpty() ? lBound : Math.max(lBound, minLeft.get(minLeft.size() - 1).getEnd());
        long meRight = minRight.isEmpty() ? uBound : Math.min(uBound, minRight.get(0).getEnd());
        return blend(minLeft,
                Collections.singletonList(new SimpleInterval(meLeft, meRight)),
                minRight);
    }

    private long findCenter() {
        //return average();
        return median();
    }

    protected long median() {
        if (intervals.isEmpty()) {
            return 0;
        }
        // Choose the median of all centers. Could choose just ends etc or anything.
        long[] points = new long[intervals.size()];
        int x = 0;
        for (T i : intervals) {
            // Take the mid point.
            points[x++] = (i.getStart() + i.getEnd()) / 2;
        }
        Arrays.sort(points);
        return points[points.length / 2];
    }

    /*
     * What an interval looks like.
     */
    public interface Interval {

        public long getStart();

        public long getEnd();
    }

    /*
     * A simple implemementation of an interval.
     */
    public static class SimpleInterval implements Interval {

        private final long start;
        private final long end;

        public SimpleInterval(long start, long end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public long getStart() {
            return start;
        }

        @Override
        public long getEnd() {
            return end;
        }

        @Override
        public String toString() {
            return "{" + start + "," + end + "}";
        }
    }

    /**
     * Not called by App, so you will have to call this directly.
     *
     * @param args
     */
    public static void main(String[] args) {
        /**
         * @todo Needs MUCH more rigorous testing.
         */
        // Test data.
        long[][] data = {
            {1, 4}, {2, 5}, {5, 7}, {10, 11}, {13, 20}, {19, 21},};
        List<Interval> intervals = new ArrayList<>();
        for (long[] pair : data) {
            intervals.add(new SimpleInterval(pair[0], pair[1]));
        }
        // Build it.
        IntervalTree<Interval> test = new IntervalTree<>(intervals);

        // Test it.
        System.out.println("Normal test: ---");
        for (long i = 0; i < 10; i++) {
            List<Interval> intersects = test.query(i);
            System.out.println("Point " + i + " intersects:");
            intersects.stream().forEach((t) -> {
                System.out.println(t.toString());
            });
        }

        // Check minimise.
        List<Interval> min = test.minimise();
        System.out.println("Minimise test: ---");
        System.out.println(min);

        // Check for empty list.
        intervals.clear();
        test = new IntervalTree<>(intervals);
        // Test it.
        System.out.println("Empty test: ---");
        for (long i = 0; i < 10; i++) {
            List<Interval> intersects = test.query(i);
            System.out.println("Point " + i + " intersects:");
            intersects.stream().forEach((t) -> {
                System.out.println(t.toString());
            });
        }

    }
}

This gets close to what you are looking for. Here's some code to minimise your ranges into just 3.

static String[][] dates = {{"2016-01-01 10:00:00.00000", "2016-01-01 11:00:00.00000"}, {"2016-01-01 11:00:00.00000", "2016-01-01 12:00:00.00000"}, {"2016-01-01 13:00:00.00000", "2016-01-01 13:30:00.00000"}, {"2016-01-01 13:30:00.00000", "2016-01-01 14:30:00.00000"}, {"2016-01-01 15:30:00.00000", "2016-01-01 16:30:00.00000"}, {"2016-01-01 16:30:00.00000", "2016-01-01 17:00:00.00000"}};
static List<IntervalTree.SimpleInterval> ranges = new ArrayList<>();
static final DateFormat df = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.S");

static {
    for (String[] pair : dates) {
        try {
            ranges.add(new IntervalTree.SimpleInterval(df.parse(pair[0]).getTime(), df.parse(pair[1]).getTime()));
        } catch (ParseException ex) {
            Logger.getLogger(Test.class.getName()).log(Level.SEVERE, null, ex);
        }
    }
}

public void test() {
    IntervalTree tree = new IntervalTree<>(ranges);
    List<IntervalTree.Interval> min = tree.minimise();
    //System.out.println("min->" + min);
    for (IntervalTree.Interval i : min) {
        System.out.println(df.format(new Date(i.getStart())) + " - " + df.format(new Date(i.getEnd())));
    }
}

which prints

2016-01-01 10:00:00.0 - 2016-01-01 12:00:00.0
2016-01-01 13:00:00.0 - 2016-01-01 14:30:00.0
2016-01-01 15:30:00.0 - 2016-01-01 17:00:00.0

which is all of your Processed Dates joined into three date ranges.

Upvotes: 1

Darshan Mehta
Darshan Mehta

Reputation: 30819

Given that the frequency of date ranges is one hour, you can start with range start date, iterate till range end date and write a method that checks for an entry with dates. You can use DateUtils to add hour to date, as shown in the below pseudo code:

Date startDate  = startDate;
Date endDate = endDate;
while (startDate.before(endDate){
    if(!exists(startDate, DateUtils.addHours(startDate, 1), entries)){
        //Add into missing entries
    }
    startDate = DateUtils.addHours(startDate, 1);
}

Upvotes: 1

Related Questions