Samuel O'Malley
Samuel O'Malley

Reputation: 3551

Finding a set of targets that overlap a point in time

At a point in time, I need to find all Targets that have points earlier and later than that time.

Currently I am doing the following:

for current_time = sorted_set_of_times;
    target_set = find([Targets.First_Time] <= current_time ... 
        & [Targets.Last_Time] >= current_time);
    % Using target_set here. Targets are not modified within this loop
end

MATLAB profiler is telling me that this line takes about 40 minutes with 40,000 calls.

Is there a way that I can do this more efficiently?

current_time is incremented in a for_loop in order. I was thinking that the fact that this is always increasing could be used to shortcut some of the testing. For example Targets.First_Time <= current_time will be a subset of Targets.First_Time <= current_time+n.

numel(sorted_set_of_times) is approximately 40,000

numel(Targets) is more than 10 million

Upvotes: 0

Views: 120

Answers (4)

Dennis Jaheruddin
Dennis Jaheruddin

Reputation: 21563

I think that a brute force approach will never get much faster with these numbers. (Not by taking out structs or using logical indexing).

If you want a fast algorithm, I think the following can work:

  1. Sort current_time (already done)
  2. Sort target by First_Time
  3. Find for all targets the starting point S
  4. Sort by Last_Time
  5. Find for all targets the endpoint E

If you have two sorted lists you just need to loop through each once to find the relevant point, significantly reducing the complexity of the operation.

Now you can just call your function for each target and use current_time(S:E)

Upvotes: 1

Divakar
Divakar

Reputation: 221614

Approach #1

Vectorized solution with bsxfun -

eqcond = bsxfun(@le,Targets.First_Time,sorted_set_of_times) & bsxfun(@ge,Targets.Last_Time,sorted_set_of_times);
[r,c] = find(eqcond);
for k =1:numel(sorted_set_of_times)
    target_set = r(c==k);
end

Suggestion to use logical indexing (if applicable): If you are using target_set to index into one of the dimensions of some variable named target, which is of length same as Targets.First_Time for that dimension, then instead of calculating target_set you can directly index into that dimension of target using eqcond(:,ind), where ind would be the index to sorted_set_of_times in that loop. Thus, as an example, if you were indexing into the rows of target using target_set in your original code like target(target_set,..), then you can do target(eqcond(:,ind),..) instead.

Approach #2

Reduced data approach -

vind = find(Targets.First_Time <= Targets.Last_Time); %//Indices to reduce Targets datasize
Targets.First_Time = Targets.First_Time(vind);
Targets.Last_Time = Targets.Last_Time(vind);
for current_time = sorted_set_of_times;
    target_set = vind([Targets.First_Time] <= current_time & [Targets.Last_Time] >= current_time);
end

Approach #3 (Hybrid of Approaches #1,2)

vind = find(Targets.First_Time <= Targets.Last_Time);
Targets.First_Time = Targets.First_Time(vind);
Targets.Last_Time = Targets.Last_Time(vind);
eqcond = bsxfun(@le,Targets.First_Time,sorted_set_of_times) & bsxfun(@ge,Targets.Last_Time,sorted_set_of_times);
[r,c] = find(eqcond);
for k =1:numel(sorted_set_of_times)
    target_set = vind(r(c==k));
end

Suggestion to use logical indexing (if applicable): On a similar discussion as stated for approach #1, you could perform logical indexing for this one too. Thus, using the same notations as for that discussion, instead of doing target(target_set,..), you can do target(vind(eqcond(:,ind)),..).

Upvotes: 1

Arvind
Arvind

Reputation: 478

I was just looking at the problem description, and I think the use of an Interval Tree or KD Tree might be a better choice of data structure to the type of query you are doing.( Search based on ranges).

As in, let's say you build an interval tree with (startTime, EndTime), then you can execute the query on the ranges. Here's a very good thread on Stackoverflow about such range querying data structures

I don't know matlab, but it does seem to have support for KD-TREES. I couldn't find information about Interval trees in matlab, but here's a good tutorial on the same.

Upvotes: 1

kilotaras
kilotaras

Reputation: 1419

I don't know specific MATLAB instructions, but you can try following, trading memory for CPU:

  1. Iterate over all targets as TARGET.
  2. Find min_time >= TARGET.First_Time and max_time <= TARGET.Last_Time - using binary search.
  3. Add TARGET to target_set for each time between min_time and max_time.
  4. Iterate times as before, but use precomputed target_sets.

Upvotes: 1

Related Questions