Reputation: 3551
At a point in time, I need to find all Targets that have points earlier and later than that time.
Currently I am doing the following:
for current_time = sorted_set_of_times;
target_set = find([Targets.First_Time] <= current_time ...
& [Targets.Last_Time] >= current_time);
% Using target_set here. Targets are not modified within this loop
end
MATLAB profiler is telling me that this line takes about 40 minutes with 40,000 calls.
Is there a way that I can do this more efficiently?
current_time
is incremented in a for_loop in order. I was thinking that the fact that this is always increasing could be used to shortcut some of the testing. For example Targets.First_Time <= current_time
will be a subset of Targets.First_Time <= current_time+n
.
numel(sorted_set_of_times)
is approximately 40,000
numel(Targets)
is more than 10 million
Upvotes: 0
Views: 120
Reputation: 21563
I think that a brute force approach will never get much faster with these numbers. (Not by taking out structs or using logical indexing).
If you want a fast algorithm, I think the following can work:
current_time
(already done)First_Time
S
Last_Time
E
If you have two sorted lists you just need to loop through each once to find the relevant point, significantly reducing the complexity of the operation.
Now you can just call your function for each target and use current_time(S:E)
Upvotes: 1
Reputation: 221614
Approach #1
Vectorized solution with bsxfun
-
eqcond = bsxfun(@le,Targets.First_Time,sorted_set_of_times) & bsxfun(@ge,Targets.Last_Time,sorted_set_of_times);
[r,c] = find(eqcond);
for k =1:numel(sorted_set_of_times)
target_set = r(c==k);
end
Suggestion to use logical indexing (if applicable): If you are using target_set
to index into one of the dimensions of some variable named target
, which is of length same as Targets.First_Time
for that dimension, then instead of calculating target_set
you can directly index into that dimension of target
using eqcond(:,ind)
, where ind
would be the index to sorted_set_of_times
in that loop. Thus, as an example, if you were indexing into the rows of target
using target_set
in your original code like target(target_set,..)
, then you can do target(eqcond(:,ind),..)
instead.
Approach #2
Reduced data approach -
vind = find(Targets.First_Time <= Targets.Last_Time); %//Indices to reduce Targets datasize
Targets.First_Time = Targets.First_Time(vind);
Targets.Last_Time = Targets.Last_Time(vind);
for current_time = sorted_set_of_times;
target_set = vind([Targets.First_Time] <= current_time & [Targets.Last_Time] >= current_time);
end
Approach #3 (Hybrid of Approaches #1,2)
vind = find(Targets.First_Time <= Targets.Last_Time);
Targets.First_Time = Targets.First_Time(vind);
Targets.Last_Time = Targets.Last_Time(vind);
eqcond = bsxfun(@le,Targets.First_Time,sorted_set_of_times) & bsxfun(@ge,Targets.Last_Time,sorted_set_of_times);
[r,c] = find(eqcond);
for k =1:numel(sorted_set_of_times)
target_set = vind(r(c==k));
end
Suggestion to use logical indexing (if applicable): On a similar discussion as stated for approach #1, you could perform logical indexing for this one too. Thus, using the same notations as for that discussion, instead of doing target(target_set,..)
, you can do target(vind(eqcond(:,ind)),..)
.
Upvotes: 1
Reputation: 478
I was just looking at the problem description, and I think the use of an Interval Tree or KD Tree might be a better choice of data structure to the type of query you are doing.( Search based on ranges).
As in, let's say you build an interval tree with (startTime, EndTime), then you can execute the query on the ranges. Here's a very good thread on Stackoverflow about such range querying data structures
I don't know matlab, but it does seem to have support for KD-TREES. I couldn't find information about Interval trees in matlab, but here's a good tutorial on the same.
Upvotes: 1
Reputation: 1419
I don't know specific MATLAB instructions, but you can try following, trading memory for CPU:
min_time >= TARGET.First_Time
and max_time <= TARGET.Last_Time
- using binary search.Upvotes: 1