Bowecho

Reputation: 909

Find timeline for duration values in Matlab

I have the following time-series:

b = [2 5 110 113 55 115 80 90 120 35 123];

Each number in b is one data point at a time instant. I computed the duration values from b. A duration is a run of consecutive numbers in b that are greater than or equal to 100 (all other numbers are discarded). A gap of at most one number smaller than 100 is allowed within a run. This is what the code for duration looks like:

 N = 2;     % a run of N or more values below 100 ends a duration
 duration = cellfun(@numel, regexp(char((b>=100)+'0'), [repmat('0',1,N) '+'], 'split'));

giving the following duration values for b:

duration = [4 3]; 
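
A note on reproducing this: with the 'split' option, regexp also returns the piece before the first match, which is empty here because b starts with values below 100. The raw output of the line above should therefore be [0 4 3] rather than [4 3]. A minimal sketch that drops the empty pieces, assuming only nonzero durations are of interest (the names `pieces` and `raw` are illustrative and not part of the original code):

```matlab
b = [2 5 110 113 55 115 80 90 120 35 123];
N = 2;                               % N or more values below 100 in a row split the series
B = char((b >= 100) + '0');          % string of '0' and '1', one character per sample
pieces = regexp(B, [repmat('0', 1, N) '+'], 'split');
raw = cellfun(@numel, pieces);       % lengths of all pieces, including empty ones
duration = raw(raw > 0);             % keep only the actual runs
```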

I want to find the positions (time-lines) within b for each value in duration. Next, I want to replace the other positions located outside duration with zeros. The result would look like this:

result = [0 0 3 4 5 6 0 0 9 10 11]; 

If anyone could help, it would be great.

Upvotes: 0

Views: 48

Answers (2)

Luis Mendo

Reputation: 112669

Answer to original question: pattern with at most one value below 100

Here's an approach using a regular expression to detect the desired pattern. I'm assuming that one value <100 is allowed only between (not after) values >=100. So the pattern is: one or more values >=100, optionally followed by a single value <100 and one or more further values >=100.

b = [2 5 110 113 55 115 80 90 120 35 123]; %// data
B = char((b>=100)+'0'); %// convert to string of '0' and '1'
[s, e] = regexp(B, '1+(.1+|)', 'start', 'end'); %// find pattern
y = 1:numel(B);
c = any(bsxfun(@ge, y, s(:)) & bsxfun(@le, y, e(:))); %// filter by locations of pattern
y = y.*c; %// result

This gives

y =
     0     0     3     4     5     6     0     0     9    10    11
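
Since s and e hold the start and end index of each matched run, the duration values from the question can be read off the same match. The following is a minimal sketch under that assumption; it also fills the result with a plain loop over the runs instead of bsxfun, which is a swapped-in alternative rather than the approach above, and the name `duration` is taken from the question:

```matlab
b = [2 5 110 113 55 115 80 90 120 35 123];
B = char((b >= 100) + '0');                      % string of '0' and '1'
[s, e] = regexp(B, '1+(.1+|)', 'start', 'end');  % start/end index of each run
duration = e - s + 1;                            % length of each run
y = zeros(size(b));
for k = 1:numel(s)
    y(s(k):e(k)) = s(k):e(k);                    % write the positions of run k
end
```

For the example data this should give s = [3 9], e = [6 11], duration = [4 3] and the same y as above.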

Answer to edited question: pattern with at most n values in a row below 100

The regexp needs to be modified, and it has to be dynamically built as a function of n:

b = [2 5 110 113 55 115 80 90 120 35 123]; %// data
n = 2;
B = char((b>=100)+'0'); %// convert to string of '0' and '1'
r = sprintf('1+(.{1,%i}1+)*', n); %// build the regular expression from n
[s, e] = regexp(B, r, 'start', 'end'); %// find pattern
y = 1:numel(B);
c = any(bsxfun(@ge, y, s(:)) & bsxfun(@le, y, e(:))); %// filter by locations of pattern
y = y.*c; %// result
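
To see how n changes the outcome, here is a small sketch that runs the same pattern for n = 1 and n = 2 on the example data. The cell array `results` and the loop-based fill are illustrative choices, not part of the code above:

```matlab
b = [2 5 110 113 55 115 80 90 120 35 123];
B = char((b >= 100) + '0');                 % string of '0' and '1'
results = cell(1, 2);
for n = 1:2
    r = sprintf('1+(.{1,%i}1+)*', n);       % allow up to n values between runs of '1'
    [s, e] = regexp(B, r, 'start', 'end');
    y = zeros(size(b));
    for k = 1:numel(s)
        y(s(k):e(k)) = s(k):e(k);           % positions covered by match k
    end
    results{n} = y;
end
```

With n = 1 the two runs at positions 3 to 6 and 9 to 11 stay separate; with n = 2 the two values at positions 7 and 8 are bridged, so the whole span from 3 to 11 becomes a single run.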

Upvotes: 1

zeeMonkeez

Reputation: 5157

Here is another solution, not using regexp. It naturally generalizes to arbitrary gap sizes and thresholds. Not sure whether there is a better way to fill the gaps. Explanation in comments:

% maximum step size between valid indices (N = 2 allows a gap of one value) and threshold
N = 2;
threshold = 100;
% data
b = [2 5 110 113 55 115 80 90 120 35 123];

% find valid data
B = b >= threshold;
B_ind = find(B);
% find lengths of gaps
step_size = diff(B_ind);
% find acceptable steps (and ignore step size 1)
permissible_steps = 1 < step_size & step_size <= N;
% find beginning and end of runs
good_begin = B_ind([permissible_steps, false]);
good_end = good_begin + step_size(permissible_steps);
% fill gaps in B
for ii = 1:numel(good_begin)
    B(good_begin(ii):good_end(ii)) = true;
end
% find durations of runs in B. This finds points where we switch from 0 to
% 1 and vice versa. Due to padding the first match is always a start of a
% run, the last one always an end. There will be an even number of matches,
% so we can reshape and diff and thus find the durations
durations = diff(reshape(find(diff([false, B, false])), 2, []));

% get positions of 'good' data
outpos = zeros(size(b));
outpos(B) = find(B);
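
On the open question of filling the gaps without a loop, one possible sketch marks each gap's start with +1 and the position after its end with -1, then takes a cumulative sum. This is a suggestion under the same definitions as above, not a tested replacement; the names `delta` and `filled` are made up for illustration:

```matlab
N = 2;                                   % maximum step size between valid indices
threshold = 100;
b = [2 5 110 113 55 115 80 90 120 35 123];

B = b >= threshold;
B_ind = find(B);
step_size = diff(B_ind);
permissible_steps = 1 < step_size & step_size <= N;
good_begin = B_ind([permissible_steps, false]);
good_end = good_begin + step_size(permissible_steps);

% +1 where a span to fill begins, -1 just past where it ends
delta = zeros(1, numel(B) + 1);
delta(good_begin) = 1;
delta(good_end + 1) = delta(good_end + 1) - 1;
filled = B | (cumsum(delta(1:end-1)) > 0);   % true inside any span or on valid data

outpos = zeros(size(b));
outpos(filled) = find(filled);
```

The cumulative sum is positive exactly on the indices covered by at least one begin/end span, so adjacent spans merge without special handling.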

Upvotes: 0
