Reputation: 5569

Identify gaps in repeated sequences

I have a vector that should contain n sequences from 00 to 11

A = [00;01;02;03;04;05;06;07;08;09;10;11;00;01;02;03;04;05;06;07;08;09;10;11]

and I would like to check that the sequence "00 - 11 " is always respected (no missing values).

For example, if

A =[00;01;02;  04;05;06;07;08;09;10;11;00;01;02;03;04;05;06;07;08;09;10;11]

(03 is missing after the 3rd element), for each missing value I would like to get this information back in another vector:

missing=
 [value_1,position_1;
 value_2, position_2;
 etc, etc]

Can you help me?

Upvotes: 2

Answers (3)

chappjc

Reputation: 30589

This will give you the missing values and their positions in the full sequence:

N = 11; % specify the repeating 0:N sub-sequence
n = 3; % reps of sub-sequence
A = [5 6 7 8 9 10 3 4 5 6 7 8 9 10 11 0 1 2 3 4 6 7 8]'; % test data from s.bandara's answer, as a column

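da = diff([A; N+1]); % append N+1 so a truncated final cycle is detected
skipLocs = find(~(da==1 | da==-N)); % indices in A that are followed by a gap
skipLength = da(skipLocs)-1; % number of missing values in each gap
skipLength(skipLength<0) = N + skipLength(skipLength<0) + 1; % gaps that wrap past N
firstSkipVal = A(skipLocs)+1; % first missing value of each gap (wrapped by patchFun)

% missing values of each gap, wrapped back into 0:N
patchFun = @(x,y)(0:y)'+x - (N+1)*(((0:y)'+x)>N);
patches = arrayfun(patchFun,firstSkipVal,skipLength-1,'uni',false);
% 1-based locations of those values in the full sequence
locs = arrayfun(@(x,y)(x:x+y)',skipLocs+cumsum([A(1); skipLength(1:end-1)])+1,...
    skipLength-1,'uni',false);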

Then putting them together, including any missing values at the beginning:

>> gapMap = [vertcat(patches{:}) vertcat(locs{:})-1]; % not including lead
>> gapMap = [repmat((0 : A(1) - 1)',1,2); gapMap] % including lead
gapMap =
     0     0
     1     1
     2     2
     3     3
     4     4
    11    11
     0    12
     1    13
     2    14
     5    29
     9    33
    10    34
    11    35

The first column contains the missing values. The second column is the 0-based location in the hypothetical full sequence, which can be checked against that sequence:

>> Afull = repmat(0:N,1,n);
>> isequal(gapMap(:,1), Afull(gapMap(:,2)+1)')
ans =
     1

Upvotes: 1

s.bandara

Reputation: 5664

We know that the last element must be 11, so we can check for this first, which makes testing all the previous elements easier. Once A is guaranteed to end in 11, the "element-wise change" approach below is valid. The same holds for the beginning (it must start with 0), but changing A there would shift the indices, so it is better to handle that later.

missing = [];
if A(end) ~= 11
    missing = [missing; 11, length(A) + 1];
    A = [A(:); 11]; % A(:) lets this work for both row and column vectors
end

Then we can calculate the change from one element to the next and identify the gap positions:

dA = A(2:end) - A(1:end-1);
idx_gap = find((dA~=1) & (dA~=-11));

Now we need to expand all missing indices and expected values, using ev for the expected value. ev starts at the value just before the gap and is incremented for each missing element, as in

for k = 1 : length(idx_gap)
    ev = A(idx_gap(k));

The number of elements to fill in is the change dA at that position minus one (a change of one means no gap). This count can wrap around if the gap spans the boundary between two segments, so we take it modulo 12.

    for n = 1 : mod(dA(idx_gap(k)) - 1, 12)
        ev = mod(ev + 1, 12);
        missing = [missing; ev, idx_gap(k) + 1];
    end
end
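
As a quick check of the wrap-around count, here is a minimal sketch for a single gap, assuming the 0..11 cycle from the question. The names before, after, nMissing and vals are illustrative only and do not appear in the code above.

```matlab
% Worked example: the gap between 10 and 3 crosses a cycle boundary.
before = 10;                             % last value seen before the gap
after  = 3;                              % first value seen after the gap
nMissing = mod(after - before - 1, 12);  % mod(-8, 12) = 4 missing values
vals = mod(before + (1:nMissing), 12);   % 11 0 1 2
```

MATLAB's mod returns a result with the sign of the divisor, which is why the negative difference still gives a valid count.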

As a test, consider A = [5 6 7 8 9 10 3 4 5 6 7 8 9 10 11 0 1 2 3 4 6 7 8]. In this case the initial check fires: it records the missing final 11 and extends A to [5 6 ... 7 8 11]. missing then contains

11    24    % recognizes improper termination of A.
11     7
 0     7    % properly handles wrap-over here.
 1     7
 2     7
 5    21    % recognizes single element as missing.
 9    24
10    24

which should be what you are expecting. What is still missing is the beginning of A, so use missing = [(0 : A(1) - 1)', ones(A(1), 1); missing]; to prepend the leading values, each reported at position 1, and complete the list.
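
Putting the fragments above together, one possible consolidated version might look like the following. This is a sketch under assumptions: A is forced to a column with A(:) so that row or column input works, the inner loop counter is renamed j, and lead is a helper name introduced here for illustration.

```matlab
% Consolidated sketch of the steps above (assumes a 0..11 cycle).
A = [5 6 7 8 9 10 3 4 5 6 7 8 9 10 11 0 1 2 3 4 6 7 8];
A = A(:);                 % work on a column vector
lead = A(1);              % count of values missing before the first element

missing = [];
if A(end) ~= 11
    missing = [missing; 11, numel(A) + 1];  % sequence must end with 11
    A = [A; 11];
end

dA = diff(A);                               % change between neighbours
idx_gap = find(dA ~= 1 & dA ~= -11);        % positions followed by a gap
for k = 1:numel(idx_gap)
    ev = A(idx_gap(k));                     % value just before the gap
    for j = 1:mod(dA(idx_gap(k)) - 1, 12)   % number of missing values
        ev = mod(ev + 1, 12);               % next expected value, wrapped
        missing = [missing; ev, idx_gap(k) + 1];
    end
end

% Prepend the values missing before A(1), all reported at position 1.
missing = [(0:lead-1)', ones(lead, 1); missing];
```

The positions in the second column refer to indices in the given A (after the optional trailing 11 is appended), not to the completed sequence.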

Upvotes: 2

Luis Mendo

Reputation: 112749

Although this doesn't solve your problem completely, you can identify the position of missing values, or of groups of contiguous missing values, like this:

ind = 1+find(~ismember(diff(A),[1 -11]));

ind gives the position with respect to the current sequence A, not to the completed sequence.

For example, with

A =[00;01;02;  04;05;06;07;08;09;10;11;00;01;02;03;    ;06;07;08;09;10;11];

this gives

>> ind = 1+find(~ismember(diff(A),[1 -11]))

ind =

     4
    16

Upvotes: 0
