Reputation: 5569
I have a vector that should contain n repetitions of the sequence 00 to 11:
A = [00;01;02;03;04;05;06;07;08;09;10;11;00;01;02;03;04;05;06;07;08;09;10;11]
and I would like to check that the sequence "00 - 11" is always respected (no missing values).
For example, if
A =[00;01;02; 04;05;06;07;08;09;10;11;00;01;02;03;04;05;06;07;08;09;10;11]
(03 is missing after the 3rd element), then for each missing value I would like to get this information back in another vector:
missing=
[value_1,position_1;
value_2, position_2;
etc, etc]
Can you help me?
Upvotes: 2
Views: 126
Reputation: 30589
This will give you the missing values and their positions in the full sequence:
N = 11; % specify the repeating 0:N sub-sequence
n = 3; % reps of sub-sequence
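A = [5 6 7 8 9 10 3 4 5 6 7 8 9 10 11 0 1 2 3 4 6 7 8]'; % test vector from s.bandara's answer, as a column
da = diff([A; N+1]); % EDITED to include missing end
skipLocs = find(~(da==1 | da==-N)); % steps that are neither +1 nor the wrap from N to 0
skipLength = da(skipLocs)-1; % number of values skipped at each gap
skipLength(skipLength<0) = N + skipLength(skipLength<0) + 1; % gaps that wrap past N
firstSkipVal = A(skipLocs)+1; % first missing value of each gap (patchFun wraps it if it exceeds N)
patchFun = @(x,y)(0:y)'+x - (N+1)*(((0:y)'+x)>N); % y+1 consecutive values starting at x, wrapped into 0:N
patches = arrayfun(patchFun,firstSkipVal,skipLength-1,'uni',false); % missing values, one cell per gap
locs = arrayfun(@(x,y)(x:x+y)',skipLocs+cumsum([A(1); skipLength(1:end-1)])+1,...
skipLength-1,'uni',false); % 1-based positions of the missing values in the full sequence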
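To make the intermediate quantities concrete, here is a small sketch that recomputes them for the same test vector. The expected values in the comments were worked out by hand from the code above, so treat them as an illustration rather than part of the original answer.

```matlab
N = 11;
A = [5 6 7 8 9 10 3 4 5 6 7 8 9 10 11 0 1 2 3 4 6 7 8]';

da = diff([A; N+1]);                     % N+1 appended so a missing tail is detected
skipLocs = find(~(da==1 | da==-N));      % [6; 20; 23]: after the 10, the 4 and the final 8
skipLength = da(skipLocs) - 1;           % [-8; 1; 3] before the wrap correction
skipLength(skipLength<0) = N + skipLength(skipLength<0) + 1;   % [4; 1; 3]

disp([skipLocs skipLength])              % gap position in A, number of missing values
```

The three gaps correspond to the missing runs 11,0,1,2 (four values), 5 (one value) and 9,10,11 (three values).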
Then putting them together, including any missing values at the beginning:
>> gapMap = [vertcat(patches{:}) vertcat(locs{:})-1]; % not including lead
>> gapMap = [repmat((0 : A(1) - 1)',1,2); gapMap] %' including lead
gapMap =
0 0
1 1
2 2
3 3
4 4
11 11
0 12
1 13
2 14
5 29
9 33
10 34
11 35
The first column contains the missing values. The second column is the 0-based location in the hypothetical full sequence.
>> Afull = repmat(0:N,1,n)
>> isequal(gapMap(:,1), Afull(gapMap(:,2)+1)')
ans =
1
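A complementary check, sketched here under the assumption that gapMap holds exactly the values printed above, is to remove the reported positions from the full sequence and confirm that what remains is A. The names present and roundTripOk are hypothetical and only used for this illustration.

```matlab
N = 11;  n = 3;
A = [5 6 7 8 9 10 3 4 5 6 7 8 9 10 11 0 1 2 3 4 6 7 8]';

% gapMap copied from the output above: [missing value, 0-based position]
gapMap = [0 0; 1 1; 2 2; 3 3; 4 4; 11 11; 0 12; 1 13; 2 14; ...
          5 29; 9 33; 10 34; 11 35];

Afull = repmat(0:N, 1, n)';              % hypothetical complete sequence
present = true(size(Afull));
present(gapMap(:,2) + 1) = false;        % mark the reported gaps (convert to 1-based)

roundTripOk = isequal(Afull(present), A) % true if the gaps account for every difference
```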
Upvotes: 1
Reputation: 5664
We know that the last element must be 11, so we can check for this first, which simplifies testing all previous elements. We ensure that A is 11-terminated, so that the element-wise change approach (below) is valid. The same is true for the beginning, but changing A there would shift the indices, so we take care of that later.
missing = [];
if A(end) ~= 11
missing = [missing; 11, length(A) + 1];
A = [A, 11];
end
Then we can calculate the change from one element to the next,
dA = A(2:end) - A(1:end-1);
and identify the gap positions:
idx_gap = find((dA~=1) & (dA~=-11));
Now we need to expand all missing indices and expected values, using ev for the expected value. ev is initialized from the value just before the gap, as in
for k = 1 : length(idx_gap)
ev = A(idx_gap(k));
Now, the number of elements to fill in is the change dA at that position minus one (because a change of one means no gap). Note that this can wrap around if the gap spans the boundary between segments, so we use the modulus.
for n = 1 : mod(dA(idx_gap(k)) - 1, 12)
ev = mod(ev + 1, 12);
missing = [missing; ev, idx_gap(k) + 1];
end
end
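The modulus step can be checked by hand on a few transitions. This is only a worked illustration; the variable names are hypothetical.

```matlab
% Number of missing values between a previous value p and a next value q
% is mod(q - p - 1, 12) for a 0..11 cycle.
gapWrap  = mod(3 - 10 - 1, 12)   % 10 -> 3 : 4 missing (11, 0, 1, 2)
gapPlain = mod(6 - 4 - 1, 12)    % 4 -> 6  : 1 missing (5)
noGap    = mod(0 - 11 - 1, 12)   % 11 -> 0 : 0 missing (regular wrap)
```

MATLAB's mod returns a result with the sign of the divisor, which is why the negative difference in the first case still yields a positive count.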
As a test, consider A = [5 6 7 8 9 10 3 4 5 6 7 8 9 10 11 0 1 2 3 4 6 7 8]. In this case the initial end check fires, recording the missing 11 and changing A to [5 6 ... 7 8 11]. missing will then contain
11 24 % recognizes improper termination of A.
11 7
0 7 % properly handles wrap-over here.
1 7
2 7
5 21 % recognizes single element as missing.
9 24
10 24
which should be what you are expecting. What is still missing is the beginning of A, so we prepend it with missing = [(0 : A(1) - 1)', ones(A(1), 1); missing]; to complete the list.
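For convenience, the pieces of this answer can be assembled into one script. This is a minimal sketch, not the author's exact code: forcing A into a row, initializing missing as a 0-by-2 array, and building the leading rows as a two-column block (the name lead is hypothetical) are assumptions made so that it runs as written.

```matlab
A = [5 6 7 8 9 10 3 4 5 6 7 8 9 10 11 0 1 2 3 4 6 7 8];
A = A(:).';                                % assumption: force a row so [A, 11] works

% Values missing before A(1), all reported at position 1
lead = [(0 : A(1) - 1)', ones(A(1), 1)];

missing = zeros(0, 2);
if A(end) ~= 11                            % make sure A is 11-terminated
    missing = [missing; 11, length(A) + 1];
    A = [A, 11];
end

dA = A(2:end) - A(1:end-1);                % change from one element to the next
idx_gap = find((dA ~= 1) & (dA ~= -11));   % neither a +1 step nor the 11 -> 0 wrap

for k = 1 : length(idx_gap)
    ev = A(idx_gap(k));                    % value just before the gap
    for n = 1 : mod(dA(idx_gap(k)) - 1, 12)
        ev = mod(ev + 1, 12);              % next expected value, wrapping at 12
        missing = [missing; ev, idx_gap(k) + 1];
    end
end

missing = [lead; missing];                 % columns: [value, position in A]
disp(missing)
```

The second column is the position in the (11-terminated) A before which the value would have to be inserted, so several missing values can share one position.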
Upvotes: 2
Reputation: 112749
Although this doesn't solve your problem completely, you can identify the position of missing values, or of groups of contiguous missing values, like this:
ind = 1+find(~ismember(diff(A),[1 -11]));
ind gives the position with respect to the current sequence A, not to the completed sequence.
For example, with
A =[00;01;02; 04;05;06;07;08;09;10;11;00;01;02;03; ;06;07;08;09;10;11];
this gives
>> ind = 1+find(~ismember(diff(A),[1 -11]))
ind =
4
16
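If the size of each gap is also of interest, one possible extension (an assumption on top of this answer, borrowing the modulus idea for a 0..11 cycle) is to take the step at each detected position. The name gapLen is hypothetical.

```matlab
% Same example as above, written without the blank entries
A = [0;1;2;4;5;6;7;8;9;10;11;0;1;2;3;6;7;8;9;10;11];

d = diff(A);
ind = 1 + find(~ismember(d, [1 -11]));   % positions in A that follow a gap: [4; 16]
gapLen = mod(d(ind - 1) - 1, 12);        % values missing before each position: [1; 2]

disp([ind gapLen])
```

Here one value (03) is missing before position 4 and two values (04, 05) before position 16.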
Upvotes: 0