shashashamti2008
shashashamti2008

Reputation: 2337

Finding the intersection of large vectors using MATLAB

How can two signals be compared in MATLAB to find their intersection? My signals are large vectors which can contain duplicate values.

I have been experimenting with the following approach using intersect, which works fine for a randomly generated signals.

% Example case
sig1 = rand(100,1);
sig2 = [sig1(end-10:end); rand(90,1)]; % a signal with imposed intersection.
[c, ia, ib] = intersect(sig1, sig2);
plot(sig2)
hold on
scatter(ib, sig2(ib), 'filled')
hold off

I am using this approach for my real data but it does not produce the correct intersection, which is due to duplicate values in the signals. So, I thought to add a very small random noise to both signals and then apply intersect, however, adding a threshold is not possible for intersect.

Could someone give me some hints on how the intersection of two large signal measurement can be found robustly and efficiently? Are there other methods? Thank you in advance.

Background:

I actually have several large recordings, sig1, sig2, sig3, .... each two consecutive recordings, e.g., sig1 and sig2 may have overlaps, which means the end of the recording for sig1 can be exactly identical to the beginning of sig2. So, my goal is to check if there is any overlap, detect them, and then remove them to be able to concatenate all of measurements: sig1, sig2, sig3, ...

I also know the order of these recordings, so the intersection can be considered as sig1(end-N:end) = sig2(1:N+1).

Upvotes: 2

Views: 165

Answers (2)

Cris Luengo
Cris Luengo

Reputation: 60645

The typical method for this is the cross-correlation (the signal processing toolbox has the function xcorr). The peak of the cross correlation indicates the delay for which the two signals are most similar. This is all you need to find out by how much the two signals overlap.

Because you want to compare the tail of one signal with the head of another, we'll apply the cross-correlation to those portions of the signal only. This does require knowing how large the overlap can be (some upper bound), which is not ideal. If the cropped section we're computing the cross-correlation for is too short (i.e. not containing the full overlap), then the computed shift will not be correct. And if it is much, much too long, the cross-correlation might not be able to identify the right shift (the peak could be hidden in the noise). Maybe someone else can take this idea and build something more robust from it...

(I don't have the signal processing toolbox installed, so I implemented it using fft and ifft instead)

% Two example signals
sig1 = rand(100,1);
sig2 = [sig1(end-10:end); rand(90,1)];

% Take the end of sig1 and the start of sig2
N = 15; % should be larger than the overlap
end1 = sig1(end-N+1:end);
start2 = sig2(1:N);

% Compute cross-correlation
xc = ifft(fft(end1).*conj(fft(start2)));

% Find peak
[~,shift] = max(xc);

% Crop signal #2
Nrep = N-shift+1
sig2_cropped = sig2(Nrep+1:end);

% Plot
clf
subplot(2,1,1)
plot(sig1)
hold on
plot(numel(sig1)-Nrep+1:numel(sig1),sig1(end-Nrep+1:end),'r.')
subplot(2,1,2)
plot(sig2)
hold on
plot(1:Nrep,sig2(1:Nrep),'r.')

A quick-and-dirty alternative that might be more robust, but slower than the above, is implementing the comparison in a loop:

Nrep = 0;
for N = 1:min(numel(sig2),numel(sig1))
   % Take the end of sig1 and the start of sig2
   end1 = sig1(end-N+1:end);
   start2 = sig2(1:N);
   % Compare
   if all(end1==start2) % possibly do this with a tolerance
      Nrep = N;
      break
   end
end

Here we start comparing with an overlap of 1 sample, and increase that one by one until we find a match. If not match is found, Nrep==0, no samples are repeated.

Upvotes: 1

Nicky Mattsson
Nicky Mattsson

Reputation: 3052

Use ismember or possibly ismembertol if you want a tolerance.

sig1 = rand(100,1);
sig2 = [rand(50,1); sig1(end-10:end); rand(50,1)]; % a signal with imposed intersection.

idx = ismember(sig1,sig2); %multiple occurences variant of IA in [C,IA,IB]=intersect(sig1,sig2)
inter = sig1(idx) %The intersection

Upvotes: 1

Related Questions