Reputation: 237

how to find the first element that has been seen twice in a vector

I want to find the first element that has appeared in previous positions in a vector.

For example, if the vector is:

v = [1, 3, 2, 3, 4, 5];

the answer is v(4) = 3, since 3 is the first element that has been seen twice.
Is there a way to vectorize this operation?

Update:
Here's my current solution, do you have better suggestions?

[s o] = sort(v);  % sort the array
d = diff(s);      % the first zero corresponds to the first repetitive element  
d = find(d == 0);

o(d(1) + 1) is the index of the first element that has been seen twice.

New Update:
Following @mwengler's solution, I now come up the solution to find the first repeated element of each row of a MATRIX.

function vdup = firstDup(M)
    [SM Ord] = sort(M, 2);    % sort by row
    [rows cols] = find(~diff(SM, 1, 2));   % diff each row, and find indices of the repeated elements in sorted rows
    Mask = (size(M,2) + 1) * ones(size(M)); % create a Mask matrix with all size(M,2)+1
    ind = sub2ind(size(Ord), rows, cols+1); % add 1 to the column indices
    Mask(ind) = Ord(ind);   % get the original indices of each repeated elements in each row
    vdup = min(Mask, [], 2); % get the minimum indices of each row, which is the indices of first repeated element

Upvotes: 2

Answers (4)

Luis Mendo

Reputation: 112679

idx = find(any(triu(bsxfun(@eq, v, v.'), 1)), 1);
el = v(idx);

How this works: bsxfun(...) compares each entry of v with each other. triu(...,1) keeps only matches with a previous element (keeps only values above the diagonal). any tells which entries have a match with some previous entry. find(...,1) gives the index of the first such entry.

Upvotes: 1

mwengler

Reputation: 2778

The answer @Fash originally proposed ALMOST works. Going further down his path:

sv = sort(v);
repeated = sv(~diff(sv));
ifr = find(ismember(v,repeated),'first');
ir2 = find(v==v(ifr));
index_desired = ir2(2);
value_desired = v(index_desired);

Upvotes: 1

Ansari

Reputation: 8218

This will work. @Steve pointed out an error in your updated solution.

[~, ~, Iv] = unique(v, 'stable');
idx = find(diff(Iv)-1, 1)+1;
el = v(idx);

After this, el will contain the first repeated element in v and idx will be its index in v.

First you use stable unique to find the unique elements. The second output argument contains the original indices of each unique element. You then run diff(Iv) - 1 to find the jumps in the original indices. You use find(, 1) to get the first element and add one to get the index in the original vector. Index into the original vector to get the element you want.

Upvotes: 3

Fallenreaper

Reputation: 10694

store in a hashtable which you can check to already have contents?

something like:

If (hash.hasValue(i))
  return true;
else
  hash.insert(i, 1);
  return false;

where i is the key, the position, and can contain just something simple, a bit for example to allow for a small structure size.

Upvotes: -1

how to find the first element that has been seen twice in a vector

Answers (4)

Related Questions