K M Rakibul Islam
K M Rakibul Islam

Reputation: 34338

what is the time complexity of Array#uniq method in ruby?

Can anyone please tell me which algorithm is internally used by ruby to remove duplicates from an ruby array using Array#uniq method?

Upvotes: 8

Views: 4434

Answers (5)

user4127768
user4127768

Reputation:

The time complexity is linear time i.e. O(n) as it uses Hash for the internal implementation of the algorithm.

Upvotes: 1

Jörg W Mittag
Jörg W Mittag

Reputation: 369468

This depends on which "internals" you are talking about. There are 7 production-ready Ruby implementations in current use, and the Ruby Language Specification does not prescribe any particular algorithm. So, it really depends on the implementation.

E.g., this is the implementation Rubinius uses:

Rubinius.check_frozen

if block_given?
  im = Rubinius::IdentityMap.from(self, &block)
else
  im = Rubinius::IdentityMap.from(self)
end
return if im.size == size

array = im.to_array
@tuple = array.tuple
@start = array.start
@total = array.total

self

And this is the one from JRuby:

RubyHash hash = makeHash();
if (realLength == hash.size()) return makeShared();

RubyArray result = new RubyArray(context.runtime, getMetaClass(), hash.size()); 

int j = 0;
try {
    for (int i = 0; i < realLength; i++) {
        IRubyObject v = elt(i);
        if (hash.fastDelete(v)) result.values[j++] = v;
    }
} catch (ArrayIndexOutOfBoundsException aioob) {
    concurrentModification();
}
result.realLength = j;
return result;

Upvotes: 3

soulcheck
soulcheck

Reputation: 36767

Amortized O(n) as it uses Hash internally.

Upvotes: 3

rohit89
rohit89

Reputation: 5773

It compares elements using their hash (provided by the Object#hash method) then compares hashes with Object#eql?.

Source: https://github.com/ruby/ruby/blob/trunk/array.c#L3976

Upvotes: 1

Woot4Moo
Woot4Moo

Reputation: 24316

From the docs:

static VALUE
rb_ary_uniq(VALUE ary)
{
    VALUE hash, uniq, v;
    long i;

    if (RARRAY_LEN(ary) <= 1)
        return rb_ary_dup(ary);
    if (rb_block_given_p()) {
        hash = ary_make_hash_by(ary);
        uniq = ary_new(rb_obj_class(ary), RHASH_SIZE(hash));
        st_foreach(RHASH_TBL(hash), push_value, uniq);
    }
    else {
        hash = ary_make_hash(ary);
        uniq = ary_new(rb_obj_class(ary), RHASH_SIZE(hash));
        for (i=0; i<RARRAY_LEN(ary); i++) {
            st_data_t vv = (st_data_t)(v = rb_ary_elt(ary, i));
            if (st_delete(RHASH_TBL(hash), &vv, 0)) {
                rb_ary_push(uniq, v);
            }
        }
    }
    ary_recycle_hash(hash);

    return uniq;

It has O(N) complexity

Upvotes: 7

Related Questions