William Ross
William Ross

Reputation: 3910

Elixir delete occurrences that happen more then once in list

I have a list like ["A", "A", "B", "C", "D", "D", "E"] and I want to delete completely elements which occur more then once.

In this case the output would be ["B", "C", "E"]

I've been looking at using Enum.dedup. So far the best I've come up with is N^2 algorithms which find the duplicates and then filter again.

What is the best idiomatic Elixir way to accomplish this?

Upvotes: 0

Views: 139

Answers (3)

Dogbert
Dogbert

Reputation: 222118

O(n log n) solution using a list comprehension with Enum.frequencies/1:

iex(1)> list = ["A", "A", "B", "C", "D", "D", "E"]
["A", "A", "B", "C", "D", "D", "E"]
iex(2)> for {k, v} <- Enum.frequencies(list), v == 1, do: k
["B", "C", "E"]

Enum.frequencies/1 is O(n log n) since it creates a Map of the frequencies of each element and inserting one value into a Map is O(log n). Source

Note that this will not preserve the order of elements in the original list. For that, @sabiwara's answer is better.


Proposed improvement by @sabiwara in comments:

iex(3)> for {k, 1} <- Enum.frequencies(list), do: k
["B", "C", "E"]

In a for comprehension, if a pattern does not match an element, that element is skipped, which makes this a really elegant solution.

Upvotes: 4

sabiwara
sabiwara

Reputation: 3134

You can use Enum.frequencies/1 to build a map (Enum.uniq/1 also does this under the hood, plus a pass to reverse the list) and then you can simply filter:

counts = Enum.frequencies(list)
Enum.reject(list, &Map.fetch!(counts, &1) > 1)

Assuming the list is sorted (or at least that duplicated are always successive), you can do it in one pass and O(N) with recursion for instance:


defmodule Dedup do
  def dedup([]), do: []
  def dedup([h, h | t]) do
    # removing all successive instances of h
    skipped = Enum.drop_while(t, & &1 === h)
    dedup(skipped)
  end
  
  def dedup([h | t]) do
    [h | dedup(t)]
  end
end

Upvotes: 3

Adam Millerchip
Adam Millerchip

Reputation: 23091

find the duplicates and then filter again

That's the answer. You can use --, which is O(n log n).

dupes = list -- Enum.uniq(list)
Enum.reject(list, &(&1 in dupes))

Upvotes: 1

Related Questions