Reputation: 1353
I'm working on a basic problem in Elixir: RNA transcription. However, I'm hitting some unexpected (to me) behavior with my solution:
    defmodule RnaTranscription do
      @doc """
      Transcribes a character list representing DNA nucleotides to RNA

      ## Examples

      iex> RnaTranscription.to_rna('ACTG')
      'UGAC'
      """
      @spec to_rna([char]) :: [char]
      def to_rna(dna) do
        _to_rna(dna)
      end

      def _to_rna([]), do: ''
      def _to_rna([head | tail]), do: [_rna(head) | _to_rna(tail)]

      def _rna(x) when x == 'A', do: 'U'
      def _rna(x) when x == 'C', do: 'G'
      def _rna(x) when x == 'T', do: 'A'
      def _rna(x) when x == 'G', do: 'C'
    end
When I run the solution, I get an error: the _rna function is invoked with an integer rather than the character, so none of the guard clauses match.
    The following arguments were given to RnaTranscription._rna/1:

        # 1
        65

    lib/rna_transcription.ex:18: RnaTranscription._rna/1
    lib/rna_transcription.ex:16: RnaTranscription._to_rna/1
Is there a way to force Elixir to keep the value as a character when it splits the list into head and tail?
Upvotes: 0
Views: 103
Reputation: 121000
Besides the difference between a list 'A' and a character ?A, perfectly answered by Michael, this code has one more hidden but important glitch.
You use recursion that is not tail-call optimized, which is best avoided: every element of the input adds a stack frame until the list is exhausted, so a very long input can use a lot of memory. Below is the tail-recursive (TCO) code.
    defmodule RnaTranscription do
      def to_rna(dna), do: do_to_rna(dna)

      # Function head: declares the default accumulator once for all clauses
      defp do_to_rna(acc \\ [], dna)
      defp do_to_rna(acc, []), do: Enum.reverse(acc)

      defp do_to_rna(acc, [char | tail]),
        do: do_to_rna([do_char(char) | acc], tail)

      defp do_char(?A), do: ?U
      defp do_char(?C), do: ?G
      defp do_char(?T), do: ?A
      defp do_char(?G), do: ?C
    end

    RnaTranscription.to_rna('ACTG')
    #⇒ 'UGAC'
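For comparison, the accumulate-then-reverse pattern above can also be written with Enum.reduce/3 from the standard library. This is only a sketch: the module name RnaReduce and the helper complement/1 are made up for the illustration, and the charlists are written with the ~c sigil, which is equivalent to the single-quoted form.

```elixir
defmodule RnaReduce do
  # Hypothetical module name, used only for this sketch.
  def to_rna(dna) do
    dna
    # Prepend each transcribed code point to the accumulator...
    |> Enum.reduce([], fn char, acc -> [complement(char) | acc] end)
    # ...then restore the original order.
    |> Enum.reverse()
  end

  # Hypothetical helper: maps a DNA code point to its RNA complement.
  defp complement(?A), do: ?U
  defp complement(?C), do: ?G
  defp complement(?T), do: ?A
  defp complement(?G), do: ?C
end

IO.inspect(RnaReduce.to_rna(~c"ACTG") == ~c"UGAC")
# prints true
```

The reduce does the same prepend-and-reverse work as the hand-written accumulator, so which one to use is mostly a matter of taste.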
Or, even better, use a comprehension:

    converter = fn
      ?A -> ?U
      ?C -> ?G
      ?T -> ?A
      ?G -> ?C
    end

    for c <- 'ACTG', do: converter.(c)
    #⇒ 'UGAC'
You can even filter the input in place, skipping any character that is not a nucleotide:

    for c when c in 'ACTG' <- 'ACXXTGXX',
      do: converter.(c)
    #⇒ 'UGAC'
Upvotes: 1
Reputation: 41
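You can use the ? code point syntax, both in the guards and in the return values (returning 'U' would put a one-element charlist inside the result instead of a character):

    def _rna(x) when x == ?A, do: ?U
    def _rna(x) when x == ?C, do: ?G
    def _rna(x) when x == ?T, do: ?A
    def _rna(x) when x == ?G, do: ?C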
Strictly speaking, Elixir is already keeping the value as a character! A character is a code point, which is an integer. When you match on 'A', you are matching on a charlist, which is a list of integers. That is, you are trying to match 65 to [65].
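A quick way to see this for yourself is the small sketch below, which can be pasted into iex or run as a script. The variable names are only for illustration, and the charlists are written with the ~c sigil, which is equivalent to the single-quoted form.

```elixir
# ?A is the integer code point of the character A.
IO.inspect(?A == 65)
# prints true

# A charlist is a list of code points, so ~c"A" is the list [65].
IO.inspect(~c"A" == [65])
# prints true

# Splitting a charlist into head and tail yields an integer head.
[head | tail] = ~c"ACTG"

IO.inspect(is_integer(head))
# prints true
IO.inspect(head == ?A)
# prints true
IO.inspect(head == ~c"A")
# prints false: 65 is not [65]
IO.inspect(tail == ~c"CTG")
# prints true
```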
Upvotes: 1