Reputation: 31898
I am expecting the code below to print chr7
.
import strutils
var splitLine = "chr7 127471196 127472363 Pos1 0 +".split()
var chrom, startPos, endPos = splitLine[0..2]
echo chrom
Instead it prints @[chr7, 127471196, 127472363]
.
Is there a way to unpack multiple values from sequences at the same time?
And what would the tersest way to do the above be if the elements weren't contiguous? For example:
var chrom, startPos, strand = splitLine[0..1, 5]
Gives the error:
read_bed.nim(8, 40) Error: type mismatch: got (seq[string], Slice[system.int], int literal(5))
but expected one of:
system.[](a: array[Idx, T], x: Slice[system.int])
system.[](s: string, x: Slice[system.int])
system.[](a: array[Idx, T], x: Slice[[].Idx])
system.[](s: seq[T], x: Slice[system.int])
var chrom, startPos, strand = splitLine[0..1, 5]
^
Upvotes: 7
Views: 3354
Reputation: 4471
import std/strscans
var line = "chr7 127471196 127472363 Pos1 0 +"
var (_, chrom, startPos, endPos) = line.scanTuple("$w$s$i$s$i")
echo chrom # chr7
The _
is a success:bool which indicates if the scan was successful. It is discarded for this demo.
$w (word), $s (space), $i (integer)
Upvotes: 0
Reputation: 1473
yet another option is package definesugar
:
import strutils, definesugar
# need to use splitWhitespace instead of split to prevent empty string elements in sequence
var splitLine = "chr7 127471196 127472363 Pos1 0 +".splitWhitespace()
echo splitLine
block:
(chrom, startPos, endPos) := splitLine[0..2]
echo chrom # chr7
echo startPos # 127471196
echo endPos # 127472363
block:
(chrom, startPos, strand) := splitLine[0..1] & splitLine[5] # splitLine[0..1, 5] not supported
echo chrom
echo startPos
echo strand # +
# alternative syntax
block:
(chrom, startPos, *_, strand) := splitLine
echo chrom
echo startPos
echo strand
see https://forum.nim-lang.org/t/7072 for recent discussion
Upvotes: 1
Reputation: 31
For anyone that's looking for a quick solution, here's a nimble package I wrote called unpack.
You can do sequence and object destructuring/unpacking with syntax like this:
someSeqOrTupleOrArray.lunpack(a, b, c)
[a2, b2, c2] <- someSeqOrTupleOrArray
{name, job} <- tim
tom.lunpack(job, otherName = name)
{job, name: yetAnotherName} <- john
Upvotes: 3
Reputation: 8720
This can be accomplished using macros.
import macros
macro `..=`*(lhs: untyped, rhs: tuple|seq|array): auto =
# Check that the lhs is a tuple of identifiers.
expectKind(lhs, nnkPar)
for i in 0..len(lhs)-1:
expectKind(lhs[i], nnkIdent)
# Result is a statement list starting with an
# assignment to a tmp variable of rhs.
let t = genSym()
result = newStmtList(quote do:
let `t` = `rhs`)
# assign each component to the corresponding
# variable.
for i in 0..len(lhs)-1:
let v = lhs[i]
# skip assignments to _.
if $v.toStrLit != "_":
result.add(quote do:
`v` = `t`[`i`])
macro headAux(count: int, rhs: seq|array|tuple): auto =
let t = genSym()
result = quote do:
let `t` = `rhs`
()
for i in 0..count.intVal-1:
result[1].add(quote do:
`t`[`i`])
template head*(count: static[int], rhs: untyped): auto =
# We need to redirect this through a template because
# of a bug in the current Nim compiler when using
# static[int] with macros.
headAux(count, rhs)
var x, y: int
(x, y) ..= (1, 2)
echo x, y
(x, _) ..= (3, 4)
echo x, y
(x, y) ..= @[4, 5, 6]
echo x, y
let z = head(2, @[4, 5, 6])
echo z
(x, y) ..= head(2, @[7, 8, 9])
echo x, y
The ..=
macro unpacks tuple or sequence assignments. You can accomplish the same with var (x, y) = (1, 2)
, for example, but ..=
works for seqs and arrays, too, and allows you to reuse variables.
The head
template/macro extracts the first count
elements from a tuple, array, or seqs and returns them as a tuple (which can then be used like any other tuple, e.g. for destructuring with let
or var
).
Upvotes: 6
Reputation: 26530
Currently pattern matching in Nim only works with tuples
. This also makes sense, because pattern matching requires a statically known arity. For instance, what should happen in your example, if the seq
does not have a length of three? Note that in your example the length of the sequence can only be determined at runtime, so the compiler does not know if it is actually possible to extract three variables.
Therefore I think the solution which was linked by @def- was going in the right direction. This example uses arrays, which do have a statically known size. In this case the compiler knows the tuple arity, i.e., the extraction is well defined.
If you want an alternative (maybe convenient but unsafe) approach you could do something like this:
import macros
macro extract(args: varargs[untyped]): typed =
## assumes that the first expression is an expression
## which can take a bracket expression. Let's call it
## `arr`. The generated AST will then correspond to:
##
## let <second_arg> = arr[0]
## let <third_arg> = arr[1]
## ...
result = newStmtList()
# the first vararg is the "array"
let arr = args[0]
var i = 0
# all other varargs are now used as "injected" let bindings
for arg in args.children:
if i > 0:
var rhs = newNimNode(nnkBracketExpr)
rhs.add(arr)
rhs.add(newIntLitNode(i-1))
let assign = newLetStmt(arg, rhs) # could be replaced by newVarStmt
result.add(assign)
i += 1
#echo result.treerepr
let s = @["X", "Y", "Z"]
s.extract(a, b, c)
# this essentially produces:
# let a = s[0]
# let b = s[1]
# let c = s[2]
# check if it works:
echo a, b, c
I do not have included a check for the seq
length yet, so you would simply get out-of-bounds error if the seq does not have the required length. Another warning: If the first expression is not a literal, the expression would be evaluated/calculated several times.
Note that the _
literal is allowed in let bindings as a placeholder, which means that you could do things like this:
s.extract(a, b, _, _, _, x)
This would address your splitLine[0..1, 5]
example, which btw is simply not a valid indexing syntax.
Upvotes: 2