Reputation: 759
I am trying to retrieve the indices of all elements that match a pattern in a String vector that contains missing values.
For example, how can I get a vector with the indices of all elements containing the pattern "a"
from:
x = ["ab", "ca", "bc", missing, "ad"]
The desired outcome would be equal to:
Vector([1, 2, 5])
3-element Vector{Int64}:
1
2
5
These are the indices at which the pattern appears.
Upvotes: 4
Views: 220
Reputation: 14685
The findall version can be even neater:
julia> findall(contains("a"), skipmissing(x))
3-element Vector{Int64}:
1
2
5
The first cool thing about this is that contains returns a curried (partially applied) version of itself when supplied only the pattern to search for. So here, contains("a") returns a function that searches for "a" inside any given String argument, and we pass that function on to findall as the predicate.
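As a small sketch of the currying described above (the names has_a, has_a_explicit and words are illustrative only, and the one-argument form of contains assumes Julia 1.5 or later):

```julia
# contains(needle) with a single argument returns a predicate function.
has_a = contains("a")

has_a("ab")   # true
has_a("bc")   # false

# It behaves like this explicit anonymous function.
has_a_explicit = s -> contains(s, "a")

# Example vector without missing values, so both predicates can be applied directly.
words = ["ab", "ca", "bc", "ad"]
idx_curried = findall(has_a, words)            # [1, 2, 4]
idx_explicit = findall(has_a_explicit, words)  # [1, 2, 4]
```

Note that the predicate itself does not handle missing: it only accepts strings, which is why the missing values have to be dealt with separately.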
Even cooler is the way skipmissing works. As the name indicates, it skips missing values in its argument. It does not do this by filtering them out of x (which would change the indices of all values after the missings), but by providing an iterator that just jumps past the missing values. This means that the indices findall returns by iterating skipmissing(x) will be valid for x too, which is exactly what we want.
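To make the index behaviour concrete, here is a minimal sketch comparing the two approaches (the variable names filtered, shifted, kept and idx are illustrative):

```julia
x = ["ab", "ca", "bc", missing, "ad"]

# Collecting first builds a new, shorter vector, so positions after the missing shift down.
filtered = collect(skipmissing(x))            # ["ab", "ca", "bc", "ad"]
shifted = findall(contains("a"), filtered)    # [1, 2, 4]; 4 is not the position of "ad" in x

# skipmissing keeps the indices of x: keys lists only the non-missing positions.
kept = collect(keys(skipmissing(x)))          # [1, 2, 3, 5]
idx = findall(contains("a"), skipmissing(x))  # [1, 2, 5]

# The indices can be used directly on the original vector.
x[idx]                                        # "ab", "ca", "ad"
```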
Upvotes: 1
Reputation: 69819
A natural way to write it is:
julia> findall(v -> ismissing(v) ? false : contains(v, "a"), x)
3-element Vector{Int64}:
1
2
5
Alternatively you could write:
julia> using Missings
julia> findall(coalesce.(passmissing(contains).(x, "a"), false))
3-element Vector{Int64}:
1
2
5
which in this case is less readable, but in other contexts you might find passmissing and coalesce useful, so I mention them.
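The one-liner is easier to follow when split into steps. The sketch below uses only Base: the comprehension in step 1 is a stand-in that is assumed to mimic what passmissing(contains).(x, "a") produces, so it runs without the Missings package. The names raw, mask and idx are illustrative.

```julia
x = ["ab", "ca", "bc", missing, "ad"]

# Step 1: propagate missing instead of throwing an error on it.
# Stand-in for passmissing(contains).(x, "a") from Missings.
raw = [ismissing(v) ? missing : contains(v, "a") for v in x]
# raw holds true, true, false, missing, true

# Step 2: replace every missing with false so that only Bools remain.
mask = coalesce.(raw, false)

# Step 3: findall on a Bool vector returns the indices of the true entries.
idx = findall(mask)   # [1, 2, 5]
```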
Upvotes: 4