Reputation: 4827
I've just started exploring Julia and am struggling with subsetting dataframes. I would like to select rows where LABEL has the value "B" and VALUE is missing. Selecting rows with "B" works fine, but adding a filter for missing fails. Any suggestions on how to solve this? Tips for good documentation on subsetting/filtering dataframes in Julia are also welcome; I haven't found a solution in the Julia documentation.
using DataFrames
df = DataFrame(ID = 1:5, LABEL = ["A", "A", "B", "B", "B"], VALUE = ["A1", "A2", "B1", "B2", missing])
df[df[:LABEL] .== "B", :] # works fine
df[df[:LABEL] .== "B" && df[:VALUE] .== missing, :] # fails
Upvotes: 4
Views: 1429
Reputation: 69949
Use:
filter([:LABEL, :VALUE] => (l, v) -> l == "B" && ismissing(v), df)
(a very similar example is given in the documentation of the filter function).
If you want to use getindex then write:
df[(df.LABEL .== "B") .& ismissing.(df.VALUE), :]
The fact that you need to use .& instead of && when working with arrays is not specific to DataFrames.jl; it is a common pattern in Julia in general when indexing arrays with Booleans.
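To illustrate why the original attempt fails, here is a minimal sketch using plain vectors in place of the data frame columns. The names label, value, is_b, is_mis and mask are made up for this illustration; the point is only that comparing with missing yields missing rather than a Bool, and that the two masks have to be combined element-wise.

```julia
# Plain vectors standing in for the LABEL and VALUE columns of the question.
label = ["A", "A", "B", "B", "B"]
value = ["A1", "A2", "B1", "B2", missing]

# Comparing with `missing` propagates `missing` instead of returning a Bool,
# so `value .== missing` cannot be used as a row mask.
ismissing("B1" == missing)   # true: the comparison result is itself missing

# `ismissing` always returns a Bool, so broadcasting it gives a valid mask.
is_b   = label .== "B"       # element-wise equality
is_mis = ismissing.(value)   # element-wise missing check

# `&&` expects a single Bool on each side; `.&` combines the masks element-wise.
mask = is_b .& is_mis

findall(mask)                # [5]: only the fifth row satisfies both conditions
```

The same mask can be passed as the row index of the data frame, as in the getindex example above.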
Upvotes: 5