Reputation: 5459
In Julia 1.3, how can I find the rows of a DataFrame that satisfy several conditions at once? Here is an example with the iris.csv dataset (downloadable here):
df = CSV.read(".../iris.csv");
df[1:6,:]
6 rows × 5 columns
sepal_length sepal_width petal_length petal_width species
Float64 Float64 Float64 Float64 String
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
Let's say I want to select the indices of rows with the sepal_length equal to 5.1:
findall(df[:,1] .== 5.1)
9-element Array{Int64,1}:
1
18
20
22
24
40
45
47
99
Likewise, to select the indices of rows with species "setosa":
findall(df[:,5] .== "setosa")[1:10]
10-element Array{Int64,1}:
1
2
3
4
5
6
7
8
9
10
Now suppose I want the indices of rows where sepal_length equals 5.1 AND species is "setosa" (I tried a syntax similar to the which() function in R):
findall(df[:,1] .== 5.1 & df[:,5] .== "setosa")
MethodError: no method matching &(::Float64, ::PooledArrays.PooledArray{String,UInt32,1,Array{UInt32,1}})
Closest candidates are:
&(::Any, ::Any, !Matched::Any, !Matched::Any...) at operators.jl:529
Stacktrace:
[1] top-level scope at In[149]:1
Which command should I use instead?
Upvotes: 2
Views: 56
Reputation: 10137
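You need to broadcast the & operator (note the parentheses and the dot before the &). The parentheses matter because & binds more tightly than .==, so without them Julia first tries to evaluate 5.1 & df[:,5], which is the MethodError shown in the question: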
findall((df[:,1] .== 5.1) .& (df[:,5] .== "setosa"))
Note, however, that df[:,1] .== 5.1 and df[:,5] .== "setosa" both allocate temporary arrays. Consider using the version of findall that takes a function as its first argument, like so:
findall(x -> x.sepal_length == 5.1 && x.species == "setosa", eachrow(df))
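To try both approaches without loading the CSV, here is a minimal sketch on plain vectors. The vectors sepal_length and species and their four values are made up for illustration and merely stand in for the two DataFrame columns; the predicate form here walks the indices instead of using eachrow(df), since no DataFrame is involved.

```julia
# Hypothetical stand-ins for df.sepal_length and df.species (not the real iris data)
sepal_length = [5.1, 4.9, 5.1, 5.1]
species = ["setosa", "setosa", "versicolor", "setosa"]

# Broadcast version: each comparison builds a temporary Boolean mask,
# and .& combines the two masks element by element.
idx_mask = findall((sepal_length .== 5.1) .& (species .== "setosa"))

# Predicate version: findall calls the function once per index,
# so no intermediate masks are allocated.
idx_pred = findall(i -> sepal_length[i] == 5.1 && species[i] == "setosa",
                   eachindex(sepal_length))

println(idx_mask)  # [1, 4]
println(idx_pred)  # [1, 4]
```

Inside the predicate the short-circuiting && is fine because each comparison yields a single Bool; in the broadcast version you need the element-wise .& instead.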
Upvotes: 4