ecjb

Reputation: 5459

Search indices in a dataframe with several conditions with findall() function

in Julia 1.3

How can I find the rows of a DataFrame that satisfy several conditions at once? Here is an example with the iris.csv dataset (downloadable here):

using CSV, DataFrames
df = CSV.read(".../iris.csv");
df[1:6,:]

6 rows × 5 columns
     sepal_length  sepal_width  petal_length  petal_width  species
     Float64       Float64      Float64       Float64      String
1    5.1           3.5          1.4           0.2          setosa
2    4.9           3.0          1.4           0.2          setosa
3    4.7           3.2          1.3           0.2          setosa
4    4.6           3.1          1.5           0.2          setosa
5    5.0           3.6          1.4           0.2          setosa
6    5.4           3.9          1.7           0.4          setosa

Let's say I want the indices of the rows where sepal_length equals 5.1:

findall(df[:,1] .== 5.1)

9-element Array{Int64,1}:
  1
 18
 20
 22
 24
 40
 45
 47
 99

The same works for the indices of the rows where species is "setosa":

findall(df[:,5] .== "setosa")[1:10]

10-element Array{Int64,1}:
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10

Now suppose I want the indices of the rows where sepal_length equals 5.1 AND species is "setosa". I tried a syntax similar to R's which() function:

findall(df[:,1] .== 5.1 & df[:,5] .== "setosa")

MethodError: no method matching &(::Float64, ::PooledArrays.PooledArray{String,UInt32,1,Array{UInt32,1}})
Closest candidates are:
  &(::Any, ::Any, !Matched::Any, !Matched::Any...) at operators.jl:529

Stacktrace:
 [1] top-level scope at In[149]:1

Which command should I use instead?

Upvotes: 2

Views: 56

Answers (1)

carstenbauer

Reputation: 10137

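The error occurs because & binds more tightly than .==, so Julia tries to evaluate 5.1 & df[:,5] first. Wrap each comparison in parentheses and broadcast the & operator (note the dot before the &):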

findall((df[:,1] .== 5.1) .& (df[:,5] .== "setosa"))
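As a minimal sketch of the same pattern without any packages, the snippet below uses two plain vectors as stand-ins for the DataFrame columns. The variable names and the four data values are made up for illustration and are not taken from iris.csv.

```julia
# Hypothetical stand-ins for df[:,1] and df[:,5]; the values are invented.
sepal_length = [5.1, 4.9, 5.1, 5.1]
species = ["setosa", "setosa", "versicolor", "setosa"]

# Each parenthesised comparison yields a Boolean array;
# .& combines the two arrays element by element.
mask = (sepal_length .== 5.1) .& (species .== "setosa")

# findall returns the positions where the mask is true.
idx = findall(mask)
println(idx)
```

Without the parentheses, the expression would again be parsed as sepal_length .== (5.1 & species) .== "setosa" and fail in the same way as in the question.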

Note, however, that df[:,1] .== 5.1 and df[:,5] .== "setosa" each allocate a temporary array. Consider using the method of findall that takes a predicate function as its first argument:

findall(x -> x.sepal_length == 5.1 && x.species == "setosa", eachrow(df))
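The predicate form can also be tried on plain vectors. The sketch below is an assumption-laden illustration rather than the DataFrame code itself: the vectors and their values are invented, and the predicate iterates over eachindex instead of eachrow(df).

```julia
# Hypothetical stand-ins for the two DataFrame columns; the values are invented.
sepal_length = [5.1, 4.9, 5.1, 5.1]
species = ["setosa", "setosa", "versicolor", "setosa"]

# findall(f, A) returns the indices of A for which f returns true.
# The scalar && short-circuits, so no intermediate Boolean arrays are built.
idx = findall(i -> sepal_length[i] == 5.1 && species[i] == "setosa",
              eachindex(sepal_length))
println(idx)
```

Because the predicate operates on one element at a time, the ordinary && operator is appropriate here and no broadcasting dot is needed.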

Upvotes: 4
