Reputation: 2542
I want to create a indexed subset of a DataFrame and use a variable inside it. In this case i want to change all -9999 values of the first column to NA's. If I do: df[df[:1] .== -9999, :1] = NA
it works like it should.. But if i use a variable as the indexer it througs an error (LoadError: KeyError: key :i not found):
i = 1
df[df[:i] .== -9999, :i] = NA
Upvotes: 2
Views: 1869
Reputation: 8566
:i
is actually a symbol in julia:
julia> typeof(:i)
Symbol
you can define a variable binding to a symbol like this:
julia> i = Symbol(2)
Symbol("2")
then you can simply use df[df[i] .== 1, i] = 123
:
julia> df
10×1 DataFrames.DataFrame
│ Row │ 2 │
├─────┼─────┤
│ 1 │ 123 │
│ 2 │ 2 │
│ 3 │ 3 │
│ 4 │ 4 │
│ 5 │ 5 │
│ 6 │ 6 │
│ 7 │ 7 │
│ 8 │ 8 │
│ 9 │ 9 │
│ 10 │ 10 │
It's worth noting that in your example df[df[:1] .== -9999, :1]
, :1
is NOT a symbol:
julia> :1
1
In fact, the expression is equal to df[df[1] .== -9999, 1]
which works in that there is a corresponding getindex
method whose argument (col_ind
) can accept a common index:
julia> @which df[df[1].==1, 1]
getindex{T<:Real}(df::DataFrames.DataFrame, row_inds::AbstractArray{T,1}, col_ind::Union{Real,Symbol})
Since you just want to change the first (n) column, there is no difference between Symbol("1")
and 1
as long as your column names are regularly arranged as:
│ Row │ 1 │ 2 │ 3 │...
├─────┼─────┤─────┼─────┤
│ 1 │ │ │ │...
Upvotes: 5