Zeeshan

Reputation: 1238

Replacing missing values in Julia

I have a data frame that contains some missing values. I want to replace all the missing values in the LoanAmount column with the mean of that column:

df[ismissing.(df.LoanAmount),:LoanAmount]= floor(mean(skipmissing(df.LoanAmount))) 

but when I run the code above I get:

MethodError: no method matching setindex!(::DataFrame, ::Float64, ::BitArray{1}, ::Symbol)

Upvotes: 2

Views: 180

Answers (2)

Zeeshan

Reputation: 1238

I also found this approach. To replace the missing values with the mean:

replace!(df.col,missing => floor(mean(skipmissing(df[!,:col]))))

To replace them with the mode (mode comes from StatsBase.jl, not from Statistics):

replace!(df.col,missing => mode(skipmissing(df[!,:col]))) 
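
For anyone who wants to try the replace! approach without a data frame, here is a minimal sketch on a plain vector, since a DataFrame column is just a vector. The sample values and the names col, fill_value and filled_copy are made up for illustration. It also shows coalesce, a Base function, as a non-mutating alternative.

```julia
using Statistics

# Hypothetical sample data standing in for df.col
col = Union{Missing, Int}[4, missing, 7, missing, 10]

# Mean of the non-missing values, floored to an Int so it fits the column type
fill_value = floor(Int, mean(skipmissing(col)))

# Non-mutating: returns a new vector and leaves col untouched
filled_copy = coalesce.(col, fill_value)

# Mutating: overwrites the missing entries of col in place
replace!(col, missing => fill_value)
```

Note that replace! keeps the element type Union{Missing, Int}, so the column still allows missing values afterwards even though none remain.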

Upvotes: 1

Przemyslaw Szufel

Reputation: 42194

Use skipmissing to compute the mean over the non-missing values, e.g.:

mean(skipmissing(df.LoanAmount))

Answer to the second, edited question: you should broadcast the assignment using the dot operator (.=) as in the example below:

julia> df = DataFrame(col=rand([missing;1:3],10))                                                      
10×1 DataFrame                                                                                         
│ Row │ col     │                                                                                      
│     │ Int64?  │                                                                                      
├─────┼─────────┤                                                                                      
│ 1   │ missing │                                                                                      
│ 2   │ 3       │                                                                                      
│ 3   │ 2       │                                                                                      
│ 4   │ 2       │                                                                                      
│ 5   │ missing │                                                                                      
│ 6   │ missing │                                                                                      
│ 7   │ missing │                                                                                      
│ 8   │ 3       │                                                                                      
│ 9   │ 1       │                                                                                      
│ 10  │ 3       │                                                                                      
                                                                                                       
julia> df[ismissing.(df.col),:col] .= floor(mean(skipmissing(df.col)));                                 
                                                                                                       
julia> df                                                                                              
10×1 DataFrame                                                                                         
│ Row │ col    │                                                                                       
│     │ Int64? │                                                                                       
├─────┼────────┤                                                                                       
│ 1   │ 2      │                                                                                       
│ 2   │ 3      │                                                                                       
│ 3   │ 2      │                                                                                       
│ 4   │ 2      │                                                                                       
│ 5   │ 2      │                                                                                       
│ 6   │ 2      │                                                                                       
│ 7   │ 2      │                                                                                       
│ 8   │ 3      │                                                                                       
│ 9   │ 1      │                                                                                       
│ 10  │ 3      │                                                                                       
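
The same idea applied to the column from the question, as a small script rather than a REPL session. This is only a sketch, assuming DataFrames.jl 1.x; the LoanAmount values are made up. The plain = fails because there is no setindex! method that assigns one scalar to a set of selected rows, while .= broadcasts the scalar over them.

```julia
using DataFrames, Statistics

# Hypothetical data; only the column name comes from the question
df = DataFrame(LoanAmount = [100, missing, 250, missing, 130])

# Mean of the non-missing values, rounded down
fill_value = floor(mean(skipmissing(df.LoanAmount)))

# Boolean mask of the rows to fill
mask = ismissing.(df.LoanAmount)

# Broadcast the scalar into the selected rows of the column
df[mask, :LoanAmount] .= fill_value
```

The floor matters here beyond rounding: the column holds Int64 values, so a fill value with a fractional part should fail with an InexactError when it is converted on assignment.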

Impute.jl

Yet another option is to use Impute.jl, as suggested by Bogumil:

Impute.fill(df;value=(x)->floor(mean(x)))

Upvotes: 3
