Reputation: 2693
I'm using the Weka application with a CSV file, and I need to remove the instances that have missing values. I tried using the MultiFilter together with the RemoveWithValues filter, but I think I am doing it wrong, since it filters out ALL my instances. How do I do this correctly?
Upvotes: 0
Views: 3184
Reputation: 6284
To remove instances with missing values in a few attributes, you can use weka.filters.unsupervised.instance.SubsetByExpression with an expression such as
not ismissing(ATT5)
to remove instances with missing values in the attribute with index 5, or
not (ismissing(ATT5) or ismissing(ATT8))
to remove instances with missing values in attributes 5 or 8, and so on.
If you were trying to use the RemoveWithValues filter, it can be done that way, but you need to clear the nominalIndices field (removing the -L argument from the filter command) and set a splitPoint value lower than the minimum value of the attribute being filtered. Otherwise the filter will match any instance whose value satisfies any of these conditions.
I can't see any obvious way of removing instances that have missing values in any attribute, other than building an expression for SubsetByExpression that checks all of them one by one.
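If the dataset has many attributes, that one-by-one expression can be generated rather than typed by hand. Here is a minimal sketch in plain Java. It assumes attributes are referenced as ATT1 to ATTn, following the ATT5/ATT8 pattern above, and that you know the attribute count. The class and method names are hypothetical, and nothing here calls Weka itself; it only builds the expression string.

```java
import java.util.StringJoiner;

// Hypothetical helper: builds a SubsetByExpression-style expression string
// that keeps only instances with no missing value in attributes 1..n.
class MissingExpressionBuilder {

    // Returns e.g. "not (ismissing(ATT1) or ismissing(ATT2))" for numAttributes = 2.
    // Assumption: attributes are numbered ATT1..ATTn, as in the examples above.
    static String buildExpression(int numAttributes) {
        if (numAttributes < 1) {
            throw new IllegalArgumentException("numAttributes must be at least 1");
        }
        // Join the individual checks with " or " and wrap them in "not ( ... )".
        StringJoiner checks = new StringJoiner(" or ", "not (", ")");
        for (int i = 1; i <= numAttributes; i++) {
            checks.add("ismissing(ATT" + i + ")");
        }
        return checks.toString();
    }

    public static void main(String[] args) {
        // Print the expression for a dataset with 3 attributes.
        System.out.println(buildExpression(3));
    }
}
```

The printed string can then be pasted into the expression field of SubsetByExpression. If your class attribute should be allowed to have missing values, you would need to adjust the loop to skip its index.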
Upvotes: 3