Reputation: 45
Instead of trying to remove outliers from a data set, I am trying to create a new data frame consisting only of the rows that have outliers in them.
I was able to column-bind the averages and standard deviations of the different groups onto the end of the data set. Now, I have tried this code to produce a table of outlier data:
Outliers <- Sample[((Sample$x - Sample$Averages)/Sample$StDevs) > 2.00,]
This runs, but produces an empty table for Outliers. I tested some individual values from the data to make sure outliers existed, and they do. If I specify a single row, the calculation above does return a logical value. It is only when I try to collect these outliers in a table that I have problems. I also tried initializing Outliers as a data.frame or data.table first, but that was unsuccessful as well (probably just because I am new to R).
ex: When I run
((Sample$x[3] - Sample$Averages[3])/Sample$StDevs[3]) > 2
it returns TRUE. This is good. Why, then, do I get an empty table of outliers when I simply want to KEEP everything in Sample where this condition is true? I do not feel that this should be a difficult problem, but I cannot for the life of me get it to work.
Any suggestions? Thanks in advance!
Upvotes: 0
Views: 157
Reputation: 263451
Sample[ 0, ]
should get you an empty data frame with no rows and the same column names.
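To make that concrete, here is a minimal sketch on made-up data. The column names x, Averages and StDevs follow the question; the group column, the values, and the names Empty and z are assumptions for illustration only. It shows the empty-frame idiom next to a logical row subset, and it is not a diagnosis of why the original call came back empty.

```r
# Toy data: column names follow the question, values are invented.
Sample <- data.frame(
  group    = c("a", "a", "b", "b"),
  x        = c(10, 30, 5, 6),
  Averages = c(12, 12, 5.5, 5.5),
  StDevs   = c(4, 4, 1, 1)
)

# Zero rows, same columns: an empty data frame to start from.
Empty <- Sample[0, ]

# Standardized distance of each row from its group average.
z <- (Sample$x - Sample$Averages) / Sample$StDevs

# Keep only rows where z exceeds 2; which() drops any NA comparisons
# instead of returning all-NA rows.
Outliers <- Sample[which(z > 2), ]
```

Note that `z > 2` only catches values above the average; if low outliers matter too, `abs(z) > 2` may be closer to what is wanted. Pre-initializing Outliers is not needed for the subset itself, since the indexing already returns a data frame.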
Upvotes: 0