Reputation: 105
I am writing a function that finds outliers and prints them to the user alongside a special symbol that indicates the outlier type. The outliers can be calculated in two ways: the Engineer's method or Tukey's method. The function takes two parameters: a data frame with one column of random numbers, and an option value that determines which method is used to calculate the outliers. The function should return a data frame with two columns: the value of the outlier and its type as a symbol (o or *).
When I call the function and ask it to calculate the outliers using the Engineer's method, it works perfectly. However, with Tukey's method it returns the following instead of a result: data frame with 0 columns and 0 rows
The following is my code:
findOutliers<- function(numbers,option){
outlierM=c()
outlierE=c()
outlier=c()
typeM=c()
typeE=c()
type=c()
length=nrow(numbers)
print(numbers)
if(option=="eng"){
print("Engineer Methods")
numbersmean= as.numeric(sapply(numbers,mean))
numbersd= as.numeric(sapply(numbers,sd))
for(i in 1:length){
zscore= as.numeric((i-numbersmean)/numbersd)
if(zscore>2 & zscore<3){
#cat(zscore," ", "O","\n")
outlierM =c(outlierM,zscore)
typeM=c(typeM, "O")
}#end of if statment
else if(zscore>3){
#cat(zscore," ", "*", "\n")
outlierE =c(outlierE,zscore)
typeE=c(typeE, "*")
}#end of if statment
}#end of for loop
}#end of if statment
else if(option=="tukey"){
print("Tuckey's Methods")
sortedNumbers=numbers[order(numbers$Numbers), ]
IQR=IQR(sortedNumbers)
Q1=as.numeric(quantile(sortedNumbers,0.25))
Q3=as.numeric(quantile(sortedNumbers,0.75))
rangeM1=Q1 - (1.5 * IQR)
rangeM2=Q3 + (1.5 * IQR)
rangeE1=Q1 - (3 * IQR)
rangeE2=Q3 + (3 * IQR)
for(i in 1:length){
if(numbers[i,]<rangeM1|numbers[i,]>rangeM2){
outlierM=c(outlierM,numbers[i])
typeM=c(typeM, "O")
}#end of if statment
else if(numbers[i,]<rangeE1|numbers[i,]>rangeE2){
outlierE=c(outlierE, numbers[i])
typeE=c(typeE, "*")}
}# end of for loop
}#end of if statment
outlier= c(outlierM,outlierE)
type=c(typeM,typeE)
founOtliers<- data.frame(Outliers=outlier,Type=type)
return(founOtliers)
}#end of function
normalnumbers=rnorm(10)
randomNumbers<- data.frame(Numbers=normalnumbers)
findOutliers(randomNumbers,"eng")
findOutliers(randomNumbers,"tukey")
Upvotes: 0
Views: 6638
Reputation: 43255
Two things. First, I suggest indenting your code and using spaces where possible for clarity:
if (x == 1) {
  print('foo')
} else {
  print('bar')
}
Second, and more importantly, you are using the if/else syntax incorrectly (see my example above). From ?"if":
In particular, you should not have a newline between ‘}’ and
‘else’ to avoid a syntax error in entering a ‘if ... else’
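The rule quoted above can be checked with parse(). This is a minimal sketch; the helper name parses and the two source strings are made up for illustration. It suggests that the newline before else is rejected at top level, while the same layout is accepted when the whole statement sits inside braces, such as a function body.

```r
# Hypothetical helper: TRUE if the source text parses, FALSE on a syntax error
parses <- function(src) {
  tryCatch({
    parse(text = src)
    TRUE
  }, error = function(e) FALSE)
}

# Top level: a newline between '}' and 'else' ends the if statement early
top_level <- "if (TRUE) {\n  1\n}\nelse {\n  2\n}"

# Same layout wrapped in braces: the parser keeps reading and finds 'else'
in_braces <- paste0("{\n", top_level, "\n}")

print(parses(top_level))
print(parses(in_braces))
```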
However, that is not the problem, per se, in your code. If you add the lines
print (paste('first check',
numbers[i, ] < rangeM1 | numbers[i, ] > rangeM2))
print (paste('second check',
numbers[i, ] < rangeE1 | numbers[i, ] > rangeE2))
at the top of your second for loop, you'll see that you never satisfy either if condition, so you return your empty data.frame...
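The same check can also be run outside the loop. The sketch below uses a fixed, made-up sample (not the random data from the question) with the same 1.5 * IQR fences. It illustrates that when no value falls outside the fences, nothing is appended to the accumulators, and data.frame() called with empty vectors gives an empty result.

```r
# Hypothetical fixed sample chosen to contain no Tukey outliers
numbers <- data.frame(Numbers = c(-1.2, -0.5, -0.1, 0.0, 0.3, 0.4, 0.8, 1.1))

x     <- numbers$Numbers
iqr   <- IQR(x)
lower <- as.numeric(quantile(x, 0.25)) - 1.5 * iqr
upper <- as.numeric(quantile(x, 0.75)) + 1.5 * iqr

# Vectorised form of the first check inside the loop
outside <- x < lower | x > upper
print(any(outside))

# c() is NULL, and data.frame() drops NULL arguments entirely
empty <- data.frame(Outliers = c(), Type = c())
print(dim(empty))
```

Checking any(outside) before building the result is one way to tell "no outliers found" apart from a real failure.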
In general, if you're using the if / else if syntax, I think it is wise to always include a final else catchall that can provide some helpful advice or a default output.
Upvotes: 4