Saket_MS

Reputation: 11

How to do outlier treatment in SAS

I have household IDs and their respective sales. As it turns out, a few of these household IDs have extremely high total sales. Can you please suggest a good method for outlier treatment? It would be great if you could suggest one in SAS.

Regards, Saket

Upvotes: 1

Views: 4579

Answers (1)

rambles

Reputation: 706

The following is a basic, rather crude method. It removes values that lie more than 3 standard deviations from the mean:

**  Standardise data;
proc standard data=sales_data mean=0 std=1 out=sales_data_std;
  var sales;
run;

**  Remove values more than 3 std devs from mean;
data sales_data_no_outliers;
  set sales_data_std;
  ** After PROC STANDARD, sales holds z-scores, so keep only values within 3 std devs;
  where -3 <= sales <= 3;
run;
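
Note that PROC STANDARD overwrites sales with its standardised value, so the original amounts are lost in the output. If you want to keep them, a minimal sketch along the following lines might help. It applies the same 3 standard deviation rule but stores the z-score in a separate variable. The names sales_stats, sales_mean, sales_std and sales_z are only placeholders I've made up for illustration, and I'm assuming one row per household in sales_data.

```sas
**  Compute the mean and std dev of sales once;
proc means data=sales_data noprint;
  var sales;
  output out=sales_stats mean=sales_mean std=sales_std;
run;

**  Attach the statistics to every row and keep rows within 3 std devs;
data sales_data_no_outliers;
  if _n_ = 1 then set sales_stats(keep=sales_mean sales_std);
  set sales_data;
  sales_z = (sales - sales_mean) / sales_std;
  **  Rows with missing sales also pass, as SAS sorts missing below any number;
  if abs(sales_z) <= 3;
run;
```

The output keeps sales in its original units, with sales_z alongside it, so you can inspect the dropped households before committing to removing them.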

There's a reference to this approach in Wikipedia.

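Still, it's crude. It relies on your variable being roughly normally distributed, and even then about 0.3% of values are expected to lie more than 3 standard deviations from the mean, so in a large sample (a thousand observations or more) it will almost always flag some values as outliers even if, in all reasonableness, they are not really outlying.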
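
If normality is doubtful (sales figures are often skewed), one commonly used alternative is the interquartile range rule, which is a different technique from the one above: flag values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR. A rough sketch, assuming the same sales_data dataset; the names sales_quartiles, sales_q1, sales_q3, sales_iqr and sales_outlier are placeholders of mine, and the 1.5 multiplier is a convention rather than anything specific to your data:

```sas
**  Compute the quartiles of sales;
proc means data=sales_data noprint;
  var sales;
  output out=sales_quartiles q1=sales_q1 q3=sales_q3;
run;

**  Flag, rather than delete, values outside the IQR fences;
data sales_data_flagged;
  if _n_ = 1 then set sales_quartiles(keep=sales_q1 sales_q3);
  set sales_data;
  sales_iqr = sales_q3 - sales_q1;
  **  Flag is 1 when sales lies outside the fences and 0 otherwise;
  sales_outlier = (not missing(sales)) and
                  (sales < sales_q1 - 1.5 * sales_iqr or
                   sales > sales_q3 + 1.5 * sales_iqr);
run;
```

Flagging instead of deleting leaves the decision about what to do with those households (drop, cap, or investigate) to a later step.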

The subject of outliers is long and detailed but a cursory overview of the topic might be useful. Unfortunately, I can't really think of any introductory sources off-hand.

Upvotes: 2
