Reputation: 133
I am wondering whether there is a way to account for an outlier in a histogram plot. I want to plot the frequencies of a random variable whose values are very small and distributed around zero. However, in most of the cases I am considering there is also an outlier that complicates things. Is there a way to adjust the scale of the x-axis in R/Matlab so that I can capture the distribution of the random variable and also show the outlier? The usual ways of producing the plot result in a scale on which all values appear to be zero, and I want to show how they are distributed around zero. Ideally, I would like the scale around zero to resolve the very small numbers and then, after a gap (which does not necessarily have to be proportional to the actual distance from zero), a bin to indicate the value of the outlier. I do not want to remove the outlier from the sample.
Is such a thing possible in R/Matlab? Any other suggestions would be welcome.
Edit: The problem is not in identifying the outliers and using a different color for them. The problem is in adjusting the scales on the x-axis so I can observe the distribution of the variable as well as have the outlier included in the plot.
Upvotes: 0
Views: 3826
Reputation: 35525
The following code will do the job. Note that it has to overwrite the XTickLabel entries of the axes so that they show the real value of the outliers.
A=rand(1000,1)*0.1;
A(1:10)=10;
% Modify the data for plotting purposes: move the outliers closer to the bulk
expected_maximum_value=1; % You could compute this using 3*sigma, maybe?
distance_to_outliers=0.5;
outlier_mean=mean(A(A>expected_maximum_value));
A(A>expected_maximum_value)=A(A>expected_maximum_value)-outlier_mean+distance_to_outliers;
% plot
h=histogram(A,'BinWidth',0.01);
%% trick the X axis
ax=gca;
ax.XTickLabel{end-1}=[ax.XTickLabel{end-1} '//'];
ax.XTickLabel{end}=['//' num2str(outlier_mean)];
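The label trick above relies on the automatically chosen ticks, so which ticks end up carrying the '//' marker depends on the axis limits MATLAB picks. A possible variation, sketched here under assumptions, sets the tick positions explicitly and plots a copy of the data so the original sample stays untouched. The names threshold, outlierPos, bulkTicks and B are made up for this illustration, and the values 1, 0.155 and 0:0.02:0.1 are assumptions that fit this synthetic data only. All outliers are collapsed into one bin labelled with their mean, which is only reasonable when the outliers are close to each other.

```matlab
% Synthetic data as in the answer: bulk in [0, 0.1] plus ten outliers at 10
rng(0);                          % fixed seed so the figure is reproducible
A = rand(1000,1)*0.1;
A(1:10) = 10;

threshold  = 1;                  % assumed cutoff between bulk and outliers
outlierPos = 0.155;              % assumed display position, mid-bin past the bulk

isOutlier   = A > threshold;     % logical mask of the outliers
outlierMean = mean(A(isOutlier));

% Work on a copy so that A itself keeps the real outlier values
B = A;
B(isOutlier) = outlierPos;       % collapse all outliers into one display bin

histogram(B, 'BinWidth', 0.01);

% Place ticks explicitly: regular ticks over the bulk, one tick for the outlier bin
ax = gca;
bulkTicks = 0:0.02:0.1;
ax.XTick = [bulkTicks outlierPos];
bulkLabels = arrayfun(@num2str, bulkTicks, 'UniformOutput', false);
ax.XTickLabel = [bulkLabels, {['// ' num2str(outlierMean)]}];
```

Because the ticks are fixed, the outlier label always sits under the outlier bar, whatever limits the histogram chooses. For real data you would have to pick threshold, outlierPos and bulkTicks from the range of your own sample.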
Upvotes: 2