Potential Scientist

Reputation: 207

Drawing a histogram and boxplot in SAS

I wrote the following code in SAS, but I did not get the result I expected.

The resulting histogram is grey, and the range of the data is not what I specified. What is the problem?

I also got the following warning: WARNING: The MIDPOINTS= list was extended to accommodate the data

How can I set the color?

axis1 order=(0 to 100000 by 50000);
axis2 order=(0 to 100 by 5);
run;
proc capability data=HW2 noprint;
histogram Mvisits/midpoints=0 to 98000 by 10000
haxis=axis1
cfill=blue;
run;

[Image: resulting histogram]

I have the same problem with the boxplot. For example, I got the following plot and wanted to change the axis spacing so that I could see the plot better, but I could not.

[Image: resulting boxplot]

Upvotes: 2

Views: 2101

Answers (1)

SRSwift

Reputation: 1710

The code below is for proc univariate rather than proc capability. I do not have access to SAS/QC to test it, but the user guide shows very similar syntax for the histogram statements, so hopefully you will be able to translate it back.

It looks like you are having problems with the colour because of your output system. Your graphs are probably delivered via ODS, in which case the cfill option does not apply (see here and note the Traditional Graphics tag).
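If traditional graphics are acceptable, one option is to switch ODS Graphics off for the step so that cfill is honoured. This is only a minimal sketch: it assumes SAS/GRAPH is licensed, and it uses sashelp.cars rather than your data.

```sas
/* Assumption: SAS/GRAPH is available, so the histogram can be drawn
   with traditional graphics instead of ODS Graphics. */
ods graphics off;

proc univariate data = sashelp.cars noprint;
    /* cfill is a traditional-graphics option */
    histogram mpg_city / cfill = blue;
run;

/* Switch ODS Graphics back on for later steps */
ods graphics on;
```

The same switch should work in front of proc capability, but I have not been able to test that.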

To change the colour of the histogram bars in ODS output you can use proc template:

proc template;
    define style styles.testStyle;
        parent = styles.htmlblue;
        style GraphDataDefault /
            color = green;
    end;
run;

ods listing style = styles.testStyle;

proc univariate data = sashelp.cars;
    histogram mpg_city;
run;

An example explaining this can be found here.

Alternatively you can use proc sgplot to create a histogram with more control of the colour as follows:

proc sgplot data = sashelp.cars;
    histogram mpg_city / fillattrs = (color = red);
run;
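proc sgplot also lets you control the bins, which is roughly what midpoints does in your code. The sketch below uses the binstart and binwidth options of the histogram statement. The values 10 and 5 are only illustrative choices for sashelp.cars, and SAS may adjust them if they do not cover the data.

```sas
proc sgplot data = sashelp.cars;
    /* binstart = midpoint of the first bin, binwidth = width of each bin;
       both values are illustrative assumptions for this data set */
    histogram mpg_city /
        binstart = 10
        binwidth = 5
        fillattrs = (color = red);
run;
```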

As to your question about truncating the histogram: it doesn't really make a great deal of sense to ignore the extreme values, as that will give you an erroneous image of the distribution, which somewhat defeats the purpose of the histogram. That said, you can achieve what you are asking for with a bit of a hack:

data tempData;
    set sashelp.cars;
    tempClass = 1;
run;

proc univariate data = tempData noprint;
    class tempClass;
    histogram mpg_city / maxnbin = 5 endpoints = 0 to 25 by 5;
run;

In the above, a dummy class variable tempClass is created, and comparative histograms are then requested using the class statement. The maxnbin option limits the number of bins displayed, but only in a comparative histogram.

Your other option is to exclude (or cap) your extreme points before creating the histogram, but this will lead to slightly erroneous frequency counts/percentages/bar heights.

data tempData;
    set sashelp.cars;
    mpg_city = min(mpg_city, 20);
run;

proc univariate data = tempData noprint;
    histogram mpg_city / endpoints = 0 to 25 by 5;
run;
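The code above caps the extreme values. If you would rather exclude them, a where statement in the procedure is one way to do it. Again, this is only a sketch on sashelp.cars, and the cut-off of 25 is an arbitrary choice.

```sas
proc univariate data = sashelp.cars noprint;
    /* Drop observations above the cut-off before the histogram is built;
       25 is an illustrative threshold, not a recommendation */
    where mpg_city <= 25;
    histogram mpg_city / endpoints = 0 to 25 by 5;
run;
```

Note that the percentages are then computed from the retained observations only, so the bar heights will differ from those of the full data.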

This is a possible approach to the original question (untested, as I have neither SAS/QC nor the data):

proc capability data = HW2 noprint;
    histogram Mvisits / 
        midpoints = 0 to 300000 by 10000
        noplot 
        outhistogram = histData;
run;
proc sgplot data = histData;
    vbar _MIDPT_ / 
        response = _OBSPCT_ 
        fillattrs = (color = blue);
    where _MIDPT_ <= 100000;
run;

Upvotes: 3
