invoketheshell

Reputation: 3897

How to create overlaid histograms in SAS with short-format data?

I have data like this:

var1  target
1.2   X
2     Y
2.3   Z

I want to overlay the histograms, scaled as percents, so that they look something like this:

Histogram image

The graphs could also be stacked, as long as they are comparable. I have tried this, but it doesn't work:

proc univariate data=mydata;
  var var1;
  by target;
  histogram;
run;

Upvotes: 0

Views: 5784

Answers (2)

Rick

Reputation: 1210

In SAS 9.4m3, the OVERLAY option was added to the HISTOGRAM statement in PROC UNIVARIATE. That means you can now get the graph you want directly from PROC UNIVARIATE:

proc univariate data=sashelp.iris;
  class Species;
  var SepalLength;
  histogram SepalLength / kernel overlay;
run;

In PROC SGPLOT, SAS 9.4m2 introduced support for the GROUP= option on the HISTOGRAM statement. So if you prefer PROC SGPLOT, you can convert the data to long form and use the GROUP= option:

proc sgplot data=sashelp.iris;
  histogram SepalLength / group=Species transparency=0.5;
  density SepalLength / group=Species type=kernel;
run;

For more on overlaying and paneling histograms, see the article "Comparative histograms: Panel and overlay histograms in SAS".
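
Applied to the data in the question, which already has one value column and one grouping column, the GROUP= approach might look like the sketch below. It assumes the data set is named mydata with variables var1 and target, as in the question, and that SAS 9.4m2 or later is available. SCALE=PERCENT is written out only to make the percent axis explicit; it is not something the original answer specified.

proc sgplot data=mydata;
  /* one semi-transparent histogram per target value, on a percent scale */
  histogram var1 / group=target transparency=0.5 scale=percent;
  /* matching kernel density curve for each target value */
  density var1 / group=target type=kernel;
run;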

Upvotes: 1

Joe

Reputation: 63424

It's very easy to get them together in a panel:

data have_data;
  call streaminit(7);
  do _j = 1 to 1e3;
    do _i = 1 to 3;
      target = byte(_i + 119);          /* 'x', 'y', 'z' on ASCII systems */
      var1 = rand('Normal', _i, 0.5);
      output;
    end;
  end;
run;

proc sgpanel data=have_data;
  panelby target / columns=1;
  histogram var1;
  density var1;
run;

That's not overlaid, of course. Overlaying is more challenging and, I think, requires some additional steps.

To overlay them, the simplest option is probably to split var1 into three variables, one per target value. (For the other target values, each variable is left missing.) Then you create three histograms and density plots all in one SGPLOT call.

data want;
  set have_data;
  array vars[3];
  vars[rank(target) - 119] = var1;    /* 'x' -> vars1, 'y' -> vars2, 'z' -> vars3 */
run;
title;
proc sgplot data=want noautolegend;
  histogram vars1/name='x' legendlabel='x';
  histogram vars2/name='y' legendlabel='y';
  histogram vars3/name='z' legendlabel='z';
  density vars1;
  density vars2;
  density vars3;
  keylegend 'x' 'y' 'z'/position=top;
run;

I think you can also do this using GTL (the Graph Template Language), if you know it or are comfortable learning it, since it allows you to overlay the histograms. If that is desired, I can probably mock something up.
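
For reference, a GTL version could look roughly like the sketch below. It is an illustration rather than a tested solution: it assumes the want data set built above (variables vars1 to vars3), and the template name overlay_hist, the bin width of 0.25, and the transparency value are all arbitrary choices made here.

proc template;
  define statgraph overlay_hist;          /* hypothetical template name */
    begingraph;
      layout overlay / xaxisopts=(label='var1');
        /* one histogram per split variable; same bin width for all three */
        histogram vars1 / name='h1' legendlabel='vars1'
                          binwidth=0.25 datatransparency=0.6 fillattrs=graphdata1;
        histogram vars2 / name='h2' legendlabel='vars2'
                          binwidth=0.25 datatransparency=0.6 fillattrs=graphdata2;
        histogram vars3 / name='h3' legendlabel='vars3'
                          binwidth=0.25 datatransparency=0.6 fillattrs=graphdata3;
        discretelegend 'h1' 'h2' 'h3' / location=inside halign=right valign=top;
      endlayout;
    endgraph;
  end;
run;

/* render the template against the split data */
proc sgrender data=want template=overlay_hist;
run;

Giving every histogram the same BINWIDTH= keeps the bars comparable across groups; the transparency lets overlapping regions stay visible.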

Upvotes: 1
