Scott F

Reputation: 37

SAS problem with curvelabelpos and xaxis in PROC SGPLOT

I am trying to use PROC SGPLOT in SAS to create a series plot with five lines (8th Grade, 10th Grade, 12th Grade, College Students, and Young Adults). The y-axis is the prevalence of drug use as a percentage, from 0 to 100. The x-axis is the year, 1975-2019, formatted (using PROC FORMAT) so that the years display as '75-'19. I would like to label each line with its group name (8th Grade through Young Adults), but when I use:

proc sgplot data=save.fig2_1data noautolegend;
  series x=year y=eighth     / lineattrs=(color=orange) curvelabel='8th Grade'        curvelabelpos=start;
  series x=year y=tenth      / lineattrs=(color=green)  curvelabel='10th Grade'       curvelabelpos=start;
  series x=year y=twelfth    / lineattrs=(color=blue)   curvelabel='12th Grade'       curvelabelpos=start;
  series x=year y=college    / lineattrs=(color=red)    curvelabel='College Students' curvelabelpos=start;
  series x=year y=youngadult / lineattrs=(color=purple) curvelabel='Young Adults'     curvelabelpos=start;
  xaxis label="YEAR" values=(1975 to 2019 by 2) minor;
  yaxis label="PERCENT" max=100 min=0;
  format year yr.;
run;

[Image: Series Plot]

The CURVELABELPOS= option does not offer a way to place the label above the first data point of "12th Grade" and "College Students", which would keep the x-axis from reserving all of that space on the left side of the plot. How do I move these two labels above the first data point of each line so that the x-axis has no empty space?

Upvotes: 2

Views: 1650

Answers (2)

Joe

Reputation: 63424

Richard has answered what you explicitly asked for, but I don't think what you want is ideal from a graphical standpoint, and that's why SAS won't do it for you.

Labelling over a line is hard to read, especially when you use the same color as the line. Labelling outside the chart is much cleaner, as is placing the labels in a keylegend.
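
For reference, here is a minimal sketch of the keylegend alternative. The data set `demo` and its values are made up purely for illustration; the relevant pieces are LEGENDLABEL= and NAME= on each SERIES statement and the KEYLEGEND statement that refers to those names.

```sas
/* Made-up data, only so the example runs on its own */
data demo;
  call streaminit(7);
  do year = 1991 to 2019;
    eighth  = 20 + 10*rand('uniform');
    tenth   = 30 + 10*rand('uniform');
    twelfth = 40 + 10*rand('uniform');
    output;
  end;
run;

proc sgplot data=demo;
  /* NAME= gives each plot an identifier the legend can reference */
  series x=year y=eighth  / lineattrs=(color=orange) legendlabel='8th Grade'  name='g8';
  series x=year y=tenth   / lineattrs=(color=green)  legendlabel='10th Grade' name='g10';
  series x=year y=twelfth / lineattrs=(color=blue)   legendlabel='12th Grade' name='g12';
  /* Legend drawn inside the plot area, one entry per row */
  keylegend 'g8' 'g10' 'g12' / location=inside position=topright across=1;
  xaxis label="YEAR";
  yaxis label="PERCENT" min=0 max=100;
run;
```

Placing the legend inside the plot area keeps the axis lengths unchanged, at the cost of possibly covering data in that corner.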

In this case, I would use CURVELABELLOC=OUTSIDE, and either use CURVELABELPOS=MAX (default, which places them to the right of the chart), or CURVELABELPOS=MIN, which places them nearer the start as you prefer but also overlays the axis (which is not as clean-looking).

See the example below. It is highly legible: the curve labels sit where the eye naturally travels, and they do not alter the size of the axis. Putting them at the right also places them in the same spot for every line, which is cleaner than putting them at the starts of the lines, which are staggered.

data fig2_1data;
  call streaminit(7);
  tenth  = 0.5;
  twelfth= 0.6;
  do year=1975 to 2019;
    if year eq 1987 then eighth=0.4;
    eighth = rand('Uniform',0.2)-0.1 + eighth;
    tenth = rand('Uniform',0.2)-0.1 + tenth;
    twelfth = rand('Uniform',0.2)-0.1 + twelfth;
    output;
  end;
run;

proc sgplot data=fig2_1data noautolegend;
  series x=year y=eighth  / lineattrs=(color=orange)
                            curvelabel='8th Grade'  curvelabelpos=max curvelabelloc=outside;
  series x=year y=tenth   / lineattrs=(color=green)
                            curvelabel='10th Grade' curvelabelpos=max curvelabelloc=outside;
  series x=year y=twelfth / lineattrs=(color=blue)
                            curvelabel='12th Grade' curvelabelpos=max curvelabelloc=outside;
  xaxis label="YEAR" values=(1975 to 2019 by 2) minor;
  yaxis label="PERCENT" max=1 min=0;
  /* yr. is the user-defined format from the question */
  format year yr.;
run;
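
The yr. format is described in the question but never shown. To run the example standalone, something like the following could stand in for it. This is only a guess at what the original format looks like: the control data set name `yr_fmt` is hypothetical, and the labels are built as an apostrophe plus the two-digit year.

```sas
/* Hypothetical stand-in for the question's yr. format:
   maps 1975 -> '75, ..., 2019 -> '19 */
data yr_fmt;
  retain fmtname 'yr' type 'N';   /* numeric format named YR */
  length label $3;
  do start = 1975 to 2019;
    label = cats("'", put(mod(start, 100), z2.));
    output;
  end;
run;

/* Build the format from the control data set */
proc format cntlin=yr_fmt;
run;
```

Run this once before the PROC SGPLOT step so that `format year yr.;` resolves.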

[Image: Chart example]

Upvotes: 0

Richard

Reputation: 27508

There are no series statement options that will produce the labeling you want.

You will have to create an annotation data set for the sgplot.

In this sample code the curvelabel= option was set to '' so the procedure generates a series line that uses the widest amount of horizontal drawing space. The sganno data set contains the annotation functions that will draw your own curvelabel text near the first data point of the series with the blank curvelabel. Adjust the %sgtext anchor= value as needed. Be sure to read the SG Annotation Macro Dictionary documentation to understand all the text annotation capabilities.

If you also want an artificial split in the series lines, there are two things to try:

  • Introduce a fake year, 2012.5, for which none of the series variables has a value. I tried this, but only 1 of the 5 series drew with a 'fake' split.
  • Introduce N new variables for the N lines needing a split. For the post-split time frame, copy the data into the new variables and set the originals to missing.
    • Add SERIES statements for the new variables.

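
For the fake-year approach in the first bullet, one thing that may be worth trying is the BREAK option on the SERIES statement, which breaks the line wherever the Y variable is missing. The following is a minimal sketch with made-up data (`demo` and its values are assumptions). It has not been checked against the five-series case described above.

```sas
/* Made-up data with one extra row at year 2012.5 where Y is missing */
data demo;
  do year = 2005 to 2019;
    twelfth = 40 + mod(year, 5);
    college = 50 + mod(year, 3);
    output;
  end;
  /* fake year between 2012 and 2013 with no values */
  year = 2012.5;
  call missing(twelfth, college);
  output;
run;

/* SERIES connects points in data order, so put the fake row in place */
proc sort data=demo;
  by year;
run;

proc sgplot data=demo noautolegend;
  /* BREAK interrupts each line at the missing value */
  series x=year y=twelfth / break lineattrs=(color=blue);
  series x=year y=college / break lineattrs=(color=red);
  xaxis label="YEAR";
  yaxis label="PERCENT" min=0 max=100;
run;
```
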
data have;
  call streaminit(1234);

  do year = 1975 to 2019;
    array response eighth tenth twelfth college youngadult;

    if year >= 1991 then do;
      eighth = round (10 + rand('uniform',10), .1);
      tenth = eighth + round (5 + rand('uniform',5), .1);
      twelfth = tenth + round (5 + rand('uniform',5), .1);

      if year in (1998:2001) then tenth = .;
    end;
    else do;
      twelfth = 20 + round (10 + rand('uniform',25), .1);
    end;

    if year >= 1985 then do;
      youngadult = 25 + round (5 + rand('uniform',20), .1);
    end;

    if year >= 1980 then do;
      college = 35 + round (7 + rand('uniform',25), .1);
    end;

    if year >= 2013 then do _n_ = 1 to dim(response);
      %* simulate inflated response level;
      if response[_n_] then response[_n_] = 1.35 * response[_n_];
    end;

    output;
  end;
run;

data have_split;
  set have;
  array response  eighth  tenth  twelfth  college  youngadult;
  array response2 eighth2 tenth2 twelfth2 college2 youngadult2;

  if year >= 2013 then do _n_ = 1 to dim(response);
    response2[_n_] = response[_n_];
    response [_n_] = .;
  end;
run;

ods graphics on;
ods html;

%sganno;

data sganno;
  %* these variables are used to track '1st' or 'start' point 
  %* of series being annotated
  ;
  retain y12 ycl;

  set have;
  if missing(y12) and not missing(twelfth)  then do; 
    y12=twelfth;
    %sgtext(label="12th Grade", textcolor="blue", drawspace="datavalue", anchor="top", x1=year, y1=y12, width=100, widthunit='pixel')
  end;     

  if missing(ycl) and not missing(college) then do; 
    ycl=college; 
    %sgtext(label="College Students", textcolor="red", drawspace="datavalue", anchor="bottom", x1=year, y1=ycl, width=100, widthunit='pixel')
  end;
run;


proc sgplot data=have_split noautolegend sganno=sganno;
series x=year y=eighth     / lineattrs=(color=orange) curvelabel='8th Grade'        curvelabelpos=start;*auto curvelabelloc=outside ;
series x=year y=tenth      / lineattrs=(color=green)  curvelabel='10th Grade'       curvelabelpos=start;*auto curvelabelloc=outside ;
series x=year y=twelfth    / lineattrs=(color=blue)   curvelabel='' curvelabelpos=start;*auto curvelabelloc=outside ;
series x=year y=college    / lineattrs=(color=red)    curvelabel='' curvelabelpos=start;*auto curvelabelloc=outside ;
series x=year y=youngadult / lineattrs=(color=purple) curvelabel='Young Adults'     curvelabelpos=start;*auto curvelabelloc=outside ;

* series for the 'shifted' time period use the new variables;
series x=year y=eighth2     / lineattrs=(color=orange) ;
series x=year y=tenth2      / lineattrs=(color=green)  ;
series x=year y=twelfth2    / lineattrs=(color=blue)   ;
series x=year y=college2    / lineattrs=(color=red)    ;
series x=year y=youngadult2 / lineattrs=(color=purple) ;

xaxis label="YEAR" values=(1975 to 2019 by 2) minor;
yaxis label="PERCENT" max=100 min=0 ;
run ;

ods html close;
ods html;
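
The %sgtext calls write rows to an annotation data set, so the same kind of label can also be built without the macros. This is a minimal sketch assuming made-up data in `demo`. The data set name `anno_manual` is hypothetical, while FUNCTION, LABEL, X1, Y1, DRAWSPACE, ANCHOR, TEXTCOLOR, WIDTH, and WIDTHUNIT are the usual SG annotation variables.

```sas
/* Made-up data: the college series starts in 1980 */
data demo;
  do year = 1975 to 2019;
    if year >= 1980 then college = 40 + mod(year, 7);
    else college = .;
    output;
  end;
run;

/* One annotation row: text anchored at the first non-missing point */
data anno_manual;
  length function $8 label $40 drawspace $12 anchor $12
         textcolor $12 widthunit $8;
  set demo;
  where not missing(college);
  function  = 'text';
  label     = 'College Students';
  drawspace = 'datavalue';   /* x1/y1 are in data coordinates */
  anchor    = 'bottom';      /* bottom of the text sits on the point */
  textcolor = 'red';
  x1        = year;
  y1        = college;
  width     = 100;
  widthunit = 'pixel';
  keep function label drawspace anchor textcolor x1 y1 width widthunit;
  output;
  stop;                      /* only the first qualifying row is needed */
run;

proc sgplot data=demo sganno=anno_manual noautolegend;
  series x=year y=college / lineattrs=(color=red);
  xaxis label="YEAR";
  yaxis label="PERCENT" min=0 max=100;
run;
```

As with %sgtext, change the ANCHOR value if the text overlaps the line.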

Upvotes: 2
