Starbucks

Reputation: 1568

SAS Regression with Specific Parameters in the Variables

Good afternoon.

Is it possible to run a regression on only part of each variable's range, such as Y = X1 (Between Q1 and Q3) X2 (X2 > 100) X3? I do not wish to regress all of the data in X1 or X2, just the data within the limits that I specify.

What about regressing all the variables between quantiles Q3 and Q4?

Am I approaching this the correct way?

Thank you for your support.

 *Regression output;
 ods graphics on;
 proc reg data=mydata PLOTS(ONLY)=(DIAGNOSTICS FITPLOT RESIDUALS);
 model Y = X1 X2 X3; 
 title 'Working Regression Model';
 run;
 ods graphics off;

Upvotes: 0

Views: 77

Answers (1)

Dirk Horsten

Reputation: 3845

If you want to do a regression using only part of your data, you can filter it in the proc reg statement itself with a where= data set option:

proc reg data=mydata (where=(X1 Between Q1 and Q3 and X2 > 100))
    PLOTS(ONLY)=(DIAGNOSTICS FITPLOT RESIDUALS);
    model Y = X1 X2 X3;
run;

That works if Q1 and Q3 are fields in myData. If not, you can create macro variables for them. For instance:

%let Q1 = 50;
%let Q3 = 250;
proc reg data=mydata (where=(X1 Between &Q1. and &Q3. and X2 > 100))
    PLOTS(ONLY)=(DIAGNOSTICS FITPLOT RESIDUALS);
    model Y = X1 X2 X3;
run;

This works if you know Q1 and Q3 up front.

If Q1 and Q3 are fields in another dataset, proc sql with select Q1, Q3 into :Q1, :Q3 or a data step with call symput('Q1', Q1); call symput('Q3', Q3); can do the job.
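As a rough sketch of the proc sql route: the data set work.quartiles and the values in it are made up here purely for illustration, so substitute whatever data set actually holds your limits.

```sas
/* Hypothetical one-row data set holding the limits (illustration only) */
data work.quartiles;
    Q1 = 50;
    Q3 = 250;
run;

/* Copy the two columns into macro variables Q1 and Q3.
   TRIMMED strips leading and trailing blanks from the stored text. */
proc sql noprint;
    select Q1, Q3
        into :Q1 trimmed, :Q3 trimmed
        from work.quartiles;
quit;

/* Write the values to the log to check them */
%put NOTE: Q1=&Q1. Q3=&Q3.;
```

After this runs, &Q1. and &Q3. can be used in the where= option exactly as in the %let example. If your limits have many significant digits, consider adding an explicit format= to the selected columns, because proc sql converts numbers to text when it fills macro variables.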

Edit after reading your comment explaining Q1 etc are quantiles:

The following example might contain all the building blocks you need:

ods graphics on;
proc means data=SASHELP.CLASS noprint;
    var Height Weight;
    output out=CLASS_Q
        P25(Height Weight)=Height_Q1 Weight_Q1
        P75(Height Weight)=Height_Q3 Weight_Q3;
run;
data _NULL_;
    set CLASS_Q;
    /* symputx converts the numeric values without a log note and trims blanks */
    call symputx('Height_Q1', Height_Q1);
    call symputx('Height_Q3', Height_Q3);
    call symputx('Weight_Q1', Weight_Q1);
    call symputx('Weight_Q3', Weight_Q3);
run;
%put _user_; *just to debug;
proc reg data=SASHELP.CLASS 
    (where=(Height Between &Height_Q1. and &Height_Q3. and Weight > &Weight_Q1.))
    PLOTS(ONLY)=(DIAGNOSTICS FITPLOT RESIDUALS);
    model Age = Height Weight;
run;
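
For the other case in the question (only the observations above Q3), one alternative that avoids macro variables is to let proc rank assign quartile groups and filter on the group number. This is only a sketch under my own assumptions: the names CLASS_RANKED, Height_grp and Weight_grp are invented for the example, and SASHELP.CLASS is so small that the top quartile leaves very few rows.

```sas
/* groups=4 assigns each value a quartile group from 0 (lowest) to 3 (highest) */
proc rank data=SASHELP.CLASS groups=4 out=CLASS_RANKED;
    var Height Weight;
    ranks Height_grp Weight_grp;   /* hypothetical names for the group columns */
run;

/* Keep only the rows whose Height falls in the top quartile group */
proc reg data=CLASS_RANKED (where=(Height_grp = 3));
    model Age = Height Weight;
run;
quit;
```

To restrict several variables at once, combine the conditions in the where= option, for example Height_grp = 3 and Weight_grp = 3, keeping in mind that each extra condition shrinks the sample further.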

Upvotes: 1
