Reputation: 1262

Regression with both robust (White) standard errors and CLASS variable for fixed effects

proc glm makes it easy to add fixed effects without creating dummy variables for every possible value of the class variable.

proc reg is able to calculate robust (White) standard errors, but it requires you to create individual dummy variables.

Is there any way to combine these functionalities? I'd like to be able to add a number of class variables and receive White standard errors in my output. For example:

With proc glm, I can do this regression. This will give correct results no matter how many levels are contained in the class variables, but it won't calculate robust standard errors.

proc glm data=ds1;
  class class1 class2 class3;
  weight n;
  model y = c class1 class2 class3 / solution;
run;

With proc reg, I can do:

proc reg data=ds2;
  weight n;
  model y = x / white;
run;

This gives White standard errors, but it doesn't incorporate the fixed effects. To do that, I might need 50 or more dummy variables and a model statement like model y = x class1_d1 class1_d2 ... class3_dn / white;. That would turn into a crazy number of dummy variables if I started adding interaction terms.

Obviously I could write a macro to create the dummy variables, but this seems like such a basic function that I can't help but think I am missing something obvious (Stata and R both have ways to do this easily). Why can't I either use the class statement in proc reg or get robust standard errors out of proc glm?

Upvotes: 1

Answers (4)

oceanswanderer

Reputation: 1

Try proc robustreg with method=m(wf=huber). That will get you robust standard errors, although not, strictly speaking, White's. I find it to be a resource hog, so it is not feasible beyond roughly 100k observations at 64 observations per individual.
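A minimal sketch of what that call might look like, reusing the data set and variable names from the question (ds1, y, c, n, class1-class3). Note that M-estimation with Huber weights down-weights outlying observations, so the coefficients themselves will generally differ from the proc glm / proc reg estimates, not only the standard errors.

```sas
/* Sketch only: names are taken from the question's proc glm example. */
proc robustreg data=ds1 method=m(wf=huber);
  class class1 class2 class3;        /* fixed effects without manual dummies */
  weight n;
  model y = c class1 class2 class3;
run;
```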

Upvotes: 0

Jon

Reputation: 1

I think I have an answer for this (or at least, if I don't, I might find out by posting my solution here).

According to this page one can compute robust standard errors with proc surveyreg by clustering the data so that each observation is its own cluster. Like this:

data mydata;
set mydata;
counter=_n_;
run;

proc surveyreg data=mydata;
cluster counter;
model y=x;
run;

proc surveyreg also accepts a class statement, so one can run, e.g.:

proc surveyreg data=mydata;
class t;
cluster counter;
model y= t x*t / solution;
run;
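Applied to the question's setup, this might look like the sketch below. The names ds1, y, c, n, and class1-class3 come from the question; ds1_id and counter are hypothetical names introduced here. proc surveyreg treats n as a sampling weight, and as far as I know its default variance estimator includes a degrees-of-freedom adjustment (see the vadjust= option on the model statement), so the standard errors may differ from proc reg's white option by a finite-sample factor.

```sas
/* Give every observation its own cluster id (counter is a hypothetical name). */
data ds1_id;
  set ds1;
  counter = _n_;
run;

/* Sketch only: class variables enter through the class statement. */
proc surveyreg data=ds1_id;
  class class1 class2 class3;
  cluster counter;
  weight n;
  model y = c class1 class2 class3 / solution;
run;
```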

Upvotes: 0

user3690331

Reputation: 1

I think you can: (1) remove observations with missing values, (2) demean the independent variables within each level of the class variable using proc standard, and (3) regress the dependent variable on the demeaned independent variables.

http://pages.stern.nyu.edu/~adesouza/sasfinphd/index/node60.html
http://pages.stern.nyu.edu/~adesouza/sasfinphd/index/node61.html

The coefficients from the above procedure are exactly the same as those from proc glm (Frisch-Waugh theorem), but you do not have to create dummies, which is your main problem. To get robust standard errors, you can simply run proc reg with the white option in step (3).
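The three steps could be sketched as follows for a single class variable, using the names from the question (ds1, y, c, n, class1). The data set names sorted and demeaned are hypothetical. The sketch also demeans y, so that the residuals, and therefore the White standard errors, match those of the dummy-variable regression. With several class variables, demeaning by one variable at a time is not generally equivalent to including all the dummies.

```sas
/* Step 1: drop rows with missing values and sort for BY-group processing. */
proc sort data=ds1(where=(nmiss(y, c, n) = 0 and not missing(class1)))
          out=sorted;
  by class1;
run;

/* Step 2: subtract the weighted mean within each level of class1.        */
/* With only MEAN= specified, the values are shifted but not rescaled.    */
proc standard data=sorted mean=0 out=demeaned;
  by class1;
  var y c;
  weight n;
run;

/* Step 3: regress demeaned y on demeaned c with White standard errors. */
proc reg data=demeaned;
  weight n;
  model y = c / white;
run;
```

The degrees of freedom proc reg reports here do not account for the absorbed group means, so any test that relies on the error degrees of freedom should be read with that in mind.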

Hope that helps.

Upvotes: 0

o.h

Reputation: 1262

I think I found part of the answer, although I would be interested in other solutions or tweaks to this one.

proc glmmod can be used to create the dataset for proc reg:

proc glmmod noprint outdesign=ds2 data=ds1;
  class class1 class2 class3;
  weight n;
  model y = c class1 class2 class3;
run;

proc reg data=ds2;
  weight n;
  model y = col2-col50 / white;
run;

proc glmmod uses the GLM syntax and outputs a regression dataset with all of the dummy variables that proc reg needs.

Not as clean as a single-PROC solution (and you have to keep track of the labels to see what ColXX refers to), but it seems to work perfectly.
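One way to keep track of what each ColXX refers to might be the outparm= option of proc glmmod, which writes a data set describing each design-matrix column. A sketch, where parm is a hypothetical data set name:

```sas
/* Same call as above, plus OUTPARM= to save the column descriptions. */
proc glmmod noprint outdesign=ds2 outparm=parm data=ds1;
  class class1 class2 class3;
  weight n;
  model y = c class1 class2 class3;
run;

/* Each row maps a column number to its effect name and class levels. */
proc print data=parm noobs;
run;
```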

Upvotes: 2
