Amin Karimi

Reputation: 407

Testing for heteroskedasticity and autocorrelation in large unbalanced panel data

I want to test for heteroskedasticity and autocorrelation in a large unbalanced panel dataset.

I do so using the following code:

* Heteroskedasticity test

// iterated GLS with only heteroskedasticity produces 
// maximum-likelihood parameter estimates

xtgls adjusted_volume ibn.rounded_time i.id i.TRD_EVENT_DT, igls panels(heteroskedastic)
estimates store hetero 

* Autocorrelation

findit xtserial
net sj 3-2 st0039
net install st0039

xtserial adjusted_volume ibn.rounded_time i.id i.TRD_EVENT_DT

Even though I run this on a high-performance computing centre, the procedure takes more than 15 hours because of the iterative estimation method.

What is the most efficient program to perform these tests using Stata?

Upvotes: 1

Views: 1225

Answers (1)

user8682794

This question is borderline off-topic and quite broad, but I suspect it is still of considerable interest to new users. I will therefore try to consolidate our conversation in the comments into an answer.

In the future, I strongly advise you to refrain from using highly subjective words such as 'best', which can mean different things to different people, or terms like 'efficient', whose meaning depends on the context. It is also difficult to provide specific advice regarding the use of commands when we know nothing about what you are trying to do.

In my view, the 'best' choice is the one that gets the job done as accurately as possible given the available data. Speed is an important consideration nowadays, but accuracy is still the most fundamental one. As you continue to use Stata, you will see that it has a considerable number of commands, often with overlapping functionality. Depending on the use case, opting for one implementation over another can sometimes be 'better', in the sense that it may be more practical or faster in achieving the desired end result.

A case in point is your comment on your previous post, where you noted that the noconstant option is unavailable in rreg. In that particular context, regress with the vce(robust) option is a reasonably good alternative. In fact, this alternative may often be adequate for several use cases.
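As a minimal sketch of that substitution, the lines below fit a regression without a constant and with robust standard errors. The auto dataset and the variables price, mpg and weight are placeholders chosen only for illustration; they are not from the original question.

```stata
* Illustrative data shipped with Stata (an assumption, not the asker's data)
sysuse auto, clear

* rreg does not accept noconstant; regress does, and vce(robust)
* requests heteroskedasticity-robust standard errors
regress price mpg weight, noconstant vce(robust)
```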

In this particular example, xtgls will be considerably faster if the igls option is not used, especially with larger and more 'difficult' datasets. In cases where MLE is necessary, the iterate() option lets you cap the number of iterations. This could speed things up, but it can be a recipe for disaster if you don't know what you are doing and is thus not recommended; the option is usually used for other purposes. However, is xtgls the only command you could use? Read here why this may in fact not necessarily be the case.
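The two variants mentioned above might look as follows. This is only a sketch: it reuses the variable names from the question, assumes the data have already been declared as a panel with xtset, and the cap of 20 iterations is an arbitrary value picked for illustration. Note that a likelihood-ratio test on stored estimates relies on ML estimates, which is what igls provides, so the faster two-step fit is not a drop-in replacement for that purpose.

```stata
* Two-step feasible GLS: same model as in the question, without igls
xtgls adjusted_volume ibn.rounded_time i.id i.TRD_EVENT_DT, panels(heteroskedastic)

* Iterated GLS with a capped number of iterations
* (20 is a hypothetical value; check convergence before trusting the results)
xtgls adjusted_volume ibn.rounded_time i.id i.TRD_EVENT_DT, igls panels(heteroskedastic) iterate(20)
```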

Regarding speed, Stata in general is slow, at least when the ado language is used, because ado is an interpreted language. The only realistic option for speed gains here is parallelisation, if you have Stata/MP. Even in this case, whether any gains are achieved will depend on a number of factors, including which command you use.
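To see whether parallelisation is available at all, one could inspect the relevant c-class values and, under Stata/MP, raise the number of processors in use. This is a sketch; whether it shortens the run still depends on the command being executed.

```stata
* Report whether this is Stata/MP and how many cores it may use
display "Stata/MP (1 = yes): " c(MP)
display "Processors in use:  " c(processors)
display "Maximum allowed:    " c(processors_max)

* Only under Stata/MP: use as many cores as the license and machine allow
if c(MP) {
    set processors `c(processors_max)'
}
```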

Finally, xtserial is a community-contributed command, something which you fail to make clear in your question. It is customary and useful to provide this information right from the start, so others know that you do not refer to an official, built-in command.
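One way to make the dependency explicit in a do-file is to check for the command and install it only when it is missing. This is a sketch; the package name st0039 comes from the question, and the archive URL is my assumption about what the net sj 3-2 shorthand points to.

```stata
* Check whether xtserial is already on the ado-path
capture which xtserial
if _rc {
    * Not found: install package st0039 from the Stata Journal archive
    * (URL assumed to match the "net sj 3-2" shorthand)
    net install st0039, from(http://www.stata-journal.com/software/sj3-2)
}
```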

Upvotes: 3
