Reputation: 52
The reference page says:
Parameters:
X : array-like or sparse matrix, shape (n_samples, n_features)
Training data
y : array_like, shape (n_samples, n_targets)
Target values. Will be cast to X’s dtype if necessary
Is X the exogenous variable? I would assume so, but statsmodels OLS takes the endogenous variable first, so I want to confirm, because the two yield different coefficients.
Upvotes: 1
Views: 266
Reputation: 4264
Yes, you are correct. The order in which you pass the exogenous and endogenous variables in sklearn is reversed compared to statsmodels OLS (this holds for other sklearn models as well).
If X is the exogenous variable and Y is the endogenous variable,
in sklearn you would do something like this:
clf.fit(X,Y)
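whereas in statsmodels the data is passed to the model constructor, with the endogenous variable first:
sm.OLS(Y, X).fit()
Here clf is the sklearn estimator you are trying to build, and sm is statsmodels.api.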
Upvotes: 1