user2333346
user2333346

Reputation: 1131

Equivalent R code for Matlab: How to calculate residual values for the linear model?

I have an r code that I need to translate to Matlab as follows:

xt =  c(-0.227, -0.604,  0.974,  2.639, -0.271, -0.355, -0.551,  0.342,  2.390, -1.257)
sets = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5)
methods = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2)
wt = c( 1, 1, 1, 1, 1, 3, 3, 3, 3, 3)

sets = as.factor(sets)
methods = as.factor(methods)
lm1 <- lm(xt ~ sets + methods, weights = wt)

I need the residual values for the linear model, that is.

lm1$residual

The polyfit function does not except weights!! What function in Matlab will give me the residual values for linear model?

Upvotes: 2

Views: 1843

Answers (1)

rayryeng
rayryeng

Reputation: 104515

I'm going to assume that you have the Statistics Toolbox in MATLAB. If you don't, then this won't work.


The equivalent code in MATLAB is pretty much the same as R. All you have to do is set up a data frame that has your variables, then use fitlm or LinearModel.fit to fit your linear model. fitlm is the more recent version of LinearModel.fit and is available from R2013b and onwards. It is suggested that you use fitlm if you have versions of MATLAB later than this. If you don't, then use LinearModel.fit. lm in R fits a linear model to your predictor variables and outputs while fitlm / LinearModel.fit does the same thing in MATLAB.

What you have to do is define your variables like you have done above, but make sure you encapsulate them in a data frame using the dataset function in MATLAB. After, create factor variables by using the nominal function in MATLAB. You then create your linear model, but specify an additional flag Weights to weight each predictor and output combination using your wt variable. Once you create your linear model, you just access the residuals field through Residuals. You define the input / output relation between the predictor and output variables in the same way in R (a.k.a. Wilkinson notation).

One small note I need to point out is that you must make sure your data are in columns and not rows. You'll see that I'm putting the data in, but using the transpose operator to ensure that the data are in columns. Therefore:

% // Define data
xt = [-0.227, -0.604, 0.974, 2.639, -0.271, -0.355, -0.551, 0.342, 2.390, -1.257].';
sets = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5].';
methods = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2].';
wt = [1, 1, 1, 1, 1, 3, 3, 3, 3, 3].';

%// Create data frame and make categorical data
data = dataset(xt, sets, methods);
data.sets = nominal(data.sets);
data.methods = nominal(data.methods);

%// Create linear model and specify weights
fit = LinearModel.fit(data, 'xt ~ sets + methods', 'Weights', wt);
%// or 
%// fit = fitlm(data, 'xt ~ sets + methods', 'Weights', wt);

%// Access residuals
res = fit.Residuals;

This is the linear model I get:

fit = 


Linear regression model:
    xt ~ 1 + sets + methods

Estimated Coefficients:
                   Estimate    SE         tStat       pValue    
    (Intercept)     -0.0317    0.22889    -0.13849       0.89654
    sets_2         -0.24125    0.25591    -0.94273        0.3992
    sets_3            0.823    0.25591       3.216      0.032403
    sets_4           2.7752    0.25591      10.845    0.00041025
    sets_5          -0.6875    0.25591     -2.6865      0.054855
    methods_2       -0.3884    0.18689     -2.0783       0.10623

Number of observations: 10, Error degrees of freedom: 4
Root Mean Squared Error: 0.362
R-squared: 0.983,  Adjusted R-Squared 0.962
F-statistic vs. constant model: 46.6, p-value = 0.00123

These are the residuals I get:

res = 

    Raw         Pearson     Studentized    Standardized
     -0.1953    -0.53964    -0.64365       -0.69667    
    -0.33105    -0.91474     -1.2672        -1.1809    
      0.1827     0.50483       0.597        0.65173    
    -0.10455    -0.28889    -0.32875       -0.37295    
      0.4482      1.2384      2.3047         1.5988    
      0.0651     0.17988     0.37161        0.40223    
     0.11035     0.30491     0.73161        0.68181    
     -0.0609    -0.16828    -0.34468       -0.37628    
     0.03485    0.096296      0.1898        0.21532    
     -0.1494    -0.41281     -1.3306       -0.92308  

Just to be self-contained, this is what I get from your code in R and we should see that the output is more or less the same:

> summary(lm1)

lm(formula = xt ~ sets + methods, weights = wt)

Weighted Residuals:
       1        2        3        4        5        6        7        8        9       10 
-0.19530 -0.33105  0.18270 -0.10455  0.44820  0.11276  0.19113 -0.10548  0.06036 -0.25877 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  -0.0317     0.2289  -0.138  0.89654    
sets2        -0.2412     0.2559  -0.943  0.39920    
sets3         0.8230     0.2559   3.216  0.03240 *  
sets4         2.7753     0.2559  10.845  0.00041 ***
sets5        -0.6875     0.2559  -2.687  0.05486 .  
methods2     -0.3884     0.1869  -2.078  0.10623    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3619 on 4 degrees of freedom
Multiple R-squared:  0.9831,    Adjusted R-squared:  0.962 
F-statistic: 46.58 on 5 and 4 DF,  p-value: 0.001226

> lm1$residuals

       1        2        3        4        5        6        7        8        9       10 
-0.19530 -0.33105  0.18270 -0.10455  0.44820  0.06510  0.11035 -0.06090  0.03485 -0.14940 

R displays the raw residuals, and this corresponds to the first column of the Residuals matrix in MATLAB. Take note that the residuals are still encapsulated in a data frame (dataset class). If you want to extract the numerical values, you can use dataset2struct to transform each column of the dataset into a field within a structure. That way, you would just access each column using dot notation.

If you use LinearModel.fit, the data frame of residuals is returned as a dataset type. However, if you use fitlm, then the output is actually a table. In that case, you would need to use table2struct to convert the residuals into a structure with the relevant fields.

In other words, you would do something like:

resMatrix = dataset2struct(res); %// If using LinearModel.fit
%// or
%// resMatrix = table2struct(res); %// If using fitlm

This is what I get:

resMatrix = 

10x1 struct array with fields:

    Raw
    Pearson
    Studentized
    Standardized

You can then access each column by:

raw = resMatrix.Raw;
pear = resMatrix.Pearson;
stu = resMatrix.Studentized;
sta = resMatrix.Standardized;

Alternatively you can cast the output as double if you want to extract the raw 2D matrix (i.e. resMatrix = double(res)). If you do it this way, this is what you get:

resMatrix = double(res)

resMatrix =

   -0.1953   -0.5396   -0.6437   -0.6967
   -0.3311   -0.9147   -1.2672   -1.1809
    0.1827    0.5048    0.5970    0.6517
   -0.1046   -0.2889   -0.3288   -0.3730
    0.4482    1.2384    2.3047    1.5988
    0.0651    0.1799    0.3716    0.4022
    0.1103    0.3049    0.7316    0.6818
   -0.0609   -0.1683   -0.3447   -0.3763
    0.0349    0.0963    0.1898    0.2153
   -0.1494   -0.4128   -1.3306   -0.9231

This is now an actual 2D matrix where you can access the individual elements, and can perform slicing operations, filtering operations etc. to your heart's content. In your case, you need the raw residuals, and so you would do raw = resMatrix(:,1);

Upvotes: 3

Related Questions