Lo Real

Reputation: 11

Local Polynomial Regression: Reading nearest neighbor smoothing constants from R code

I'm studying from a textbook on data mining and I can't figure out how the author reads the nn values from the gcv output. The code and output are below:

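## cv
library(locfit)  # provides gcv() and lp()
# ethanol is assumed to be loaded already (e.g. from the CSV linked below)
alpha <- seq(0.20, 1, by = 0.01)
n1 <- length(alpha)
g <- matrix(nrow = n1, ncol = 4)
for (k in seq_along(alpha)) {
  # one row of g per candidate nn value
  g[k, ] <- gcv(NOx ~ lp(EquivRatio, nn = alpha[k]), data = ethanol)
}
g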

[Image: printed output of g]

The CSV file is here: https://github.com/jgscott/ECO395M/blob/master/data/ethanol.csv

I'm using the locfit library in R.

How do you find the nn values from the given output?

Upvotes: 1

Views: 134

Answers (1)

Allan Cameron

Reputation: 174476

The nn values are not read from the output - they are given in the input. In the loop, nn is assigned as the kth value of the object alpha.
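To see that correspondence directly, you can inspect the grid on its own. This is only a small sketch in base R; `n_values` and `rows` are illustrative names that are not in your code:

```r
# Same grid as in the question: 0.20, 0.21, ..., 1.00
alpha <- seq(0.20, 1, by = 0.01)

# One nn value per loop iteration, so g has length(alpha) rows
n_values <- length(alpha)

# Row k of g was fitted with nn = alpha[k]
rows <- c(1, 11, 12, 13)
data.frame(row = rows, nn = alpha[rows])
```

Row k of g and element k of alpha always belong together, because the loop index k is used for both.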

Let's look at the output of the first 16 rows of g, which is the same as the picture you included in your question:

g[1:16,]
#>            [,1]     [,2]      [,3]      [,4]
#>  [1,] -3.220084 18.81266 16.426487 0.1183932
#>  [2,] -3.249601 17.61614 15.436227 0.1154507
#>  [3,] -3.319650 16.77004 14.752039 0.1151542
#>  [4,] -3.336464 15.44404 13.889209 0.1115457
#>  [5,] -3.373011 14.52391 13.115430 0.1099609
#>  [6,] -3.408908 13.96789 12.634934 0.1094681
#>  [7,] -3.408908 13.96789 12.634934 0.1094681
#>  [8,] -3.469254 12.99316 11.830996 0.1085293
#>  [9,] -3.504310 12.38808 11.283837 0.1078784
#> [10,] -3.529167 11.93838 10.928859 0.1073628
#> [11,] -3.546728 11.46960 10.516520 0.1065792
#> [12,] -3.552238 11.26372 10.322329 0.1061728
#> [13,] -3.576083 11.03575 10.135243 0.1062533
#> [14,] -3.679128 10.54096  9.662613 0.1079229
#> [15,] -3.679128 10.54096  9.662613 0.1079229
#> [16,] -3.699044 10.46534  9.578396 0.1082955

Note that rows 11, 12 and 13 were created inside your loop using alpha[11], alpha[12] and alpha[13]. These values were passed to the nn argument of lp. If you want the nn values included in your table, all you need to do is:

cbind(g, nn = alpha)
#>                                                  nn
#>  [1,]  -3.220084 18.812657 16.426487 0.1183932 0.20
#>  [2,]  -3.249601 17.616143 15.436227 0.1154507 0.21
#>  [3,]  -3.319650 16.770041 14.752039 0.1151542 0.22
#>  [4,]  -3.336464 15.444040 13.889209 0.1115457 0.23
#>  [5,]  -3.373011 14.523910 13.115430 0.1099609 0.24
#>  [6,]  -3.408908 13.967891 12.634934 0.1094681 0.25
#>  [7,]  -3.408908 13.967891 12.634934 0.1094681 0.26
#>  [8,]  -3.469254 12.993165 11.830996 0.1085293 0.27
#>  [9,]  -3.504310 12.388077 11.283837 0.1078784 0.28
#> [10,]  -3.529167 11.938379 10.928859 0.1073628 0.29
#> [11,]  -3.546728 11.469598 10.516520 0.1065792 0.30
#> [12,]  -3.552238 11.263716 10.322329 0.1061728 0.31
#> [13,]  -3.576083 11.035752 10.135243 0.1062533 0.32
#> [14,]  -3.679128 10.540964  9.662613 0.1079229 0.33
#> [15,]  -3.679128 10.540964  9.662613 0.1079229 0.34
#> [16,]  -3.699044 10.465337  9.578396 0.1082955 0.35
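If the aim is to pick the nn value with the smallest GCV score, something along these lines may help. It is a sketch under two assumptions: that the fourth column of g is the GCV statistic (locfit's gcv returns a vector named lik, infl, vari, gcv), and that only the 16 rows printed above are considered, so the minimum over the full grid may sit elsewhere. The names `gcv16`, `nn16` and `best` are illustrative.

```r
# GCV scores (column 4) copied from the 16 rows shown above
gcv16 <- c(0.1183932, 0.1154507, 0.1151542, 0.1115457,
           0.1099609, 0.1094681, 0.1094681, 0.1085293,
           0.1078784, 0.1073628, 0.1065792, 0.1061728,
           0.1062533, 0.1079229, 0.1079229, 0.1082955)

# The nn values that produced those rows: alpha[1:16]
nn16 <- seq(0.20, 1, by = 0.01)[1:16]

# Index of the smallest GCV score among these rows, and its nn
best <- which.min(gcv16)
c(row = best, nn = nn16[best], gcv = gcv16[best])
```

Among these 16 rows the smallest score is in row 12, i.e. nn = 0.31. With the full matrix from your loop, the same idea is alpha[which.min(g[, 4])].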

Upvotes: 1
