ivo Welch
ivo Welch

Reputation: 2866

summary.lm output customization

I would like my lm summary output to be a little more compact than usual. I want to remove some newlines, the "Residuals" section, the line with the word "Coefficients". on the positive side, summary.lm is written as a native R function, so presumably I can just copy it to a file, change it, and then source it through my .Rprofile. on the negative side, when I try the first step (copy into emacs and source it), it complains that qr.lm is not found. is there magic, or am I missing something?

how do I redefine it?

summary.lm <- function(object, correlation = FALSE, symbolic.cor = FALSE,
      print.residstable = TRUE, succinct = FALSE, ...)

whatever I will get is not ideal. if someone upstream makes a change in summary.lm, I will have to redo my code. still, in the absence of parameters to control the printing verbosity, I don't know how else to do this.

Upvotes: 4

Views: 3798

Answers (3)

ivo Welch
ivo Welch

Reputation: 2866

it is the print.summary.lm function that needs to be changed, not summary.lm. here is a version that adds a 'concise' option, taking care not to change anything when concise is false :

print.summary.lm <- 
function (x, digits = max(3L, getOption("digits") - 3L), symbolic.cor = x$symbolic.cor,
          signif.stars = getOption("show.signif.stars"), concise = FALSE, ...)
    {
        cat("\nCall:", if(!concise) "\n" else " ", paste(deparse(x$call), sep = "\n", collapse = "\n"),
            if (!concise) "\n\n", sep = "")
        resid <- x$residuals
        df <- x$df
        rdf <- df[2L]
        if (!concise) {
            cat(if (!is.null(x$weights) && diff(range(x$weights)))
                    "Weighted ", "Residuals:\n", sep = "")
        }
        if (rdf > 5L) {
            nam <- c("Min", "1Q", "Median", "3Q", "Max")
            rq <- if (length(dim(resid)) == 2L)
                      structure(apply(t(resid), 1L, quantile), dimnames = list(nam,
                                                                   dimnames(resid)[[2L]]))
                  else {
                      zz <- zapsmall(quantile(resid), digits + 1L)
                      structure(zz, names = nam)
                  }
            if (!concise) print(rq, digits = digits, ...)
        }
        else if (rdf > 0L) {
            print(resid, digits = digits, ...)
        }
        else {
            cat("ALL", df[1L], "residuals are 0: no residual degrees of freedom!")
            cat("\n")
        }
        if (length(x$aliased) == 0L) {
            cat("\nNo Coefficients\n")
        }
        else {
            if (nsingular <- df[3L] - df[1L])
                cat("\nCoefficients: (", nsingular, " not defined because of singularities)\n",
                    sep = "")
            else { cat("\n"); if (!concise) cat("Coefficients:\n")  }
            coefs <- x$coefficients
            if (!is.null(aliased <- x$aliased) && any(aliased)) {
                cn <- names(aliased)
                coefs <- matrix(NA, length(aliased), 4, dimnames = list(cn,
                                                            colnames(coefs)))
                coefs[!aliased, ] <- x$coefficients
            }
            printCoefmat(coefs, digits = digits, signif.stars = signif.stars, signif.legend = (!concise),
                         na.print = "NA", eps.Pvalue = if (!concise) .Machine$double.eps else 1e-4, ...)
        }
        cat("\nResidual standard error:", format(signif(x$sigma,
                                                        digits)), "on", rdf, "degrees of freedom")
        cat("\n")
        if (nzchar(mess <- naprint(x$na.action)))
            cat("  (", mess, ")\n", sep = "")
        if (!is.null(x$fstatistic)) {
            cat("Multiple R-squared: ", formatC(x$r.squared, digits = digits))
            cat(",\tAdjusted R-squared: ", formatC(x$adj.r.squared,
                                                   digits = digits), "\nF-statistic:", formatC(x$fstatistic[1L],
                                                       digits = digits), "on", x$fstatistic[2L], "and",
                x$fstatistic[3L], "DF,  p-value:", format.pval(pf(x$fstatistic[1L],
                                                                  x$fstatistic[2L], x$fstatistic[3L], lower.tail = FALSE),
                                                               digits = digits, if (!concise) .Machine$double.eps else 1e-4))
            cat("\n")
        }
        correl <- x$correlation
        if (!is.null(correl)) {
            p <- NCOL(correl)
            if (p > 1L) {
                cat("\nCorrelation of Coefficients:\n")
                if (is.logical(symbolic.cor) && symbolic.cor) {
                    print(symnum(correl, abbr.colnames = NULL))
                }
                else {
                    correl <- format(round(correl, 2), nsmall = 2,
                                     digits = digits)
                    correl[!lower.tri(correl)] <- ""
                    print(correl[-1, -p, drop = FALSE], quote = FALSE)
                }
            }
        }
        cat("\n")
        invisible(x)
    }

and now

x <- rnorm(100); y <- rnorm(100)+x
print(summary(lm(y ~ x)))
print(summary(lm(y ~ x)), concise=TRUE)

with the first print delivering the standard R printing results and the latter delivering

Call: lm(formula = y ~ x)
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   -0.010      0.102   -0.10     0.92
x              1.009      0.112    9.02  <0.0001 ***

Residual standard error: 1.02 on 98 degrees of freedom
Multiple R-squared:  0.454, Adjusted R-squared:  0.448
F-statistic: 81.4 on 1 and 98 DF,  p-value: <0.0001

PS: this takes statistics for real data more appropriately serious: the p-value for individual coefficients is now limited to 0.0001 and not to the machine precision.

PPS: and if the R team is listening, IMHO this should be a standard R feature.

Upvotes: 3

rawr
rawr

Reputation: 20811

This is more of an aside than an a to your q

One infrequently-used (I think) way of editing functions in packages is edit which is not only a good way to get the source code nicely-formatted but it also uses the namespace so that you do not have to go searching for qr.lm and redefine it in the global or whatever you need to do for the function to find it

I fit this lm and do the summary, it is very verbose

(tmp <- summary(fit <- lm(mpg ~ disp, data = mtcars)))

# Call:
#   lm(formula = mpg ~ disp, data = mtcars)
# 
# Residuals:
#   Min      1Q  Median      3Q     Max 
# -4.8922 -2.2022 -0.9631  1.6272  7.2305 
# 
# Coefficients:
#   Estimate Std. Error t value Pr(>|t|)    
# (Intercept) 29.599855   1.229720  24.070  < 2e-16 ***
#   disp        -0.041215   0.004712  -8.747 9.38e-10 ***
#   ---
#   Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
# Residual standard error: 3.251 on 30 degrees of freedom
# Multiple R-squared:  0.7183,  Adjusted R-squared:  0.709 
# F-statistic: 76.51 on 1 and 30 DF,  p-value: 9.38e-10

edit it and basically replace all the code with function (x) qr.lm(x) and note that qr.lm is not exported meaning that you need to explicitly tell r where to look or it won't work as seen below in my_summ2

Here is the new definition, notice I don't have to use stats:::qr.lm and the environment where this new function lives

(my_summ <- edit(stats:::print.summary.lm))
# function (x) qr.lm(x)
# <environment: namespace:stats>

This is how you might try to do the same thing, but the environment is the global one now

(my_summ2 <- function (x) qr.lm(x))
# function (x) qr.lm(x)

So I can try to use both but only the first works

my_summ(fit)
# $qr
# (Intercept)          disp
# Mazda RX4            -5.6568542 -1.305160e+03
# Mazda RX4 Wag         0.1767767  6.900614e+02
# Datsun 710            0.1767767  1.624463e-01
# Hornet 4 Drive        0.1767767 -5.492561e-02
# Hornet Sportabout     0.1767767 -2.027385e-01
# Valiant               0.1767767 -7.103778e-03
# ...

my_summ2(fit)
# Error in my_summ2(fit) : could not find function "qr.lm"

But both are in the global

ls()
# [1] "fit"      "my_summ"  "my_summ2" "tmp" 

Upvotes: 1

asachet
asachet

Reputation: 6921

Indeed, redefining summary.lm is the way to go for what you want to do.

What you are missing is the concept of namespace in R. summary.lmis a function from the stats package and so has access to internal functions of this package. Only some functions from a package are exported and available once the package is loaded.

qr.lm is precisely such an internal function. It is accessible with the triple ::: operator (see ?/:::``):

> qr.lm
Error: object 'qr.lm' not found

> stats::qr.lm
Error: 'qr.lm' is not an exported object from 'namespace:stats'

> stats:::qr.lm
function (x, ...) 
{
    if (is.null(r <- x$qr)) 
        stop("lm object does not have a proper 'qr' component.\n Rank zero or should not have used lm(.., qr=FALSE).")
    r
}
<bytecode: 0x0000000017983b68>
<environment: namespace:stats>

As you can see, it simply extracts the qr component of the lm object. You can just paste the code instead of calling the function.

Upvotes: 3

Related Questions