Jonathan Whiteley

Reputation: 377

Why does summary() output label factor levels differently for models, depending on previous commands?

I am working on a large R project that performs different analyses of a common dataset. I have built up an individual script for each analysis, as well as high-level scripts that call each one in sequence. Each script starts by calling an init.R script that clears the workspace (rm(list=ls(all=TRUE))).

I have recently discovered that summary() (and, I think, coef()) produces different output depending on the order in which the scripts are run. If the scripts that fit models using lm() or gam() (mgcv package) are run first, in a "fresh" R session, the summary() output lists factor levels with their full labels.

However, if I first run other scripts, which use simple nested aov() calls and produce some graphs and other output using other packages, and then re-run the model-fitting scripts, summary() instead labels the factor levels with numbers (the 'coded' values, not the actual factor level labels).

This is not something I can easily "reproduce" using a minimal working example, unfortunately, because I haven't quite pinpointed where in my scripts this behaviour changes. I have confirmed a few things in quick tests:

Ideally, I would like my project to be able to reproduce all results and output by simply source()ing each script in turn, but this strange 'bug' (in my code - I'm not blaming this on R) means that the output is not consistent and depends on the order :(

Is there anything other than objects or packages that stays in memory that could alter the way model-fitting functions work, or store factor levels in data-frames that are passed in?

EDIT

I realized the answer to the above question was the contrasts option (see below). New question:

How can you reset options() to the default settings, i.e. to the values used when R starts up? The 'factory default' is options(contrasts=c("contr.treatment","contr.poly")), but I'm wondering if there is a way to reset to the internal defaults (in case they aren't 'factory fresh').

Upvotes: 1

Views: 1436

Answers (1)

Jonathan Whiteley

Reputation: 377

After comparing outputs, I realized I was looking at different contrasts, and remembered that the 'offending' script changed the contrasts option from its default:

options(contrasts=c("contr.sum","contr.poly"))
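To illustrate the effect, here is a minimal sketch on toy data. The data frame `d`, the factor `grp` and the response `y` are made up for illustration and are not from my project. With treatment contrasts the coefficient names carry the level labels; with sum-to-zero contrasts they are numbered instead, because contr.sum() returns a contrast matrix without column names.

```r
# Toy data: `d`, `grp` and `y` are hypothetical names, not from the project
set.seed(1)
d <- data.frame(
  grp = factor(rep(c("low", "mid", "high"), each = 4),
               levels = c("low", "mid", "high")),
  y   = rnorm(12)
)

# Treatment contrasts (the 'factory default'); options() returns the
# previous values, so keep them to restore at the end
old <- options(contrasts = c("contr.treatment", "contr.poly"))
treat_names <- names(coef(lm(y ~ grp, data = d)))
print(treat_names)   # level labels appended to the factor name

# Sum-to-zero contrasts, as set by the 'offending' script
options(contrasts = c("contr.sum", "contr.poly"))
sum_names <- names(coef(lm(y ~ grp, data = d)))
print(sum_names)     # numbered coefficients instead of labels

# Put the contrasts option back to what it was before this demo
options(old)
```

The model is the same in both cases; only the parameterisation, and therefore the labelling and meaning of the coefficients, changes.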

So, that explains all the confusion above. Hope that saves somebody else some hair-pulling. New question:

How can you reset options() to the default settings, i.e. to the values used when R starts up?
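As far as I know, base R has no single call that returns every option to its startup value short of restarting the session, so the following is only a sketch of two common workarounds, not a definitive answer. It assumes you can take a snapshot with options() before anything has been changed; `startup_opts`, `old`, `during` and `after` are hypothetical names.

```r
# Snapshot of all options, taken early in a fresh session
startup_opts <- options()

# Workaround 1: a script that needs different contrasts changes them,
# keeps the previous value, and restores it when it is done
old <- options(contrasts = c("contr.sum", "contr.poly"))
during <- getOption("contrasts")
# ... models that need sum-to-zero contrasts would be fitted here ...
options(old)

# Workaround 2: restore everything that was captured in the snapshot
options(startup_opts)
after <- getOption("contrasts")
print(after)
```

Two caveats: options(startup_opts) restores the values of options that existed when the snapshot was taken, but it does not remove options that were added afterwards; and a snapshot stored in the global environment is deleted by rm(list=ls(all=TRUE)), so an init.R that wipes the workspace may be better off setting the options it relies on explicitly, e.g. options(contrasts=c("contr.treatment","contr.poly")).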

Upvotes: 1
