Catherine Laing

Reputation: 657

'factors are not allowed' error when running bam() from the mgcv package in R

I have a dataset as follows:

 str(templates)

 tibble [2,179 x 8] (S3: grouped_df/tbl_df/tbl/data.frame)
  $ Speaker        : Factor w/ 5 levels "Alex","Lily",..: 1 1 1 1 1 1 1 1 1 1 ...
  $ session_ordinal: num [1:2179] 1 1 1 2 2 2 3 3 3 4 ...
  $ structure      : Factor w/ 32 levels "C","CCV","CCVC",..: 5 20 26 5 6 5 5 6 20 4 ...
  $ structurePC    : num [1:2179] 0.55 0.15 0.1 0.3636 0.0303 ...
  $ structure.ord  : Ord.factor w/ 32 levels "C"<"CCV"<"CCVC"<..: 5 20 26 5 6 5 5 6 20 4 ...
  - attr(*, "groups")= tibble [220 x 4] (S3: tbl_df/tbl/data.frame)
   ..$ Speaker        : chr [1:220] "Alex" "Alex" "Alex" "Alex" ...
   ..$ session_ordinal: num [1:220] 1 2 2 3 4 4 5 5 6 6 ...
   ..$ .rows          :List of 220

I'm attempting to fit a GAMM to the data using bam() from the mgcv package. A simple model fits successfully:

 library(mgcv)

 templates.gam.simple <- bam(structurePC ~
                               s(session_ordinal, k = 18) +
                               s(session_ordinal, Speaker, bs = "fs", m = 1, k = 5),
                             data = subset(templates, structurePC >= .1 & session_ordinal < 19),
                             method = "ML")

But as soon as I add further terms involving factors (which I need for my analyses), I get errors. If I add another smooth, s(Speaker, k = 5), to the model, as in:

 templates.gam.1 <- bam(structurePC ~
                          s(session_ordinal, k = 18) +
                          s(Speaker, k = 5) +
                          s(session_ordinal, Speaker, bs = "fs", m = 1, k = 5),
                        data = subset(templates, structurePC >= .1 & session_ordinal < 19),
                        method = "ML")

I get the error message:

 Error in smooth.construct.tp.smooth.spec(object, dk$data, dk$knots) : 
   NA/NaN/Inf in foreign function call (arg 1)
 In addition: Warning messages:
 1: In mean.default(xx) : argument is not numeric or logical: returning NA
 2: In Ops.factor(xx, shift[i]) : ‘-’ not meaningful for factors

If I instead add an interaction, ti(session_ordinal, Speaker, k = c(18, 5)), to the initial model:

 templates.gam.2 <- bam(structurePC ~
                          s(session_ordinal, k = 18) +
                          ti(session_ordinal, Speaker, k = c(18, 5)) +
                          s(session_ordinal, Speaker, bs = "fs", m = 1, k = 5),
                        data = subset(templates, structurePC >= .1 & session_ordinal < 19),
                        method = "ML")

I get:

 Error in quantile.default(xu, seq(0, 1, length = nk)) : 
   factors are not allowed

Upvotes: 1

Views: 1537

Answers (1)

Gavin Simpson

Reputation: 174908

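You can't make a standard spline from a factor variable. s(Speaker, k = 5) uses the default thin plate regression spline basis, which needs a numeric covariate; that is why the error comes from smooth.construct.tp.smooth.spec. When you use the "fs" basis, you pass a continuous variable (x, say) and a factor variable (f, say):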

s(x, f, bs = 'fs')

This sets up a spline in x for each level of f, where the splines all share the same smoothness penalty. These terms also include a random intercept for the group means of f.

This is a special basis construction.
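
As a rough illustration of the "fs" construction, here is a minimal sketch on simulated data. The data frame dat, the variables x, f and y, and the model name m_fs are all made up for the example; only bam(), s() and the "fs" basis come from mgcv.

```r
library(mgcv)

set.seed(1)

# Hypothetical data: 5 groups, each following its own smooth trend in x
n_per   <- 60
f       <- factor(rep(paste0("g", 1:5), each = n_per))
x       <- rep(seq(0, 1, length.out = n_per), times = 5)
offsets <- c(-1, -0.5, 0, 0.5, 1)  # group-specific shifts (assumed values)
y       <- sin(2 * pi * x) + offsets[as.integer(f)] * (1 + x) +
  rnorm(length(x), sd = 0.2)
dat     <- data.frame(y = y, x = x, f = f)

# An average smooth of x, plus one smooth of x per level of f;
# the per-level smooths are fitted as a single "fs" term
m_fs <- bam(y ~ s(x, k = 10) + s(x, f, bs = "fs", m = 1, k = 5),
            data = dat, method = "ML")

summary(m_fs)
```

Note that x is numeric and f is a factor, in that order, which mirrors s(session_ordinal, Speaker, bs = "fs", ...) in the question.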

If you want to include random intercepts for other factor variables, then you need to use another special basis type, the random effect or "re" basis:

s(Speaker, bs = 're')
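
A minimal sketch of the "re" basis in a full model call, again on simulated data. The names dat, x, g, y, g_eff and m_re are hypothetical and exist only for this example.

```r
library(mgcv)

set.seed(2)

# Hypothetical data: a numeric covariate x and a grouping factor g (8 levels)
g     <- factor(rep(letters[1:8], each = 40))
x     <- runif(length(g))
g_eff <- rnorm(8, sd = 0.7)  # simulated random intercepts, one per level
y     <- sin(2 * pi * x) + g_eff[as.integer(g)] + rnorm(length(g), sd = 0.3)
dat   <- data.frame(y = y, x = x, g = g)

# s(g, bs = "re") fits one penalised coefficient per level of g.
# Unlike s(g, k = 5), it does not try to build a thin plate spline
# from the factor, so the 'not meaningful for factors' problem is avoided.
m_re <- bam(y ~ s(x, k = 10) + s(g, bs = "re"), data = dat, method = "ML")

# One estimated intercept deviation per level of g
coef(m_re)[grep("s(g)", names(coef(m_re)), fixed = TRUE)]
```

In the question's model the analogous change would be replacing s(Speaker, k = 5) with s(Speaker, bs = "re"), although the "fs" term there already carries random intercepts for Speaker.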

I'm not sure what you mean by including the ti() term; the "fs" smooth is already an interaction between session_ordinal and Speaker. Looking at session_ordinal suggests this might not be a factor but you may have coded it via integers 1,2,...,n? If you explain what model you want (what effects you want) I can suggest solutions...

...but in general you can include a factor in a tensor product smooth if you tell mgcv to use a random effect basis for the factor marginal term:

te(x, f, bs = c('cr', 're'))

where cr means cubic regression spline and is the default basis for tensor products. You can do this with t2() and ti() too.
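
For a sense of how the tensor product version looks in a complete call, here is a small sketch on simulated data. The names dat, x, f, y, slope and m_te are invented for the example, and k = c(10, 6) is an assumed choice (10 basis functions for x, and 6 to match the number of levels of f).

```r
library(mgcv)

set.seed(3)

# Hypothetical data: 6 groups whose trend in x differs by a group-specific slope
f     <- factor(rep(paste0("g", 1:6), each = 50))
x     <- rep(seq(0, 1, length.out = 50), times = 6)
slope <- rnorm(6)  # simulated group-level slopes
y     <- sin(2 * pi * x) + slope[as.integer(f)] * x +
  rnorm(length(x), sd = 0.2)
dat   <- data.frame(y = y, x = x, f = f)

# Tensor product of a cubic regression spline in x and a
# random effect basis for the factor f; the bs vector gives
# one basis type per marginal, in the same order as the covariates
m_te <- bam(y ~ te(x, f, bs = c("cr", "re"), k = c(10, 6)),
            data = dat, method = "ML")

summary(m_te)
```

Applied to the question's second model, the same idea would presumably read ti(session_ordinal, Speaker, bs = c("cr", "re"), k = c(18, 5)); that is untested on the original data.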

Upvotes: 3
