user2502904

Reputation: 319

R: Computation failed in `stat_smooth()`: x has insufficient unique values to support 10 knots: reduce k

I am following the last block of code in https://drsimonj.svbtle.com/plot-some-variables-against-many-others and have adapted it to my data.

In this code:

t3 %>%
  gather(-Border, key = "var", value = "value") %>%
  ggplot(aes(x = value, y = Border)) +
  geom_point() +
  stat_smooth() +
  facet_wrap(~ var, scales = "free") +
  theme_bw()

I get this error message: Computation failed in stat_smooth(): x has insufficient unique values to support 10 knots: reduce k.

The code runs without the stat_smooth() call, but I need the smooth line.

Apart from one var with 20 unique values, every var has 5 or 6 unique values. How do I reduce k? Is a k of 5 reasonable? The sample size is 1,000.

Thanks

Upvotes: 5

Views: 3919

Answers (1)

Allan Cameron

Reputation: 173793

Obviously, we don't have your data, but we can generate some data that reproduces your problem:

library(ggplot2)

set.seed(1)

df <- data.frame(x = rep(1:10, 150), y = rnorm(1500), 
                 group = c(rep("A", 1490), rep("B", 10)))[-1500,]

ggplot(df, aes(x, y, color = group)) + stat_smooth()
#> `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
#> Warning: Computation failed in `stat_smooth()`
#> Caused by error in `smooth.construct.cr.smooth.spec()`:
#> ! x has insufficient unique values to support 10 knots: reduce k.

The error arises because, if your largest group has 1,000 or more data points, stat_smooth will by default fit a generalized additive model (GAM) to every group. From the docs:

For method = NULL the smoothing method is chosen based on the size of the largest group (across all panels). stats::loess() is used for less than 1,000 observations; otherwise mgcv::gam() is used with formula = y ~ s(x, bs = "cs")

This means that if one or more of the groups is very small, stat_smooth still fits a GAM to it with these default settings. The fit fails for any group whose x has fewer unique values than the default number of knots (the 10 named in the error message).
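
To see which groups are affected, it can help to count the rows and the unique x values in each group. The following is a minimal sketch in base R using the simulated df from above rather than the asker's data. The names group_sizes and unique_x are just illustrative.

```r
# Recreate the simulated data from the reprex above
set.seed(1)
df <- data.frame(x = rep(1:10, 150), y = rnorm(1500),
                 group = c(rep("A", 1490), rep("B", 10)))[-1500, ]

# Rows per group: with method = NULL, the largest group decides
# whether loess or gam is used for every group
group_sizes <- table(df$group)

# Unique x values per group: the default gam smooth needs at least
# as many unique values as knots (10 in the error message)
unique_x <- tapply(df$x, df$group, function(v) length(unique(v)))

print(group_sizes)
print(unique_x)
```

Here group A is large enough to trigger the gam method, while group B has fewer than 10 unique x values, which is the combination that produces the error. For faceted data such as the question's, the same counts would be taken per facet variable.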

We can fix this by specifying method = "loess" in stat_smooth:

ggplot(df, aes(x, y, color = group)) + stat_smooth(method = "loess")
#> `geom_smooth()` using formula = 'y ~ x'

Created on 2022-11-14 with reprex v2.0.2
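
The question also asks how to reduce k directly. One way, sketched here on the simulated df rather than the asker's t3 data and assuming the mgcv package is installed, is to keep method = "gam" and pass a smaller basis dimension through the formula argument. The value k = 5 is only illustrative. As far as the error message indicates, k should not exceed the smallest number of unique x values in any group or facet.

```r
library(ggplot2)

# Recreate the simulated data from the reprex above
set.seed(1)
df <- data.frame(x = rep(1:10, 150), y = rnorm(1500),
                 group = c(rep("A", 1490), rep("B", 10)))[-1500, ]

# Keep the gam smoother but lower the basis dimension from the
# default of 10 to 5 (illustrative value, not a recommendation)
p <- ggplot(df, aes(x, y, color = group)) +
  stat_smooth(method = "gam", formula = y ~ s(x, bs = "cs", k = 5))

p
```

In the question's pipeline the same arguments would go inside the existing stat_smooth() call. Whether a spline with so few unique x values is a sensible summary is a modelling judgment that this sketch does not settle. With only 5 or 6 distinct values, method = "loess" or method = "lm" may be just as informative.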

Upvotes: 9
