Skip to content

Conversation

asgersvenning
Copy link

This PR follows from the discussion in #358.

The PR resolves behavior with the draw.gam function when using terms like ti(x, fac, bs=c("tp", "re")) or ti(x, y, fac, bs=c("tp", "re")).
Previously an error:

Error in map():
ℹ In index: 2.
Caused by error in Summary.factor():
! ‘min’ not meaningful for factors
Run rlang::last_trace() to see where the error occurred.

would be raised when there was only one continuous term (x) term, but the error was caused by invalid behavior, not by a check. When there was two continuous terms it would mysteriously work, but in fact, this turns out to be invalid behavior also.

The first reason for this can be found via the stack trace:

 Backtrace:
     ▆
  1. ├─... %>% draw
  2. ├─gratia::draw(.)
  3. └─gratia:::draw.gam(.)
  4.   └─base::eval(.call, parent.frame())
  5.     └─base::eval(.call, parent.frame())
  6.       ├─gratia::assemble(object = .)
  7.       └─gratia:::assemble.gam(object = .)
  8.         ├─gratia::smooth_estimates(...)
  9.         └─gratia:::smooth_estimates.gam(...)
 10.           └─purrr::map(...)
 11.             └─purrr:::map_("list", .x, .f, ..., .progress = .progress)
 12.               ├─purrr:::with_indexed_errors(...)
 13.               │ └─base::withCallingHandlers(...)
 14.               ├─purrr:::call_with_cleanup(...)
 15.               ├─gratia (local) .f(.x[[i]], ...)
 16.               └─gratia:::eval_smooth.tensor.smooth(.x[[i]], ...)
 17.                 ├─gratia::too_far_to_na(...)
 18.                 └─gratia:::too_far_to_na.tensor.smooth(...)
 19.                   └─gratia:::.call_too_far(...)
 20.                     └─gratia::too_far(...)
 21.                       └─mgcv::exclude.too.far(x, y, ref_1, ref_2, dist = dist)
 22.                         └─base::Summary.factor(`<fct>`, na.rm = FALSE)
 23.                           └─base::stop(gettextf("%s not meaningful for factors", sQuote(.Generic)))

Which also explains why it works for one continuous variable + random effect, but not not two continuous variables + random effect, since this check:

gratia/R/smooth-estimates.R

Lines 988 to 996 in 03bb56e

# set some values to NA if too far from the data
if (smooth_dim(smooth) == 2L && (!is.null(dist) && dist > 0)) {
eval_sm <- too_far_to_na(smooth,
input = eval_sm,
reference = model[["model"]],
cols = c(".estimate", ".se"),
dist = dist
)
}

Makes sure that gratia:::eval_smooth.tensor.smooth only calls gratia:::too_far_to_na when there are two variables in total, not three or more.

This issue could then be fixed by adding a type check around:
if (is.numeric(input[[sm_vars[1L]]]) && is.numeric(input[[sm_vars[2L]]])) {

gratia/R/too_far.R

Lines 165 to 171 in 03bb56e

ind <- too_far(
x = input[[sm_vars[1L]]],
y = input[[sm_vars[2L]]],
ref_1 = reference[[sm_vars[1L]]],
ref_2 = reference[[sm_vars[2L]]],
dist = dist
)

in gratia:::.call_too_far, which fixes the error in the first case to go away, but now both cases have the same undefined behaviour (the plots treat the variable as continuous). To identify why the wrong type of plot was used, we have to look at the heuristic in gratia::draw_smooth_estimates:

gratia/R/smooth-estimates.R

Lines 1198 to 1237 in 03bb56e

if (sm_dim == 1L &&
sm_type %in% c(
"TPRS", "TPRS (shrink)", "CRS", "CRS (shrink)",
"Cyclic CRS", "Cyclic P spline", "P spline", "B spline", "Duchon spline",
"GP",
"Mono inc P spline",
"Mono dec P spline",
"Convex P spline",
"Concave P spline",
"Mono dec conv P spline",
"Mono dec conc P spline",
"Mono inc conv P spline",
"Mono inc conc P spline",
"Mono inc 0 start P spline",
"Mono inc 0 start P spline"
)) {
class(object) <- append(class(object), "mgcv_smooth", after = 0L)
} else if (grepl("1d Tensor product", sm_type, fixed = TRUE)) {
class(object) <- append(class(object), "mgcv_smooth", after = 0L)
} else if (sm_type == "Random effect") {
class(object) <- append(class(object),
c("random_effect", "mgcv_smooth"),
after = 0
)
} else if (sm_type == "Factor smooth") {
class(object) <- append(class(object),
c("factor_smooth", "mgcv_smooth"),
after = 0
)
} else if (sm_type == "Constr. factor smooth") {
class(object) <- append(class(object),
c("sz_factor_smooth", "mgcv_smooth"),
after = 0
)
} else if (sm_type == "SOS") {
class(object) <- append(class(object),
c("sos", "mgcv_smooth"),
after = 0
)
} else if (sm_dim == 2L) {

Since sm_type is always "Tensor product int. for ti and ("Tensor product int." for te) _[defined as such from smooth_type.tensor.smooth`]_ ,none of the 1D cases are able to capture the term, even if it only has one continuous basis.
To address this I add a separate condition to the "Factor smooth" case:

...
  } else if (
    sm_type == "Factor smooth" || (
      sm_type %in% c("Tensor product int.", "Tensor product") && 
      any(map_lgl(object[sm_vars], is.factor))
    )) {
    class(object) <- append(class(object),
      c("factor_smooth", "mgcv_smooth"),
      after = 0
    )
...

This way tensor products with a random effect basis are handled by the "Factor smooth" plotting function, which both properly handles the univariate continuous case properly, and properly raises a warning for the bivariate case.

In the last case (bivariate) it did unfortunately have the side-effect of causing an error later in gratia:::assemble.gam because the skipped term wasn't removed from the evaluated smooths, however this was fixed simply by slicing both the list of generated plots and the data frame of evaluated smooths by the NULL mask (indicating plot skipped due to missing implementation):

    no_plot <- map_lgl(sm_plts, is.null)
    sm_plts <- sm_plts[!no_plot]
+   sm_eval <- sm_eval[!no_plot,]

I also added unit tests that makes sure that the univariate case works, while the bivariate case properly skips the term, without throwing an error.

PS: This is my first PR to a public repo I am not personally involved in, so feel free to tell me if I did something wrong ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants