Model selection criteria

Hi all,

My name is Enton. I’m a biology graduate student.
I’m relatively new to R (and stats, in general). My current project involves finding common trends from a bunch of time series. Initially I tried the MARSS package, but it is quite slow when dealing with larger sample size.
Recently I’ve been trying to use the greta package to perform dynamic factor analysis, based on the project Mark and Nick created (https://mdscheuerell.github.io/gretaDFA/). A problem I came acorss is to compare different models.There seem to be no built in functions to get a model selection criteria (e.g. DIC). I read the othre thread (https://forum.greta-stats.org/t/deviance-information-criterion-dic-in-greta/161/7) but am still confused about how to calculate it…
Any help would be appreciated!

1 Like

Hi @EntonZ, not sure if you’re still pursuing this, but maybe this helps.

Below is an experimental function that I pulled from another project, so it may not work well. Let me know. You’ll probably need to load the loo package. So the idea is to borrow the various suggestions in the post you referred, calculate the log-likelihood (using dnorm for normal response, for example), and then borrow the loo or waic functions from the loo package to do the rest. This is assuming that you want LOOIC or WAIC as the model selection criteria.

Let me know if this works or not. We probably need to generalise the code…

#' Obtain LOO-IC or WAIC from \code{greta} Output
#'
#' @param observed Vector of the observed response variable
#' @param posterior MCMC posterior from a \code{greta} model
#' @param mean Name of the mean/location parameter, defaults to \code{mu}
#' @param scale Name of the scale/variance parameter, defaults to \code{sd}
#' @param family Distribution of the observational model. Currently, only
#' \code{normal}, \code{lognormal}, and \code{student} are implemented
#' @param method
#' \code{"loo"} to calculate the Leave-One-Out Information Criteria or
#' \code{"waic"} for the Widely Applicable Information Criteria in the \code{loo} package
#' @param moment_match Logical, whether to use moment_match for loo.
#' @param ... additional arguments passed to distribution functions
#'
#' @return LOO-IC or WAIC; see \code{?loo::loo} for more details.
#'
#' @examples
#' obs <- iris$Sepal.Length
#' loo_greta(obs, posterior_lognormal, mean = "mean", scale = "sd",
#'           family = "lognormal", method = "loo")
#' loo_greta(obs, posterior_normal, mean = "mean", scale = "sd",
#'           family = "normal", method = "loo")
#'
#' @import loo
#' @export

loo_greta <- function(observed,
                      posterior,
                      mean = "mu",
                      scale = "sd",
                      family = c("normal", "lognormal", "student"),
                      method = c("loo", "waic"),
                      moment_match = FALSE,
                      ...) {
    # check that arguments are consistent with the implemented options
    family <- match.arg(family)
    method <- match.arg(method)

    # extract the relative variables locally
    mu <- posterior[[mean]]
    sd <- posterior[[scale]]
    nsim <- nrow(mu)

    # determine which density function should be used
    dfunc <- switch(family,
        normal    = dnorm,
        lognormal = dlnorm,
        student   = dlst
    )

    # generate a matrix of log-likelihood values for each posterior predicted value
    LL_mat <- t(sapply(
        seq_len(nsim),
        function(i,mu,sd){
            dfunc(observed, mu[i,,], sd[i,,], log = TRUE, ...)
        },
        mu = mu,
        sd = sd
    ))

    # compute the relative efficiencies of the posterior predicted values
    rel_eff <- relative_eff(exp(LL_mat), chain_id = rep(1, nsim))

    # estimate the LOO-IC, WAIC, etc
    out <- switch(method,
        loo  = loo(LL_mat, r_eff = rel_eff, moment_match = moment_match),
        waic = waic(LL_mat, r_eff = rel_eff)
    )

    return(out)

}
1 Like