Imperfect detection abundance model

hrlai · March 29, 2021, 9:35pm

Hi! I am exploring how to incorporate imperfect detection into abundance joint species distribution modelling after reading Tobler et al. (2019), which then led me to Yamaura et al. (2012). This may be partly related to Spatial capture-recapture model or State-space models in greta .

Below are two code chunks, one to simulate abundance data and another to fit model (apologies if they look ugly!) To jump ahead, I ran into an error while trying to implement Yamaura et al. (2012)'s model:

Error: model contains a discrete random variable that doesn't have a fixed value, so cannot be sampled from

I think it is because I cannot use poisson() for unobserved variable (commented in code below)? This is also hinted in the helpfile. Has anyone come up with a alternative solution? It would be very helpful and thanks in advance!

Simulate data

# Simulate data to test fit_model.R script

# Following the data generative process in 
# Yamaura et al. (2012) Biodiversity and Conservation

set.seed(123)

# number of species
n_spp  <- 10   
# number of sites  
n_site <- 100 
# number of sampling events
t <- 10

# probability of detection for each species
# assuming the same p_j across sites for now
alpha0_true <- rnorm(n_spp, 2, 1)
mu_true     <- alpha0_true
sigma_true  <- rexp(n_spp, 2)
p_true      <- numeric(n_spp)
for (j in seq_len(n_spp)) {
    logit_p_true <- rnorm(1, mu_true, sigma_true)
    p_true[j] <- exp(logit_p_true) / (exp(logit_p_true) + 1)
}

# true abundance
# using Poisson for now (explore neg. binom. later)
beta0_true  <- rnorm(n_spp, 2, 1)
lambda_true <- exp(rep(1, n_site) %*% t(beta0_true))
N_true <- matrix(rpois(n_site * n_spp, lambda_true), n_site, n_spp)
    
# observed 
n_true <- array(NA, dim = c(n_site, n_spp, t))
for (i in seq_len(n_site)) {
    for (j in seq_len(n_spp)) {
        n_true[i, j, ] <- rbinom(t, N_true[i, j], p_true[j])
    }
}

Fit model

library(greta)

# source("code/sim_data.R")

# data
n <- as_data(n_true)

# variables
X <- ones(n_site)
V <- ones(n_site)

# priors
sd_beta0 <- exponential(2)
beta0    <- normal(0, sd_beta0, dim = n_spp)
lambda   <- exp(X %*% t(beta0))
N        <- poisson(lambda)    # I think this doesn't work as per helpfile
N <- do.call(abind, c(replicate(t, N, simplify = FALSE), along = 3))

sd_alpha0   <- exponential(2)
alpha0      <- normal(0, sd_alpha0, dim = n_spp)
mu          <- V %*% t(alpha0)
sigma       <- exponential(2, dim = n_spp)
sigma <- do.call(rbind, replicate(n_site, t(sigma), simplify = FALSE))
# p           <- ilogit(multivariate_normal(mu, sigma))
p           <- ilogit(normal(mu, sigma))
p <- do.call(abind, c(replicate(t, p, simplify = FALSE), along = 3))

# observational model
distribution(n) <- binomial(N, p)
mod <- model(N, p)

hrlai · April 11, 2021, 3:25am

Just a follow up. I think what I tried to do above is fundamentally against HMC sampling as stated here in the vignette:

A downside of HMC is that it can’t be used to sample discrete variables (e.g. integers), so we can’t specify a model with a discrete distribution (e.g. Poisson or Binomial), unless it’s in the likelihood. If we try to build such a model, greta will give us an error when we run model() . Future versions of greta will implement a wider range of MCMC samplers, including some for discrete variables.

I guess I just need to wait for future greta dev?

Vfeagins · July 9, 2021, 11:11pm

I am trying to develop a model I have running in JAGS in greta and I have also encountered this error. I am trying to implent the model from Wheeler et al. (2019)
Here is the JAGS code used in the paper

model{
  a ~ dnorm(mean.a,p.a) I(0,)
  c ~ dbeta(alpha.c,beta.c)
  k ~ dnorm(mean.k,p.k)
  prec ~ dgamma(s1,s2)
  alp ~ dunif(1,100)
  bet ~ dunif(1,100)
  p.cloud ~ dunif(0,1)
  for(i in 1:n){
  muL[i] <- -a * exp(-1 * (x[i]-k)) + c + a
  muR[i] <- -a * exp((x[i]-k)) + c + a
  f[i] <- ifelse(x[i]>k,muR[i],muL[i])   #change point process model
  y[i] ~ dnorm(mu[i],prec)   ##data model
  is.cloudy[i] ~ dbern(p.cloud)
  trans[i] ~ dbeta(alp,bet)
  mu[i] <- is.cloudy[i] * trans[i]*f[i] + (1-is.cloudy[i]) * f[i]

Here is my attempted greta code to translate this

x <-  as_data(data$x)
y <-  as_data(data$y)

a <- normal(mean = data$mean.a, sd = sqrt(1/data$p.a), truncation = c(0, Inf))

c <- beta(data$alpha.c, data$beta.c)

k <- normal(data$mean.k, sd = sqrt(1/data$p.k))

prec <-  gamma(data$s1, data$s2)

alp <-  uniform(1,100)

bet <- uniform(1,100)

p.cloud <-  uniform(0,1)


muL <- -a * exp(-1 * (x-k)) + c + a

muR <- -a * exp((x-k)) + c + a



f <- (x>k * muR) + (x>k * muL)

is.cloudy <- bernoulli(p.cloud)

trans <-  beta(alp, bet)

mu <- is.cloudy* trans*f + (1-is.cloudy) * f

distribution(y) <- normal(mu, sd = sqrt(1/prec))


m <- model()

How to solve this issue. Or do I just stick with JAGS?

hrlai · July 11, 2021, 4:09am

Hi @Vfeagins good to know I’m not struggling alone

I don’t have a solution for you/us, but I just saw that sampling discrete variable is still part of the near-term plan mentioned here (see section #4).

That said, some potential solutions may lie here. I’m still chewing on Nick’s code to understand how it may work.