Generalised latent variable model example

Hi Nick/everyone,

I’m looking to implement a generalised linear latent variable model (GLLVM), likes the ones of Hui 2015 in there R package BORAL.I was wondering if Nick, or someone else, could post an example of how one would do this in greta?

The package uses JAGS, so below is an example JAGS model from the spider data in the mvabund package (used the Hui 2016 paper). It has two latent variables, a row effect to control for sequencing depth and counts modelled as a negative binomial.

 model {
	 ## Data Level ## 
	 for(i in 1:n) {
		 for(j in 1:p) { eta[i,j] <- inprod(lv.coefs[j,2:(num.lv+1)],lvs[i,]) + row.coefs.ID1[row.ids[i,1]] }
		 for(j in 1:p) { u[i,j] ~ dgamma(1/lv.coefs[j,num.lv+2], 1/lv.coefs[j,num.lv+2]) }
		 for(j in 1:p) { y[i,j] ~ dpois(exp(lv.coefs[j,1] + eta[i,j ])*(u[i,j])) } ## Parameterizing the NB as a multiplicative random effect models

		 }
	 ## Latent variables ##
	 for(i in 1:n) { for(k in 1:num.lv) { lvs[i,k] ~ dnorm(0,1) } } 

	 ## Process level and priors ##
	 for(j in 1:p) { lv.coefs[j,1] ~ dnorm(0,0.1) } ## Separate species intercepts

	 for(i in 1:n.ID[1]) { row.coefs.ID1[i] ~ dnorm(0,0.1) } 

	 for(i in 1:(num.lv-1)) { for(j in (i+2):(num.lv+1)) { lv.coefs[i,j] <- 0 } } ## Constraints to 0 on upper diagonal
	 for(i in 1:num.lv) { lv.coefs[i,i+1] ~ dnorm(0,0.1)I(0,) } ## Sign constraints on diagonal elements
	 for(i in 2:num.lv) { for(j in 2:i) { lv.coefs[i,j] ~ dnorm(0,0.1) } } ## Free lower diagonals
	 for(i in (num.lv+1):p) { for(j in 2:(num.lv+1)) { lv.coefs[i,j] ~ dnorm(0,0.1) } } ## All other elements
	 for(j in 1:p) { lv.coefs[j,num.lv+2] ~ dunif(0,30) } ## Dispersion parameters

	 }

Thanks for your help,

Chris

Yeah, I’ve played with these a little. I think Francis (author of boral) has been running some big ones using greta too.

Here’s an example based on the examples in the boral::boral() helpfile. Hopefully it’s a good jumping off point for you.

Some things to bear in mind:

  • there’s an non-identifiability issue with these models (which is why boral only runs one chain)
  • the priors for coefficients, loadings and latent factors is the same in this example as in boral, but the negative binomial dispersion parameter isn’t - it’s a guess and might be wildly inappropriate
library(mvabund)
data(spider)
y <- spider$abun
X <- scale(spider$x)

# numbers of observations, species, covariates, and latent variables
n_obs <- nrow(y)
n_sp <- ncol(y)
n_cov <- ncol(X)
n_lv <- 2

# equivalent boral model
spiderfit_nb <- boral::boral(y, X,
                             family = "negative.binomial",
                             lv.control = list(num.lv = n_lv),
                             save.model = TRUE)

library(greta)

# ~~~~
# priors

# NB dispersion - not at all sure about this prior!
size <- normal(0, 10, truncation = c(0, Inf))

# separate species intercepts
int <- normal(0, 10, dim = n_sp)

# regression coefficients
coef <- normal(0, 10, dim = c(n_cov, n_sp))

# loadings matrix, with constraints like in boral
loadings <- zeros(n_sp, n_lv)
diag(loadings) <- normal(0, 10, dim = n_lv, truncation = c(0, Inf))
lower_idx <- lower.tri(loadings)
loadings[lower_idx] <- normal(0, 10, dim = sum(lower_idx))

# latent variables
lvs <- normal(0, 1, dim = c(n_obs, n_lv))

# ~~~~
# model

env <- X %*% coef
fixed <- sweep(env, 2, int, "+")
random <- lvs %*% t(loadings)
eta <- fixed + random
mu <- exp(eta)

# ~~~~
# likelihood

# get negative binomial probability and define likelihood
prob <- 1 / (1 + mu / size)
distribution(y) <- negative_binomial(size, prob)

# ~~~~
# build and fit the model

m <- model(int, size)
draws <- mcmc(m, chains = 64)

# note that a large number of chains is probably a good idea with greta, it'll make
# much more efficient use of your CPUs)
2 Likes

Thanks a lot, I appreciate the quick reply.

Yeah I did read that sign switching can occur across multiple chains.
I will do some experimentation.

Cheers,
Chris