Getting different results in greta and in rethinking


I’m going through exercises from Richard McElreath’s Statistical Rethinking book, and I’m trying to get myself acquainted with greta in the process. I’m having some specific issues with Exercise 12M4 from the book. I’m not sure what I’m doing differently between the rethinking code and the greta code.


chimpanzees$recipient <- NULL
chimpanzees$block_id <- chimpanzees$block

m2 <- map2stan(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a + a_actor[actor] + a_block[block_id] + (bp + bpc * condition) * prosoc_left,
    a_actor[actor] ~ dnorm(mean_actor, sigma_actor),
    a_block[block_id] ~ dnorm(mean_block, sigma_block),
    c(a, mean_actor, mean_block, bp, bpc) ~ dnorm(0, 10),
    c(sigma_actor, sigma_block) ~ dcauchy(0, 1)
  ), data = chimpanzees, warmup = 1000, iter = 6000, chains = 4, cores = 3


The trace plots show a healthy mixing of the chains.

Here’s my attempt in greta:

actor <- as_data(chimpanzees$actor)
block_id <- as_data(chimpanzees$block)
condition <- as_data(chimpanzees$condition)
prosoc_left <- as_data(chimpanzees$prosoc_left)
pulled_left <- as_data(chimpanzees$pulled_left)

a <- normal(0, 10)
a_a <- normal(0, 10)
a_b <- normal(0, 10)
sigma_act <- cauchy(0, 1, truncation = c(0, Inf))
sigma_block <- cauchy(0, 1, truncation = c(0, Inf))
beta_p <- normal(0, 10)
beta_pc <- normal(0, 10)
a_act <- normal(a_a, sigma_act, dim = max(chimpanzees$actor))
a_block <- normal(a_b, sigma_block, dim = max(chimpanzees$block))

chimp_model <- a + a_act[actor] + a_block[block_id] + (beta_p + beta_pc * condition) * prosoc_left
p_chimps <- ilogit(chimp_model)
distribution(pulled_left) <- binomial(1, p_chimps)

chimp_model_results <- model(a, a_a, a_b, a_act, a_block, beta_p, beta_pc)
chimp_model_draws <- mcmc(chimp_model_results)


The trace plots here are brutal. What am I missing in order to replicate the results?


I think the issue is in these two lines:

a_act <- normal(a_a, sigma_act, dim = max(chimpanzees$actor))
a_block <- normal(a_b, sigma_block, dim = max(chimpanzees$block))

In the rethinking book, the mean of these variables is 0 (not a_a or a_b). If you change the mean of these two normally distributed random variables to zero, then the trace plots will look much better. In other words, this works better as a prior and is consistent with the rethinking book:

a_act <- normal(0, sigma_act, dim = max(chimpanzees$actor))
a_block <- normal(0, sigma_block, dim = max(chimpanzees$block))


Thanks for the reply ajf. Yes, in p.373 of his book you’re correct that is how it should be specified. However, in his exercise 12M4 (p.385), the exercise involves letting the means of those variables to have their own priors.


Hi Phil,

When I run the map2stan code (with default stan parameters) I get lots of warnings about divergent transitions and treedepth:

Warning messages:
1: There were 187 divergent transitions after warmup. Increasing adapt_delta above 0.95 may help. See 
2: There were 1173 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See 
3: Examine the pairs() plot to diagnose sampling problems

This plus the interesting wording of the question makes me wonder whether this is designed to be a bad model. This would mean that greta is just showing this in a different (and more violent) way to stan.

Do you think this is a possibility?

All the best!


Hmm, you’re probably correct. There are some other exercises in which he asks to run models that poorly converge to teach how to diagnose and determine why the model is poor. Thanks for bringing that up!