Constraining a parameter (not working apparently)

clquezada · June 6, 2022, 1:47pm

Hello everyone

I am somewhat new to greta, and I would like to calculate a simple mixing model for stable isotope ecology (I’m implementing a new model for trophic position estimation in greta, formerly calculated with BUGS: https://github.com/clquezada/tRophicPosition).

The thing is that I’m trying to calculate the following:

library(greta)

# simulate some data
dCb1 <- rnorm(25, -15)
dCb2 <- rnorm(25, -10)
dCc <- rnorm(25, -12.5)

#vague priors
muCb1 <- normal(0, 1)
sdCb1 <- cauchy(0, 3, truncation = c(0, Inf))
muCb2 <- normal(0, 1)
sdCb2 <- cauchy(0, 3, truncation = c(0, Inf))

muCc <- normal(0, 1)
sdCc <- cauchy(0, 3, truncation = c(0, Inf))

distribution(dCc) <- normal(muCc, sdCc)
distribution(dCb1) <- normal(muCb1, sdCb1)
distribution(dCb2) <- normal(muCb2, sdCb2)

# here is the important thing. I want to constrain alpha to [0 - 1]
# that's why i'm using a beta distribution
alpha <- beta(2, 2)

# and this is the likelihood
alpha <- (muCc - muCb2)/(muCb1 - muCb2)

m <- model(alpha, muCc, muCb1, muCb2)
plot(m)

draws <- greta::mcmc(m, n_samples = 10000,
                 n_cores = 16, chains = 16,
                 thin = 10)

summary(draws)

These are the results with vague priors:

Iterations = 1:9991
Thinning interval = 10 
Number of chains = 16 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
plus standard error of the mean:

     Mean     SD Naive SE Time-series SE
alpha  0.9216 0.9872 0.007804       0.007874
muCc  -2.6155 1.1673 0.009229       0.010678
muCb1 -1.9580 1.0662 0.008429       0.009692
muCb2 -9.4207 0.4181 0.003306       0.004321

2. Quantiles for each variable:

    2.5%     25%     50%    75%   97.5%
alpha  0.561  0.7826  0.9126  1.059  1.4100
muCc  -4.928 -3.3964 -2.6030 -1.831 -0.3600
muCb1 -4.022 -2.6833 -1.9576 -1.229  0.1067
muCb2 -9.910 -9.6425 -9.4775 -9.283 -8.6525

As you see, although I’m using a beta distribution on alpha, the posterior estimates are over the upper limit, but the calculation with the means of raw data is ~ 0.5 :

(mean_alpha <- (mean(dCc) - mean(dCb2))/(mean(dCb1) - mean(dCb2)))
[1] 0.5004578

Then I changed the priors of the observed data distributions and the alpha estimation changed to expected values, but still I’m wondering how can I effectively constrain a variable (in this case alpha) to be exactly between the limits my model expect. This is the final working code:

#informative priors
muCb1 <- normal(mean(dCb1), 1)
sdCb1 <- cauchy(0, 3, truncation = c(0, Inf))
muCb2 <- normal(mean(dCb2), 1)
sdCb2 <- cauchy(0, 3, truncation = c(0, Inf))

muCc <- normal(mean(dCc), 1)
sdCc <- cauchy(0, 3, truncation = c(0, Inf))

distribution(dCc) <- normal(muCc, sdCc)
distribution(dCb1) <- normal(muCb1, sdCb1)
distribution(dCb2) <- normal(muCb2, sdCb2)

alpha <- beta(2, 2)

alpha <- (muCc - muCb2)/(muCb1 - muCb2)

m <- model(alpha, muCc, muCb1, muCb2)
plot(m)

draws <- greta::mcmc(m, n_samples = 10000,
                 n_cores = 16, chains = 16,
                 thin = 10)

summary(draws)

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

      Mean      SD  Naive SE Time-series SE
alpha   0.5176 0.05511 0.0004356      0.0006447
muCc  -12.7067 0.26947 0.0021303      0.0033346
muCb1 -15.2635 0.16345 0.0012922      0.0014247
muCb2  -9.9640 0.15197 0.0012014      0.0013312

2. Quantiles for each variable:

      2.5%      25%      50%      75%    97.5%
alpha   0.4085   0.4812   0.5176   0.5541   0.6256
muCc  -13.2329 -12.8881 -12.7072 -12.5285 -12.1718
muCb1 -15.5874 -15.3714 -15.2645 -15.1545 -14.9456
muCb2 -10.2641 -10.0656  -9.9639  -9.8650  -9.6627

And now alpha posterior is around 0.5 (as expected) but not sure if this is because the informative priors or given the limits of beta distribution are actually constraining this variable. Should I do something to constrain properly a variable? Perhaps I’m missing something?

Thank you,

Claudio

njtierney · June 7, 2022, 9:30am

Hi Claudio,

Thanks so much for posting this, I’ll get back to you as soon as I can, just need some time to think about this.

njtierney · June 9, 2022, 7:40am

Hi @clquezada!

Thanks again for your question,

Do you have a specific part of the BUGS/JAGS code that you are trying to replicate?

At the moment, the main issue is that you are overriding alpha, with:

alpha <- beta(2, 2)

alpha <- (muCc - muCb2)/(muCb1 - muCb2)

As this will make alpha be (muCc - muCb2)/(muCb1 - muCb2), and removes the previous line, in a similar way to

x <- 1
x <- "one"

the variable x will contain “one”, and not 1

If you can point us to a specific part of the modelling code in BUGS/JAGS that you’re wanting to replicate - is it something around here? https://github.com/clquezada/tRophicPosition/blob/9516b237a56b3f6abd14f04295d229da82d825a2/R/jagsTwoBaselines.R#L289

Cheers!

clquezada · June 10, 2022, 11:53pm

Hi Nick

Thank you very much for taking the time to look inside the code, much appreciated.

Yes, you are right regarding to what I want to do, but I didn’t explain it correctly. I want alpha to have a beta distribution (thus limited between 0 and 1) and the calculation of it, to be

 alpha <- (muCc - muCb2)/(muCb1 - muCb2)

All the variables (muCc, muCb1 and muCb2) are normal mean estimations based on actual data.

I tried to define the distribution of alpha using

distribution(alpha) <- beta(2, 2)

But it returns an error: “Error: distributions can only be assigned to data <greta array’>s”. Although it is actually a greta array.

Any comment much appreciated, thanks.

Cheers

Claudio

njtierney · June 13, 2022, 2:47am

Hi Claudio,

Do you have examples of this kind of constraint and calculation in JAGS/BUGS/stan or a paper?