@Vfeagins Hi Victor, the way you have defined your model is incorrect. When you define a Bayesian model in greta, you have to tell model() which quantities to sample.
Here your model parameters are theta[1] (your intercept), theta[2] (your beta coefficient), and X, your independent variable.
If you examine and read the WinBUGS model notes for Air carefully, you will see that Z is the covariate representing the NO2 concentration in the bedrooms. In other words, Z is the measurement of the independent variable X, observed with measurement error. That is the reason why you specify
mu <- alpha + beta*Z
and, in turn, you specify X as
X <- normal(mu, sigma)
In this manner, X has now become your stochastic independent variable.
You are modeling the probability that a child has a respiratory illness using
p <- ilogit(theta[1] + theta[2] * X)
So your model statement has to be:
mod <- model(theta[1], theta[2], X)
If you do not pass model() the parameters and the stochastic covariates (the X variables), greta will not know which quantities to sample.
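The same rule holds for any greta model: only the greta arrays you pass to model() are tracked in the posterior draws. A minimal sketch with hypothetical variables a and b (not part of the Air example) to illustrate:

```r
library(greta)

a <- normal(0, 1)   # a latent quantity
b <- normal(0, 1)   # another latent quantity

m <- model(a, b)    # both a and b will appear in the mcmc() draws;
                    # anything left out of model() is not sampled for you
```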
So your complete, corrected code is the following:
library(reticulate)
library(greta)
y <- c(21, 20, 15)
n <- c(48, 34, 21)
Z <- c(10, 30, 50)
alpha <- 4.48
beta <- 0.76
sigma2 <- 81.14
sigma <- sqrt(sigma2)
tau <- 1 / sigma2
J <- 3
theta <- normal(0, 32, dim = 2) # vague prior; sd = 32 roughly matches the WinBUGS precision of 0.001
mu <- alpha + beta * Z # this is deterministic
X <- normal(mu, sigma) # with this you make X a random variable
p <- ilogit(theta[1] + theta[2] * X) # this is your Logistic regression
distribution(y) <- binomial(n, p) # this defines the Y, your original dependent variable
mod <- model(theta[1], theta[2], X)
draws <- mcmc(mod, n_samples = 1000)
summary(draws)
# to get a summary table of the parameters and compare to Winbugs output
# the results you get are very similar to the results in the Winbugs output
library(bayesplot)
mcmc_trace(draws)
# to visually inspect the posterior of the parameters
# to visually inspect how well the chains mixed
# the trace of each variable / parameter should look like a caterpillar
If you run this correctly, your output should look similar to what I got below (it will not match exactly, as this is a Bayesian model):
Iterations = 1:1000
Thinning interval = 1
Number of chains = 4
Sample size per chain = 1000
1. Empirical mean and standard deviation for each variable,
plus standard error of the mean:
Mean SD Naive SE Time-series SE
theta[1] -1.04189 1.16814 0.0184699 0.236719
theta[2] 0.04977 0.03967 0.0006273 0.006324
X[1,1] 13.04112 8.90381 0.1407816 1.736653
X[2,1] 27.53257 7.38022 0.1166915 0.447116
X[3,1] 41.11623 8.56493 0.1354234 0.313373
2. Quantiles for each variable:
2.5% 25% 50% 75% 97.5%
theta[1] -4.728303 -1.35201 -0.71607 -0.31715 0.2348
theta[2] 0.004259 0.02476 0.03896 0.06148 0.1701
X[1,1] -6.672637 7.14481 14.51069 19.26854 28.3531
X[2,1] 13.112142 22.70398 27.43145 32.35168 42.4910
All the best with your Bayesian adventure with greta.