Hi,
(1) According to the greta examples webpage, variable puts an unbounded uniform distribution on the parameter. If so, does a constrained one put a bounded uniform distribution on the parameter? In other words, can we use variable(lower = a, upper = b) and uniform(min = a, max = b) interchangeably?
(2) According to the examples for the inference functions, if we use variable to set flat priors, and use adjust = FALSE in opt to skip the Jacobian adjustments used in MAP estimation, then we can get the true MLE. I understand that Jacobian adjustments are needed for a PDF when its variable is transformed, e.g. if y \rightarrow g(y) then g'(y) is needed for f[g(y)]. But I'm not clear what the adjustments refer to in this situation in opt, such that they lead to different results. Thanks in advance.
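To spell out the identity I have in mind (just the standard change of variables): if z = g(y) has density f, then the density of y is

f_Y(y) = f\big(g(y)\big)\,\lvert g'(y) \rvert

and the extra factor \lvert g'(y) \rvert is the Jacobian adjustment.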
Usage of variable and opt
Hi CYang,
Thanks for your questions!
(1) variable(lower = a, upper = b) and uniform(min = a, max = b) can be used interchangeably. The two sets of parameters in the following code will be the same.
library(greta)

# simulate data once so both models see the same observations
x_data <- rnorm(10)

# flat priors via bounded variables
x <- as_data(x_data)
mu <- variable(lower = -1, upper = 1)
variance <- variable(lower = 0, upper = 10)
distribution(x) <- normal(mu, sqrt(variance))
model1 <- model(mu, variance)
opt_res1 <- opt(model1)
opt_res1$par

# the same model with explicit uniform priors
x <- as_data(x_data)
mu <- uniform(min = -1, max = 1)
variance <- uniform(min = 0, max = 10)
distribution(x) <- normal(mu, sqrt(variance))
model2 <- model(mu, variance)
opt_res2 <- opt(model2)
opt_res2$par # same as opt_res1$par
In terms of readability, though, I would go with uniform(min = a, max = b), as it fits more naturally with the idea of placing prior distributions on parameters.
(2) Here what adjust = TRUE is doing is making the optimiser use the adjusted log probability from the dag. In this context the adjustment comes from the bounds placed on parameters (e.g. in variable(lower = 0)). Using adjust = FALSE ignores these bounds, essentially turning variable(lower = 0) into variable() in the model. In code we can see this in the fact that the two sets of parameters in the following snippet will be the same:
# bounded variance, but skip the Jacobian adjustment
x <- as_data(x_data)
mu <- variable()
variance <- variable(lower = 0)
distribution(x) <- normal(mu, sqrt(variance))
model3 <- model(mu, variance)
opt_res3 <- opt(model3, adjust = FALSE)
opt_res3$par

# unbounded variance with the adjustment left on
x <- as_data(x_data)
mu <- variable()
variance <- variable()
distribution(x) <- normal(mu, sqrt(variance))
model4 <- model(mu, variance)
opt_res4 <- opt(model4, adjust = TRUE) # adjust = TRUE by default
opt_res4$par # same as opt_res3$par
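To sketch where the adjustment comes from (assuming, for illustration, that greta maps a lower-bounded variable to an unconstrained one with a log transform, as Stan does; the exact bijector is an internal detail): writing variance = e^\phi, the adjusted log probability the optimiser works with is

\log p_{\text{adj}}(\phi) = \log p(x \mid e^{\phi}) + \underbrace{\phi}_{\log\lvert d e^{\phi} / d\phi \rvert}

and adjust = FALSE drops the trailing \phi term, leaving the plain log likelihood in the variance.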
But what is the connection to the MLE? The MLE and the MAP estimate (which opt finds) are only equal if improper flat priors are placed on all parameters (see https://wiseodd.github.io/techblog/2017/01/01/mle-vs-map/ for details). Nick used adjust = FALSE in the documentation to find the MLE, as it essentially forces all priors to be flat.
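As a quick sanity check (a sketch reusing x_data and opt_res3 from the snippets above): for this normal model the MLE has a closed form, the sample mean and the 1/n variance, so opt with flat priors and adjust = FALSE should recover it.

# closed-form MLE of the normal model: sample mean and 1/n variance
c(mean(x_data), mean((x_data - mean(x_data))^2))
# should closely match opt_res3$par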
Hope that helps - let me know if you have any more questions.
All the best!
Jeffrey
Thank you so much for the clarification. This brought another question to my mind: to my understanding, greta does not allow an assignment to a transformed variable the way BUGS and Stan do; rather, it is realised via an inverse transformation. Does this mean that greta users in general do not need to worry about Jacobian adjustments such as must be manually provided in Stan?
Hi Cyang,
Unfortunately I’m not quite sure what you’re getting at. Maybe an example would help?
In general greta has a higher-level interface than Stan and BUGS, so details like Jacobian adjustments aren't as exposed to the user.
All the best!
Well, e.g., a logit-normal distribution can be obtained by replacing x in a log-normal distribution with x/(1-x), adjusted by 1/(1-x)^2. greta has a built-in distribution for the log-normal, but not yet for the logit-normal. In Stan it can be realised by writing x/(1-x) ~ lognormal(mu, sigma) first and then adding the log-Jacobian via target += -2*log(1-x). In greta, if we assign y <- lognormal(meanlog, sdlog) first and then the inverse transformation x <- y/(1+y), will the sample of x be logit-normal?
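To spell out where that adjustment comes from:

y = \frac{x}{1-x}, \qquad \frac{dy}{dx} = \frac{1}{(1-x)^2}, \qquad \log\left|\frac{dy}{dx}\right| = -2\log(1-x)

which is exactly the target increment above.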
To the best of my knowledge greta will not account for that sort of transform but that may well be incorrect - this is not a part of greta I really understand fully. Sorry about that!
I’m sure @nick will be able to give you a better answer.
All the best!
Yup, as you say, greta has a slightly higher-level interface than Stan, so the user API doesn't let users increment the joint log density (target, in Stan) of the model directly.
That’s a deliberate design decision because a) it’s easy to mess up the whole log-Jacobian adjustments thing, and b) I want to motivate people to develop extensions packages for greta, so that new functionality like that can be written once, well, and distributed to other users. The downside of this approach is that it’s a little difficult to just go off-piste when hacking around on a single model.
So yes, in greta you can't do that log-normal -> logit-normal trick because you can't adjust the log-Jacobian; you'd have to define a new probability distribution using code like this and this. Though unless I've missed something, in practice you can achieve the same behaviour with: x <- ilogit(normal(mean, sd)).
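For instance (a minimal sketch; the parameter values are arbitrary): since logit(x) = log(x/(1-x)), saying x/(1-x) is log-normal is the same as saying logit(x) is normal, which is exactly what ilogit of a normal variable gives you.

library(greta)

# logit(x) ~ normal(0, 1), i.e. x is standard logit-normal
z <- normal(0, 1)
x <- ilogit(z)

m <- model(x)
draws <- mcmc(m, n_samples = 1000)
# the sampled values of x follow a logit-normal distribution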
For readers of this post who don't want to do anything funky and are worried about log-Jacobian adjustments: the adjustments issue only comes up when the inference algorithm (the optimiser or sampler) is acting on a parameter, but there are transformations applied to that parameter before the density is defined on it. Like Stan, greta shows unconstrained parameters to the optimiser/sampler and makes those adjustments so that the joint density is correct. Because greta doesn't allow you to adjust the joint density, or define densities on variables or operations (you can only define distributions on data), you can't get into trouble applying transformations to variables in greta.
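A minimal sketch of that last point (a hypothetical model with arbitrary values): the density is attached to the variable itself, and anything derived from it is just an operation, so no Jacobian bookkeeping ever falls to the user.

library(greta)

y_obs <- rnorm(5)

# the density is defined directly on sigma
sigma <- lognormal(0, 1)

# tau is a transformation (an operation) of sigma; no density is
# attached to it, so no Jacobian adjustment is ever needed
tau <- 1 / sigma^2

distribution(y_obs) <- normal(0, 1 / sqrt(tau))
m <- model(sigma, tau)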