Alternating minimisation of variable parameters with opt

vitaliikl · August 4, 2019, 5:17pm

Hi Nick and others,

I would like to use alternating minimisation of parameters, as in this PCA example or this Stack Overflow discussion.
There, people suggest specifying the list of variables to update via var_list: train_W = optimizer.minimize(loss, var_list=[W]). I cannot find where greta::opt executes this operation, though. I suppose one can easily add the extra sess.run(train_W) steps in object$run_minimiser as several calls to self$model$dag$tf_sess_run.
The user interface for specifying alternating minimisation could be an extra list argument to opt(), where each element says which greta variables should be updated during the corresponding step of the alternation.
Would it be straightforward to implement something like this, and where in opt() might I add the equivalent of train_W = optimizer.minimize(loss, var_list=[W])?
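
For reference, here is a minimal sketch of the alternating scheme in plain Python. It does not use greta or TensorFlow: the gradients are written out by hand, and each step function plays the role that a minimize(loss, var_list=[...]) op would play. The data X, the names step_u and step_v, the learning rate and the iteration count are all made up for illustration.

```python
# Alternating minimisation of a rank-1 factorisation loss:
#   loss(u, v) = sum_ij (X[i][j] - u[i] * v[j]) ** 2
# Each step updates one block of parameters while the other is held fixed.

# Toy data, exactly rank 1: outer product of [2, 1, 3] and [1, 2].
X = [[2.0, 4.0], [1.0, 2.0], [3.0, 6.0]]

def loss(u, v):
    return sum((X[i][j] - u[i] * v[j]) ** 2
               for i in range(len(u)) for j in range(len(v)))

def step_u(u, v, lr):
    # Gradient step on u only; v is treated as a constant.
    grad = [sum(-2.0 * (X[i][j] - u[i] * v[j]) * v[j] for j in range(len(v)))
            for i in range(len(u))]
    return [ui - lr * g for ui, g in zip(u, grad)]

def step_v(u, v, lr):
    # Gradient step on v only; u is treated as a constant.
    grad = [sum(-2.0 * (X[i][j] - u[i] * v[j]) * u[i] for i in range(len(u)))
            for j in range(len(v))]
    return [vj - lr * g for vj, g in zip(v, grad)]

u = [1.0, 1.0, 1.0]
v = [1.0, 1.0]
initial_loss = loss(u, v)

for _ in range(500):
    u = step_u(u, v, lr=0.01)  # update u with v fixed
    v = step_v(u, v, lr=0.01)  # update v with the new u fixed

final_loss = loss(u, v)
print(initial_loss, final_loss)
```

In the proposed opt() interface, each element of the list argument would correspond to one of these per-block update steps.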

Thanks,
Vitalii

nick · August 6, 2019, 7:51am

Interesting. That would be a handy feature! Might be pretty difficult to implement though.

That step is done in a method of the tf_optimiser class for the native TensorFlow optimisers, and in the scipy_optimiser class for the other optimisers (though it’s not clear to me what will happen to the interface to these optimisers in TF 2.0).

The way greta works, it won’t be trivial to set this up. greta represents the parameters for every model with a single, unconstrained vector variable. That vector is then split up into vector-versions of the component variables in the model, then transformed and reshaped to apply the necessary constraints. Then those parameters (which are tensors resulting from ops, rather than variable tensors) are used in the likelihood.
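
As a rough illustration of that layout, here is a plain-Python sketch. It is not greta's actual code: the LAYOUT table, the unpack function and the example parameters mu and sigma are hypothetical, and the reshaping step is left out to keep it short.

```python
import math

# Hypothetical model with two parameters, stored back to back in one
# flat, unconstrained "free state" vector:
#   mu    : unconstrained vector of length 2 (identity transform)
#   sigma : positive scalar (exp transform enforces the constraint)
LAYOUT = [
    ("mu", 2, lambda xs: list(xs)),
    ("sigma", 1, lambda xs: [math.exp(x) for x in xs]),
]

def unpack(free_state):
    """Split the flat free state and apply each parameter's transform."""
    params = {}
    start = 0
    for name, length, transform in LAYOUT:
        chunk = free_state[start:start + length]
        params[name] = transform(chunk)
        start += length
    return params

free_state = [0.5, -1.0, 0.0]
params = unpack(free_state)
print(params)
```

The only thing an optimiser sees here is free_state; mu and sigma are computed from it. That is why a var_list-style selection of individual parameters has nothing to attach to.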

To get this to work, greta would have to define the TensorFlow graph differently from how it does now. That new approach would only work for optimisation with alternating minimisation; the graph would have to stay as it is (with the single free state) for most other things, like HMC.

I think this might be one of those things that falls just on the wrong side of greta’s trade-off between simplicity of the high-level R interface, and flexibility of the underlying TensorFlow code. In other words, it would almost certainly be easier to write the model directly in TensorFlow if you really need to do this!