Hmm… interesting … i might really wanna look into a scala version…
otoh… i wonder if greta uses some sort of symbolic differentiation…
also, thinking about discrete variables, and hmc
discrete variables can have a deterministic equation of motion associated with them such that the total energy stays unchanged, so that would be the non-sigmoid way of doing things
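roughly what i have in mind, as a toy sketch (the setup and all names are mine: a 1-d particle with unit mass, and a potential that steps up by deltaU at a boundary)… when the trajectory hits the jump, the momentum gets refracted or reflected so that H = U(x) + p*p/2 stays exactly constant:

```scala
// toy sketch: energy-conserving momentum update at a potential jump
// (assumption: 1-d particle, unit mass, potential steps up by deltaU at a boundary)
object JumpDynamics {
  // new momentum after hitting a step of height deltaU,
  // chosen so that H = U(x) + p*p/2 stays exactly the same
  def crossStep(p: Double, deltaU: Double): Double = {
    val kinetic = p * p / 2
    if (kinetic > deltaU)
      math.signum(p) * math.sqrt(2 * (kinetic - deltaU)) // refract: pay deltaU out of kinetic energy
    else
      -p // reflect: not enough energy to climb the step, bounce back
  }

  def main(args: Array[String]): Unit = {
    val p0 = 2.0
    val p1 = crossStep(p0, deltaU = 1.5)
    // total energy before: 0 + p0*p0/2 = 2.0; after: 1.5 + p1*p1/2 = 2.0
    println(s"p before = $p0, p after = $p1, energy after = ${1.5 + p1 * p1 / 2}")
  }
}
```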
i think this discrete variable problem is a very strange problem
neural networks have sigmoids to be able to learn: the smooth relaxation is what gradients can flow through
you say that states which are “in between” the discrete values are forbidden: you assign zero probability to them, you throw them away. as long as detailed balance holds between the discrete states, the markov chain will generate the right distribution, and since the equations of motion are symmetric, you can do this. but there might be some useful information if there is a linear or even non-linear transformation between the two discrete states. it's a different model for sure, but still, if the total energy of the system is not changing, then that might mean that you are extending your model “in the right way”: you include the only single extra assumption into your model that is allowed to be included
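to make the detailed balance point concrete, a toy sketch (not any real hmc library, everything here is made up): a metropolis update on a purely discrete variable with a symmetric proposal satisfies detailed balance, so interleaving it with the dynamics on the continuous coordinates still targets the right joint distribution:

```scala
import scala.util.Random

// toy sketch: metropolis update on a discrete variable k with target weights pi(k);
// acceptance min(1, pi(k')/pi(k)) under a symmetric proposal satisfies detailed
// balance, so the chain converges to pi on the discrete states
object DiscreteMetropolis {
  val rng = new Random(42)

  def step(k: Int, pi: Int => Double, numStates: Int): Int = {
    val proposal = rng.nextInt(numStates)   // symmetric (uniform) proposal
    val accept   = pi(proposal) / pi(k)     // metropolis ratio
    if (rng.nextDouble() < accept) proposal else k
  }

  def main(args: Array[String]): Unit = {
    val pi: Int => Double = Vector(0.1, 0.3, 0.6) // unnormalized weights are fine too
    val samples = Iterator.iterate(0)(step(_, pi, 3)).take(100000).toList
    // empirical frequencies should approach (0.1, 0.3, 0.6)
    (0 until 3).foreach(k => println(s"$k: ${samples.count(_ == k) / 100000.0}"))
  }
}
```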
i wonder why this energy-conserving discrete-jump technique is not already widely used in hmc?
is there a fundamental reason why?
i am also wondering about tf
it spends most of its time calculating derivatives, which might or might not be so easily parallelizable…
but for example, if i were to use monix, a distributed async reactive framework for scala, then i could distribute the calculations to 10000 nodes or so, according to the differentiation rules, and then combine them in some optimal fashion
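a minimal sketch of what i mean, assuming monix 3.x; partialDerivative and the sum-combine are made-up placeholders for whatever the differentiation rules would actually produce:

```scala
import monix.eval.Task
import monix.execution.Scheduler.Implicits.global
import scala.concurrent.duration._

// sketch: treat each partial derivative as an independent Task, fan them out
// across the scheduler (or a cluster sitting behind it), then combine the results
object ParallelGradients {
  def partialDerivative(i: Int): Task[Double] =
    Task { math.sin(i.toDouble) } // placeholder for one node's share of the work

  def main(args: Array[String]): Unit = {
    val partials: List[Task[Double]] = (0 until 10000).toList.map(partialDerivative)
    // parSequence runs the tasks in parallel and gathers the results in order
    val gradient: Task[List[Double]] = Task.parSequence(partials)
    val combined: Task[Double] = gradient.map(_.sum) // stand-in for "combine in some optimal fashion"
    println(combined.runSyncUnsafe(1.minute))
  }
}
```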
since the graph which describes the computation is first class, dynamic and type safe, i am really wondering why scala is not the choice for such calculations?
when that's pretty much the only place where a usable reactive, async, massively parallel solution can be implemented…
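a tiny sketch of such a first-class, type-safe computation graph with symbolic differentiation; the ADT and all names are mine, not from any library:

```scala
// sketch: the computation graph as a plain scala ADT, differentiated structurally
sealed trait Expr
case class Const(v: Double)      extends Expr
case object X                    extends Expr
case class Add(a: Expr, b: Expr) extends Expr
case class Mul(a: Expr, b: Expr) extends Expr

object Symbolic {
  // the usual differentiation rules as a structural recursion over the graph;
  // each recursive call is independent, which is what makes fanning them out plausible
  def d(e: Expr): Expr = e match {
    case Const(_)  => Const(0)
    case X         => Const(1)
    case Add(a, b) => Add(d(a), d(b))
    case Mul(a, b) => Add(Mul(d(a), b), Mul(a, d(b))) // product rule
  }

  def eval(e: Expr, x: Double): Double = e match {
    case Const(v)  => v
    case X         => x
    case Add(a, b) => eval(a, x) + eval(b, x)
    case Mul(a, b) => eval(a, x) * eval(b, x)
  }

  def main(args: Array[String]): Unit = {
    val f = Mul(X, Mul(X, X)) // f(x) = x^3
    println(eval(d(f), 2.0))  // d/dx x^3 = 3x^2, so 12.0 at x = 2
  }
}
```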
i am even wondering if it would be possible to outdo tf with a few lines of scala code, when it comes to scaling to massive datasets…
symbolic differentiation is pretty easy to handle with such reactive streams
caching is taken care of out of the box…
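for instance, with monix (assuming 3.x again), memoize gives you that caching almost for free… a sketch:

```scala
import monix.eval.Task
import monix.execution.Scheduler.Implicits.global
import scala.concurrent.duration._

// sketch: a shared sub-derivative computed once and reused by every consumer
object CachedSubgraph {
  val sharedDerivative: Task[Double] =
    Task { println("computing shared subexpression once"); 42.0 }.memoize

  def main(args: Array[String]): Unit = {
    // both consumers reuse the cached result; the println fires only once
    val total = Task.parZip2(sharedDerivative, sharedDerivative).map { case (a, b) => a + b }
    println(total.runSyncUnsafe(1.minute))
  }
}
```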
i mean… yes, for a few gpus, etc… custom-built c++ code might be enough, but what if you want to use 1 million cores?
how do you write code for that?
i could guess that in scala it's not more than a few k lines, but in c++, where you don't have reactive systems or first-class support for them… things are not composable… i would not stand a chance…
i don't know anything about this topic, but i have recently seen massive development in this area, and i have a gut feeling that even i might be able to write code that beats tf on one million cores…