Installing greta - new script to overcome reticulate's auto_configure

Here is some R code that seems to install greta reliably on both Mac and PC. If you encounter any issues, please comment here. I have lots of experience getting installs working, so I am happy to help. Here is the script (run it interactively and slowly, one line at a time):

## INSTALLATION SCRIPT TO GET GRETA, CAUSACT,
## and TENSORFLOW ALL WORKING TOGETHER HAPPILY

## NOTE:  Run each line one at a time using CTRL+ENTER.
##        Await completion of one line
##        before running the next.
##        If prompted to "Restart R", say YES.

#### STEP 0:  Restart R in a Clean Session
#### use RStudio menu:  SESSION -> RESTART R

#### STEP 1: INSTALL PACKAGES WITH PYTHON DEPENDENCIES
install.packages("reticulate",dependencies = TRUE)
install.packages("greta",dependencies = TRUE)
install.packages("causact",dependencies = TRUE)

#### STEP 2: INSTALL & UPDATE MINICONDA SO R CAN FIND PYTHON 
## install miniconda in default location if possible
condaInstall = try(reticulate::install_miniconda())
condaPath = try(reticulate::miniconda_path())
## if ERROR is due to a previous installation, then ignore the error.
## if install fails due to a space in your path, then uncomment
## the below two lines and run them.  
## condaPath = file.path("/", "miniconda")
## reticulate::install_miniconda(path = condaPath,force = TRUE)

#### STEP 3: Add environment variable so that 
#### reticulate does not attempt to automatically configure
#### a python environment for you
rEnvPath = file.path("~", ".Renviron")
envLines = c()  ## init blank lines
if (file.exists(rEnvPath)) {
  envLines = readLines(rEnvPath)  ## get existing .Renviron lines
}
## add new line to bottom of file (skip if the exact line is already there)
newLine = 'RETICULATE_AUTOCONFIGURE = "FALSE"'
if (!(newLine %in% envLines)) {
  envLines = c(envLines, newLine)
  writeLines(envLines, rEnvPath)
}
## also set line for current session
Sys.setenv(RETICULATE_AUTOCONFIGURE = "FALSE")

#### STEP 4:  Update "r-reticulate" CONDA ENVIRONMENT
####          FOR TENSORFLOW
## Install the specific versions of modules
## for the TensorFlow installation via CONDA.
## these next lines may take a few minutes to execute
reticulate::conda_remove("r-reticulate")  #start clean
reticulate::py_config()  # initiate basic r-reticulate config -- ignore any error here
## install other packages and downgrade numpy
reticulate::conda_install(envname = "r-reticulate",
                          packages =
                            c(
                              "python=3.7",
                              "tensorflow=1.14",
                              "pyyaml",
                              "requests",
                              "Pillow",
                              "pip",
                              "numpy=1.16",
                              "h5py=2.8",
                              "tensorflow-probability=0.7"
                            ))

#### STEP 5:  TEST THE INSTALLATION - must restart R
##  **** USE MENU:   SESSION -> RESTART R
library(greta)  ## should work now if you restarted R.. takes a minute
library(causact)
graph = dag_create() %>%
  dag_node("Normal RV",
           rhs = normal(0,10))
graph %>% dag_render()  ## see oval
drawsDF = graph %>% dag_greta() ## see "running X chains..."
drawsDF %>% dagp_plot(densityPlot = TRUE)  ## see plot

#### STEP 6:  RESTART R AND TRY STEP 5 AGAIN JUST TO ENSURE
#### ALL IS WELL
#### USE MENU:  SESSION -> RESTART R
#### CONGRATS IF IT WORKS.

Thanks so much for this, I can confirm that this works on my 2016 15 Inch Macbook Pro.

One suggestion for editing the .Renviron: you can use the usethis R package to open the file for you, rather than reading and writing it from code.

library(usethis)
edit_r_environ()
# add this (without the comment) to the last line:
# RETICULATE_AUTOCONFIGURE="FALSE"

Cheers!

Thx for pointing out usethis. I always see it talked about and never quite know when to use it.

For the above use case though, I find users are not always that good at reading and following the instructions I have commented out; adding a line and saving a file might get missed. The comment about restarting the R session is often overlooked :-). Completely automating the .Renviron edit removes a point of failure due to speed-reading users.

I understand what you mean - in this case users can skim through and miss important steps. I’m hopeful now that with the changes in https://github.com/greta-dev/greta/pull/359, this should help resolve some installation issues. If you want to try it out, you could try:

remotes::install_github("greta-dev/greta#359")

Regarding the .Renviron, I think I’ll make a special function that copies the necessary line so that users can paste it into .Renviron themselves. My reasoning is that although I trust the code you have written, my personal preference is that editing the .Renviron should be an inherently interactive task, as the file can contain very precious keys that might be lost if they are written over when something goes awry.
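
A possible middle ground between the two approaches is sketched below: the edit stays automated, but the existing file is backed up first and the line is only appended if it is missing. This is an illustration only. The function name `add_renviron_line()` and the `.bak` suffix are hypothetical choices, not part of greta, reticulate, or usethis, and the demo writes to a temporary file rather than the real .Renviron.

```r
## Sketch only: append a line to an .Renviron-style file without
## clobbering what is already there. `add_renviron_line` is a
## hypothetical helper, not a function from any package.
add_renviron_line = function(newLine, path = file.path("~", ".Renviron")) {
  path = path.expand(path)
  envLines = character(0)
  if (file.exists(path)) {
    envLines = readLines(path, warn = FALSE)
    ## keep a copy of the current file in case something goes awry
    file.copy(path, paste0(path, ".bak"), overwrite = TRUE)
  }
  if (newLine %in% envLines) {
    return(invisible(FALSE))  ## already present, nothing written
  }
  writeLines(c(envLines, newLine), path)
  invisible(TRUE)  ## line was added
}

## Try it on a throwaway file rather than the real .Renviron
demoPath = tempfile(fileext = ".Renviron")
writeLines("MY_SECRET_KEY=abc123", demoPath)
add_renviron_line('RETICULATE_AUTOCONFIGURE = "FALSE"', demoPath)
readLines(demoPath)  ## the existing key is kept, the new line follows it
```

Running the function a second time with the same line leaves the file unchanged, so a user who reruns the script does not accumulate duplicates.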

Thanks for sharing! I just started with greta and I am having problems getting it to run properly. Are there instructions similar to these that do not rely on conda installation of python? I am using homebrew on my Mac and would prefer just using python3 and pip3 that come with it, rather than having to install another package manager to just run greta.

The error I encounter is when I draw from the posterior:

I get a “seed” error with the cran version of greta and I get the following error with the dev version of greta:

Error in py_call_impl(callable, dots$args, dots$keywords) :
ValueError: Expected integer, got dtype <class 'float'>.

Thanks for your help!

FYI

tensorflow::tf_config()
reticulate::py_module_available("tensorflow")
reticulate::py_module_available("tensorflow_probability")

TensorFlow v2.5.0 (/usr/local/lib/python3.9/site-packages/tensorflow)
Python v3.9 (/usr/local/bin/python3)
[1] TRUE
[1] TRUE

There are three challenges you will need to overcome if you do not want to use the above install script.

  1. Set up a virtual environment with the packages and exact versions specified here:

I see that your reticulate is finding TensorFlow 2.5 and Python 3.9. So either the virtual environment is incorrect or reticulate is not finding the environment you want to use. See here for using virtualenv instead of conda: https://rstudio.github.io/reticulate/articles/python_packages.html

  2. Shut off reticulate's auto configure (pasted from above)
#### STEP 3: Add environment variable so that 
#### reticulate does not attempt to automatically configure
#### a python environment for you
rEnvPath = file.path("~", ".Renviron")
envLines = c()  ## init blank lines
if (file.exists(rEnvPath)) {
  envLines = readLines(rEnvPath)  ## get existing .Renviron lines
}
## add new line to bottom of file (skip if the exact line is already there)
newLine = 'RETICULATE_AUTOCONFIGURE = "FALSE"'
if (!(newLine %in% envLines)) {
  envLines = c(envLines, newLine)
  writeLines(envLines, rEnvPath)
}
## also set line for current session
Sys.setenv(RETICULATE_AUTOCONFIGURE = "FALSE")
  3. Get reticulate to point to the environment you just set up. See the reticulate documentation for how to customize this: https://rstudio.github.io/reticulate/articles/versions.html

Last tip: Restart R with every attempt to get this working. If reticulate binds to a Python version, it will not unbind itself until you restart.
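
The kind of version mismatch described above can be checked from R before loading greta. The following is a minimal sketch: `versions_match()` is a hypothetical helper, it assumes plain numeric version strings such as "1.14.0", and the required versions are the ones from the install script earlier in this thread. The commented lines at the end assume that reticulate can import `tensorflow` in your session.

```r
## Sketch only: does a found version string start with the
## major.minor version that is required? Hypothetical helper that
## assumes purely numeric versions like "1.14.0".
version_parts = function(v) {
  as.integer(strsplit(v, ".", fixed = TRUE)[[1]])
}

versions_match = function(found, required) {
  f = version_parts(found)
  r = version_parts(required)
  ## compare only as many components as `required` specifies
  length(f) >= length(r) && all(f[seq_along(r)] == r)
}

versions_match("1.14.0", "1.14")  ## TRUE
versions_match("2.5.0", "1.14")   ## FALSE

## After restarting R, assuming reticulate finds your environment:
## tf = reticulate::import("tensorflow")
## versions_match(tf$`__version__`, "1.14")
## versions_match(reticulate::py_config()$version, "3.7")
```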

Hope this helps. I am not familiar enough with Homebrew to give more explicit guidance.

Thanks! This definitely helps… if I get it to work I will post here what I was able to figure out.

I got it to work by:

  1. brew install python@3.7
  2. pip3 install virtualenv
  3. cd ~/.virtualenvs
  4. virtualenv -p /usr/local/opt/python@3.7/bin/python3 greta
  5. source greta/bin/activate
  6. pip3 install <specific versions of required packages as noted in the reply above>
  7. deactivate
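
For readers who would rather stay inside R, steps 2 to 7 above could probably be approximated with reticulate's virtualenv helpers. This is an untested sketch, not what was actually run in this thread: `make_greta_virtualenv()` is a hypothetical wrapper, the Python path is the Homebrew location from step 4, and the patch-level pins for numpy and h5py are assumptions (the install script only specifies 1.16 and 2.8).

```r
## Sketch only: an R-side approximation of the shell steps above.
## The pins follow the versions in the install script; the patch
## levels for numpy and h5py are assumed.
gretaPackages = c(
  "tensorflow==1.14.0",
  "tensorflow-probability==0.7.0",
  "numpy==1.16.4",
  "h5py==2.8.0"
)

## Hypothetical wrapper around two reticulate functions
make_greta_virtualenv = function(
    envname = "greta",
    python = "/usr/local/opt/python@3.7/bin/python3") {
  ## create the environment with the Homebrew Python 3.7
  reticulate::virtualenv_create(envname, python = python)
  ## install the pinned packages into it with pip
  reticulate::virtualenv_install(envname, packages = gretaPackages)
}

## Not run automatically; call it yourself once Python 3.7 is installed:
## make_greta_virtualenv()
```

Restart R after creating the environment so that reticulate is not still bound to a different Python.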

Then, in my R script:

library(reticulate)

use_virtualenv("greta", required = TRUE)

library(greta)
library(bayesplot)

observed <- as_data

Hi @ajf

How should I modify the above script if I have an existing Anaconda installation? I created a separate conda environment for TensorFlow to be used by greta, but could not get reticulate to connect to it, so I could not run the basic Iris regression example. I keep getting the error below. (I have installed reticulate, greta, tensorflow, causact, and DiagrammeR in R, as well as Python 3.7 and the other relevant packages in the new gretf environment I created in Anaconda.)

> library(greta)
> data(iris)
> # data
> x <- as_data(iris$Petal.Length)
Error: 

This version of greta requires TensorFlow v1.14.0 and TensorFlow Probability v0.7.0, but TensorFlow Probability isn't installed. To install the correct versions do:

  install_tensorflow(
    method = "conda",
    version = "1.14.0",
    extra_packages = "tensorflow-probability"
  )

Thanks for your help - Sree

Session -> Restart R. (most reticulate functions will bind the session to a python environment. This can only be changed by restarting.)

then:

Sys.setenv(
  RETICULATE_PYTHON = "/Users/lernerguest/Library/r-miniconda/envs/r-tensorflow/bin")
library("greta")

where the path above is replaced by the folder location of your Python executable in your new environment. You can run reticulate::py_config() to verify that reticulate is using the right environment.

If it works, you will need to run the Sys.setenv... line prior to running library(greta) every time or set this in your .Rprofile or .Renviron (preferred) file.

It is easier to just configure the default r-reticulate environment if greta is the only reason you access Python from R.

@ajf - thank you very much for the quick reply. I’m on Windows 10 using Anaconda. So would my RETICULATE_PYTHON be as follows? (My env name is gretf.)

Sys.setenv(
  RETICULATE_PYTHON = "C:/ProgramData/Anaconda3/envs/gretf/Lib/site-packages/r-tensorflow/bin")

In addition, I have finished reading your book and I quite like it. I really enjoyed the theme of the book serving as a guide to set up a Bayesian workflow. I have recommended your book to my colleagues who are actively working in the modeling space.

Sree

On Windows, type conda env list at the anaconda command prompt. This will give you the path to the conda environment. Within that directory or one of its subfolders (e.g. bin), there is a python executable. It is the path of the FOLDER that contains the python executable that you want to set as the environment variable. I doubt that your path includes r-tensorflow in it. Anywhere you see me write r-tensorflow, your equivalent would be gretf.
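
The same lookup can be sketched from within R, since `reticulate::conda_list()` returns a data frame with the name of each conda environment and the path to its Python executable. The helper `env_python_dir()` below is hypothetical, and the table is made up for illustration so that the example runs without conda; on a real machine you would pass the result of `reticulate::conda_list()` instead.

```r
## Sketch only: given a table of conda environments shaped like the
## data frame from reticulate::conda_list() (columns `name` and
## `python`), return the FOLDER holding that environment's Python
## executable. `env_python_dir` is a hypothetical helper.
env_python_dir = function(envs, envname) {
  hit = envs$python[envs$name == envname]
  if (length(hit) != 1) {
    stop("expected exactly one environment named '", envname, "'")
  }
  dirname(hit)
}

## Made-up table for illustration; on a real machine use
## envs = reticulate::conda_list()
envs = data.frame(
  name = c("base", "gretf"),
  python = c("C:/ProgramData/Anaconda3/python.exe",
             "C:/ProgramData/Anaconda3/envs/gretf/python.exe"),
  stringsAsFactors = FALSE
)
env_python_dir(envs, "gretf")

## then, in a fresh session and before library(greta):
## Sys.setenv(RETICULATE_PYTHON = env_python_dir(envs, "gretf"))
```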

For your specific problem, you can see more of a write-up about it here (section 15.5.2):

Thanks for the kind words about the book and for recommending it to your colleagues. :pray: I am trying to raise awareness of the book, as it is now battle-tested in the classroom and very well received.

@ajf thanks again for the detailed note. Very few books that I have seen set up a consistent method as yours did with R, tidyverse, causact, and greta. I have been using R for 15 years and have been in analytic consulting and teaching for over 22 years. I did not realize you were the author of the book I had read and that you were active here. I learned about greta literally two days ago. I have been using WinBUGS, JAGS, Stan, and PyMC3 for the longest time. Your book is a wonderful introduction to R with analytical workflow as a key theme.

Hi Adam @ajf - success at last with my greta setup using Anaconda, reticulate, and TensorFlow. Thank you!!! Now I can begin my greta exploration in earnest.

The simple Iris regression example worked nicely. I’m going to write up a how-to-install-and-configure document and post it in the forum for Windows 10 and Anaconda users.
