Eventually, this document may be split into several parts dedicated to individual components, such as R, rApache and the TwoRavens application, particularly if the TwoRavens team creates an “official” distribution with its own installation manual.
yum install httpd
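Set SELinux to permissive mode so that it does not block httpd (note that this applies system-wide, not only to httpd, and lasts only until the next reboot):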
setenforce permissive
getenforce
HTTPS is strongly recommended (possibly required); a signed certificate is recommended.
yum install R R-devel
(The EPEL distribution is recommended; version 3.* is required, and 3.1.* is recommended as of this writing.)
For rApache, the rpm distribution from the HMDC systems group is recommended; download and install the latest available rpm (1.2.6 as of this writing):
http://mirror.hmdc.harvard.edu/HMDC-Public/RedHat-6/rapache-1.2.6-rpm0.x86_64.rpm
Install libcurl-devel (it provides /usr/bin/curl-config, which some third-party R packages need; package installation will fail silently if it’s not found!):
yum install libcurl-devel
Make sure you have the standard GNU compilers installed (needed for 3rd-party R packages to build themselves).
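Since a missing prerequisite can make package installation fail silently, it may be worth checking for the build tools before going further. The sketch below is only an illustration: the helper name missing_tools is made up, and the list of commands in the usage line (gcc, g++, gfortran, curl-config) is our assumption about what "standard GNU compilers" means on your system.

```shell
#!/bin/sh
# missing_tools: print the name of every command in the argument list that
# cannot be found on the PATH (hypothetical helper, not part of Dataverse).
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example (assumed tool list): report anything missing before building packages.
# missing_tools gcc g++ gfortran curl-config
```

If the function prints nothing, every listed command was found.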
R is used both directly by the Dataverse application and by the TwoRavens companion app.
Two distinct interfaces are used to access R: Dataverse uses Rserve, while TwoRavens sends jobs to R running under rApache through the Rook interface.
We provide a shell script, conf/R/r-setup.sh in the Dataverse source tree (https://github.com/IQSS/dataverse/conf/R/r-setup.sh), that will attempt to install the required third-party packages; it will also configure Rserve and the rserve user. You will need the other 3 files in that directory as well. rApache configuration is addressed in its own section.
The script will attempt to download the packages from CRAN (or a mirror) and GitHub, so the system must have access to the internet. On a server fully firewalled from the world, packages can be installed from downloaded sources. This is left as an exercise for the reader. Consult the script for insight.
After unpacking the TwoRavens distribution, rename the resulting directory to dataexplore. Place it in the web root directory of your Apache server, so that it is visible from the outside at
https://<rapache server>:<rapache port>/dataexplore
We’ll assume /var/www/html/dataexplore in the examples below.
A scripted, interactive installer is provided at the top level of the TwoRavens distribution (https://github.com/IQSS/TwoRavens/blob/master/install.pl). Run it as
./install.pl
The installer will ask you to provide the following:
Setting | Default | Comment |
---|---|---|
TwoRavens directory | /var/www/html/dataexplore | File directory where TwoRavens is installed. |
Apache config dir. | /etc/httpd | rApache config file for TwoRavens will be placed under conf.d/ there. |
Apache web dir. | /var/www/html | |
Apache host address | local hostname | |
Apache host port | 443 | |
Apache web protocol | https | http or https (https recommended) |
Dataverse URL | | URL of the Dataverse from which TwoRavens will receive metadata and data files. For example, https://thedata.harvard.edu. |
This should be it!
The steps needed to manually configure TwoRavens to run under rApache are explained below for reference; the install.pl script above performs them for you.
Edit the file /var/www/html/dataexplore/app_ddi.js. Find and edit the following 3 lines:
var production=false;
Change false to true.
hostname="localhost:8080";
Change this so that it points to the Dataverse application from which TwoRavens will obtain the metadata and data files (don’t forget to change 8080 to the correct port number!).
var rappURL = "http://0.0.0.0:8000/custom/";
Set this to the URL of your rApache server, i.e.
"https://<rapacheserver>:<rapacheport>/custom/";
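For those who prefer not to edit by hand, the three changes to app_ddi.js can be sketched with sed. This is an illustration under assumptions, not what install.pl actually does: the function name configure_app_ddi is made up, and the patterns assume that each setting sits on its own line in exactly the form quoted above.

```shell
#!/bin/sh
# configure_app_ddi FILE DATAVERSE_HOST RAPACHE_URL
# Hypothetical helper: rewrites the three settings quoted above in app_ddi.js.
# Assumes each setting is on its own line in the form shown in the text, and
# that the new values contain no '|', '&' or backslash characters (sed would
# misread them).
configure_app_ddi() {
    file=$1 dv_host=$2 rapp_url=$3
    tmp=$(mktemp) || return 1
    sed \
        -e 's|^\([[:space:]]*var production[[:space:]]*=[[:space:]]*\)false;|\1true;|' \
        -e 's|^\([[:space:]]*hostname[[:space:]]*=[[:space:]]*\)"[^"]*";|\1"'"$dv_host"'";|' \
        -e 's|^\([[:space:]]*var rappURL[[:space:]]*=[[:space:]]*\)"[^"]*";|\1"'"$rapp_url"'";|' \
        "$file" > "$tmp" &&
    cat "$tmp" > "$file"    # overwrite in place, keeping ownership and mode
    status=$?
    rm -f "$tmp"
    return $status
}

# Example (placeholder host names):
# configure_app_ddi /var/www/html/dataexplore/app_ddi.js \
#     dataverse.example.edu:443 https://rapache.example.edu:443/custom/
```

Lines that do not match the assumed form are left unchanged, so check the file afterwards.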
rApache is a loadable httpd module that provides a link between Apache and R. When you installed the rApache rpm earlier, it placed the module in the Apache library directory and added a configuration entry to the config file (/etc/httpd/conf/httpd.conf).
Now we need to configure rApache to serve several R “mini-apps”, from the R sources provided with TwoRavens.
Edit the following files in dataexplore/rook:
rookdata.R, rookzelig.R, rooksubset.R, rooktransform.R, rookselector.R, rooksource.R
and replace every instance of production<-FALSE with production<-TRUE.
(yeah, that’s why we provide that installer script...)
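The same replacement across all six files can be sketched as a small shell loop. The function name enable_rook_production is made up for illustration, and tolerating spaces around the <- arrow is our assumption about how the files may be formatted.

```shell
#!/bin/sh
# enable_rook_production ROOK_DIR
# Hypothetical helper: switches production<-FALSE to production<-TRUE in the
# six rook files listed above. Missing files are reported and skipped.
enable_rook_production() {
    rook_dir=$1
    for name in rookdata.R rookzelig.R rooksubset.R rooktransform.R \
                rookselector.R rooksource.R; do
        path="$rook_dir/$name"
        if [ ! -f "$path" ]; then
            echo "skipping missing file: $path" >&2
            continue
        fi
        tmp=$(mktemp) || return 1
        # Tolerate optional spaces around the assignment arrow.
        sed 's/production[[:space:]]*<-[[:space:]]*FALSE/production<-TRUE/g' \
            "$path" > "$tmp" && cat "$tmp" > "$path"
        rm -f "$tmp"
    done
}

# Example:
# enable_rook_production /var/www/html/dataexplore/rook
```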
Next, change the following line:
setwd("/usr/local/glassfish4/glassfish/domains/domain1/docroot/dataexplore/rook")
to
setwd("/var/www/html/dataexplore/rook")
(or your dataexplore directory, if different from the above)
url <- paste("https://dataverse-internal.iq.harvard.edu/custom/preprocess_dir/preprocessSubset_",sessionid,".txt",sep="")
and
imageVector[[qicount]]<<-paste("https://dataverse-internal.iq.harvard.edu/custom/pic_dir/", mysessionid,"_",mymodelcount,qicount,".png", sep = "")
In both of the lines above, change the URL to reflect the correct location of your rApache instance; make sure that the protocol and the port number are correct too, not just the host name!
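A sketch of this URL change with sed follows. Because the text does not say which file the two lines live in, the made-up helper point_rook_at_rapache simply scans every .R file in the rook directory and touches only those that contain the hard-coded host; that scan is our assumption, not the installer's behavior.

```shell
#!/bin/sh
# point_rook_at_rapache ROOK_DIR NEW_BASE
# Hypothetical helper: in every .R file under ROOK_DIR, replaces the
# hard-coded base URL from the two lines above with NEW_BASE
# (protocol, host and port, without a trailing slash).
# Assumes NEW_BASE contains no '|', '&' or backslash characters.
point_rook_at_rapache() {
    rook_dir=$1 new_base=$2
    old_base='https://dataverse-internal\.iq\.harvard\.edu'
    for path in "$rook_dir"/*.R; do
        [ -f "$path" ] || continue
        grep -q "$old_base" "$path" || continue   # leave other files alone
        tmp=$(mktemp) || return 1
        sed "s|$old_base|$new_base|g" "$path" > "$tmp" && cat "$tmp" > "$path"
        rm -f "$tmp"
    done
}

# Example (placeholder host name):
# point_rook_at_rapache /var/www/html/dataexplore/rook https://rapache.example.edu:443
```

Only the protocol, host and port are replaced; the /custom/preprocess_dir/ and /custom/pic_dir/ paths stay as they are.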
(This configuration is now supplied in its own config file, tworavens-rapache.conf, which can be dropped into Apache’s /etc/httpd/conf.d. Again, the scripted installer will do this for you automatically.)
RSourceOnStartup "/var/www/html/dataexplore/rook/rooksource.R"
<Location /custom/zeligapp>
SetHandler r-handler
RFileEval /var/www/html/dataexplore/rook/rookzelig.R:Rook::Server$call(zelig.app)
</Location>
<Location /custom/subsetapp>
SetHandler r-handler
RFileEval /var/www/html/dataexplore/rook/rooksubset.R:Rook::Server$call(subset.app)
</Location>
<Location /custom/transformapp>
SetHandler r-handler
RFileEval /var/www/html/dataexplore/rook/rooktransform.R:Rook::Server$call(transform.app)
</Location>
<Location /custom/dataapp>
SetHandler r-handler
RFileEval /var/www/html/dataexplore/rook/rookdata.R:Rook::Server$call(data.app)
</Location>
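If your dataexplore directory differs from /var/www/html/dataexplore, every path in the fragment above has to change with it. As a rough sketch, the directives can be generated from the directory name; the function print_tworavens_conf is made up for illustration and is not how install.pl produces the file.

```shell
#!/bin/sh
# print_tworavens_conf DATAEXPLORE_DIR
# Hypothetical helper: writes to standard output the same directives as the
# fragment above, with the TwoRavens directory substituted.
print_tworavens_conf() {
    dir=$1
    printf 'RSourceOnStartup "%s/rook/rooksource.R"\n' "$dir"
    for app in zelig subset transform data; do
        printf '<Location /custom/%sapp>\n' "$app"
        printf 'SetHandler r-handler\n'
        printf 'RFileEval %s/rook/rook%s.R:Rook::Server$call(%s.app)\n' \
            "$dir" "$app" "$app"
        printf '</Location>\n'
    done
}

# Example: write the file where the text above says it can be dropped in.
# print_tworavens_conf /var/www/html/dataexplore \
#     > /etc/httpd/conf.d/tworavens-rapache.conf
```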
mkdir --parents /var/www/html/custom/pic_dir
mkdir --parents /var/www/html/custom/preprocess_dir
chown -R apache:apache /var/www/html/custom
service httpd restart
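After the restart, one rough way to see whether Apache answers at the four locations configured above is to request each of them with curl. Both helper names below are made up, and what a healthy app returns for a bare GET is not specified in this guide, so treat the status codes only as evidence that the server is reachable and the locations respond.

```shell
#!/bin/sh
# rapache_app_urls BASE_URL
# Prints the URL of each of the four apps mapped under /custom/ above.
rapache_app_urls() {
    base=${1%/}    # drop one trailing slash, if any
    for app in zeligapp subsetapp transformapp dataapp; do
        printf '%s/custom/%s\n' "$base" "$app"
    done
}

# probe_rapache_apps BASE_URL
# Requests each URL with curl and prints the HTTP status code next to it.
# -k skips certificate verification (only sensible with a self-signed
# certificate); a code of 000 means curl could not connect at all.
probe_rapache_apps() {
    rapache_app_urls "$1" | while read -r url; do
        code=$(curl -s -k -o /dev/null -w '%{http_code}' "$url") || true
        printf '%s %s\n' "$code" "$url"
    done
}

# Example (placeholder host name):
# probe_rapache_apps https://rapache.example.edu:443
```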