ESGFNode|InstallationProcedure

Matthew Harris edited this page Oct 8, 2013 · 26 revisions

P2P Node Installation

** The definitive and complete source for installation information is here. **

This page contains information extending that documentation.

The installation script is written in ** BASH ** and is targeted to run on *NIX operating systems. The primary target platform is RH or CentOS because of minor command idiosyncrasies; however, the script has been executed successfully on other *NIX systems. The install script pulls down installation artifacts from our distribution server. ** All ** artifacts, and even the scripts themselves, are verified against posted checksums. If verification fails, the script will abort. Although there is a single master script, a series of scripts is actually used to install the ESGF Node stack. The script is able to run at startup via chkconfig, supporting the [start|stop|restart|update] directives. The script is also self-updating and will alert you if it has been tampered with. Because it is the central mechanism for controlling the maintenance life cycle of the data node, the script accepts many flags (see usage with the --help|-h options).
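To illustrate the verify-or-abort behaviour described above, here is a minimal sketch of a checksum check. The function name `verify_checksum`, the use of SHA-256, and the one-line checksum file layout are assumptions made for illustration; this page does not say which algorithm or file format the installer actually uses.

```shell
#!/bin/bash
# Hypothetical helper (not taken from the esg-node script).
# Assumes the checksum file holds "<hex digest>  <filename>", as written by sha256sum.
verify_checksum() {
    local file="$1" checksum_file="$2"
    local expected actual
    # Read the posted digest (first field of the checksum file).
    expected=$(awk '{print $1}' "$checksum_file") || return 2
    # Compute the digest of the downloaded file.
    actual=$(sha256sum "$file" | awk '{print $1}') || return 2
    # Succeed only when a digest was posted and it matches.
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

A caller would abort on a mismatch, for example `verify_checksum artifact.tgz artifact.tgz.sha256 || exit 1`, where both file names are placeholders.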

The script is the preferred way to install the ESGF Node stack. It is recommended to create an "esg-user" account that is able to issue the command "sudo -s" to become root. More information on the details of the installation process is available from the distribution site and the installer project site.

Procedure

Execute the installer with the command:

%> esg-node --type data --install --verify
(this will install a "data" configuration of the esgf node)


This will perform all installation and verification steps in sequence, prompting the user at each step during an initial install. Order of installation is important! If the script aborts at any point, do not continue until the issue at hand has been resolved. The general sequence is as follows:

  1. ** Basic Tools & Utilities: ** The installation script depends on some fundamental tools and utilities. The script checks for these utilities and, as it does with all modules, checks their versions against a prescribed minimum version. Tools and utilities:
    * Curl
    * Git
    * Java
    * Ant
  1. ** Install Postgres: ** Postgres will be downloaded and installed in /usr/local/pgsql. You will need to supply a database admin username/password and a password for the esgcet database user. See ESGFNode/ScriptDetails/Postgres.

    • ** Install ESGCET: ** The esgcet Python package (also known as the "Publisher") will be downloaded and installed into the CDAT installation. You will be asked to supply a default gateway and list of data directories. A test dataset will be scanned and ingested into the publisher's postgres database. Though you may supply any number of directories, it is recommended that you use one (the presented default) and manage the file system below that location.
  2. ** Install Tomcat: ** Apache Tomcat will be downloaded and installed into /usr/local/apache-tomcat-<version> and symlinked "Tomcat". Tomcat will be configured to listen on HTTP and HTTPS. You will be asked to supply various usernames and passwords, including one for the SSL keystore.

    • ** Install THREDDS Data Server (TDS): ** The TDS WAR will be downloaded and installed into Tomcat. You should verify that TDS is accessible via HTTP and HTTPS.
    • ** Configure TDS: ** Perform local configuration of TDS. A THREDDS catalog will be written for the test dataset. This should be visible at the TDS URL.
  3. ** Install Node Manager: ** The node manager application monitors and interconnects nodes and services. Among other tasks, it collects node usage metrics and system performance information, and issues notifications to the appropriate end users when new or updated resident data is detected.

  4. ** Install Security Filters: ** Filters are installed for the resident services to enforce security constraints.

  5. ** Install My-Proxy client: ** As part of the publishing process, the data node uses Grid credentials obtained from a myproxy server at the gateway. This requires a myproxy client, which will be installed into $GLOBUS_LOCATION/bin. ** NOTE: ** The PCMDI myproxy server listens on the non-standard port 2119.

    • ** Install GridFTP: ** Two configurations of GridFTP are installed. The default is what we call the "end user" configuration. The other configuration is for performing large transfers using the Bulk Data Mover (BDM). Two libraries must be installed on your system as a prerequisite to the GridFTP install; they are listed here.
  6. ** Publish Test Dataset: ** The script will execute myproxy-logon to acquire Grid credentials from the gateway and will then publish the test dataset to the gateway.

Note: Before you can successfully publish you must first get your credentials from the gateway you selected.
You will be prompted for them at the final publish step.
  • = installed only under the "data" configuration type (--type data)
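The minimum-version check mentioned in the first step of the sequence above can be sketched as follows. The function name `version_at_least` is hypothetical and the approach (GNU `sort -V`) is an assumption; the esg-node script may compare versions differently.

```shell
#!/bin/bash
# Hypothetical helper: succeed when INSTALLED is at least MINIMUM.
# Relies on GNU sort's -V (version sort) option.
version_at_least() {
    local installed="$1" minimum="$2"
    # After a version sort, the smaller version comes first. The check passes
    # when that first entry is the required minimum (or the two are equal).
    [ "$(printf '%s\n%s\n' "$minimum" "$installed" | sort -V | head -n 1)" = "$minimum" ]
}
```

For example, `version_at_least 1.10.2 1.9.0` succeeds, while `version_at_least 1.8 1.9` fails. The version numbers here are illustrative only and are not the minimums the installer prescribes.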

Directory structure

esg_root_dir = /esg
workdir = /usr/local/src/esgf

| ** Location ** | ** Description ** |
|---|---|
| ${esg_root_dir} | Top-level directory location of the ESG configuration files and logs (default /esg). |
| ${esg_root_dir}/backups | Application stack and database data archive location |
| ${esg_root_dir}/config | ESGF configuration files |
| ${esg_root_dir}/content | THREDDS catalogs & LAS data files |
| ${esg_root_dir}/data | Top-level directory for data (.nc) files |
| ${esg_root_dir}/data.replica | Top-level directory for all replicated data from other nodes |
| ${esg_root_dir}/data-index-* | Search index directories |
| ${esg_root_dir}/gridftp_root | Chroot directory for GridFTP access to data |
| ${esg_root_dir}/log | ESGF log files |
| ${esg_root_dir}/tools | ESGF tools (currently: esg_usage_parser) |
| ${esg_root_dir}/config/esgcet/esg.ini or ~/.esgcet/esg.ini | ESG publisher setup file (system vs. personal install) |
| ${esg_root_dir}/etc | Ancillary scripts and files |
| ${esg_root_dir}/esgf-install-manifest | Log of all installed components of the application stack (date, name, location, version) |
| /etc/esg.env | Environment variables required by the script and used in node operation |
| ${workdir} | Installation "scratch" directory for installation-time artifacts: source, helper scripts, etc. |
| ${workdir}/globus | Globus sources |
| ${workdir}/esg | Publisher, THREDDS and other sources |
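As a quick sanity check of the layout listed above, the following sketch reports which directories exist under a given root. The function name `check_esg_layout` and the particular subset of subdirectories checked are choices made for illustration; a given node type may legitimately lack some of them.

```shell
#!/bin/bash
# Hypothetical helper: report which documented subdirectories exist under a root.
# The root defaults to /esg, the documented value of esg_root_dir.
check_esg_layout() {
    local root="${1:-/esg}" missing=0 sub
    for sub in backups config content data log tools; do
        if [ -d "$root/$sub" ]; then
            echo "ok       $root/$sub"
        else
            echo "missing  $root/$sub"
            missing=$((missing + 1))
        fi
    done
    # Return success only when every checked directory is present.
    [ "$missing" -eq 0 ]
}
```

Running `check_esg_layout` with no argument inspects /esg; pass another path to check a non-default root.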


Starting/Stopping etc. the Node

The esg-node installation script is also the boot script.

To stop/start or restart or check the status of the node...

%> esg-node stop
%> esg-node start
%> esg-node status
%> esg-node restart

These and all the other flags supported by the esg-node script can be listed with the ** --help ** option:

%> esg-node --help

     usage:
     (as root)
     esg-node ([--<directive>] | [start] | [stop] | [status] | [restart]
...

The script may be placed in /etc/init.d and installed in the host's boot sequence using _ chkconfig _. This means it follows the _ standard options _ for all service scripts, namely ** start **, ** stop **, ** restart **, ** status ** (as described above). To add it to the host's boot sequence, do the following:

%> cp /usr/local/bin/esg-node /etc/init.d
%> cd /etc/init.d
%> chkconfig --add esg-node
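After the steps above, the registration can be inspected with `chkconfig --list`. The wrapper below is a sketch; the function name `show_boot_registration` is hypothetical, and it assumes a RH/CentOS-style host where chkconfig is the tool that manages the boot sequence.

```shell
#!/bin/bash
# Hypothetical helper: show the runlevel settings chkconfig holds for a service.
show_boot_registration() {
    local service="${1:-esg-node}"
    # Hosts without chkconfig (for example, other *NIX systems) get a hint instead.
    if ! command -v chkconfig >/dev/null 2>&1; then
        echo "chkconfig not found; this host may use a different init system"
        return 2
    fi
    chkconfig --list "$service"
}
```

Call it as `show_boot_registration` (as root) once the script has been copied into /etc/init.d and added.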

installer page


What dependencies does the esg-node script download?

Taken from inspecting the esg-node script at commit 0d9e2d2 on 2011-09-23

External dependencies

See releases page

Installer subsystems

See git repositories

TODO

  • Downloads from subsystems