Set up an active-active cluster

This topic describes how to set up an active-active cluster for XL Deploy. Running XL Deploy in this mode enables you to have a Highly Available (HA) XL Deploy setup with improved scalability.

Requirements

Running XL Deploy in an active-active cluster requires the following:

  • XL Deploy must be installed according to the system requirements. For more information, see requirements for installing XL Deploy.
  • A load balancer that receives HTTP(S) traffic and forwards it to the XL Deploy master nodes. For more information, see the HAProxy load balancer documentation.
  • Two or more XL Deploy master nodes that are stateless and provide control over the workers and other functions (e.g. CI editing and reporting).
  • Two or more XL Deploy worker nodes that contain and execute tasks and are configured to connect to all masters.
  • A database server.
  • A shared drive location to store exported CIs and reports.

Basic functions and communication in a cluster

Masters and workers communicate through a two-way peer-to-peer protocol that uses a single port on each master and worker node.

Most XL Deploy functions and configurations are identical in a cluster setup and in a single-instance setup. The difference is that, in a cluster, these functions can operate on all masters and/or all workers.

Planning phase for cluster setup

When planning an active-active cluster for XL Deploy, make sure you are aware of the following:

  1. All masters and workers must have the same configuration, which consists of:

    • The plugins
    • The extensions folder (e.g. for scripts)
    • The configuration files (some parts will be node specific)
  2. All masters and workers must have access to the database.
  3. All masters and workers must have access to the artifacts.
  4. Communication between masters and workers requires a low latency, high bandwidth network.
  5. All masters and workers need access to all target hosts (and XL Satellites, if applicable).
  6. For the HTML5 UI to function correctly, all requests for a single user session must be handled by the same master.
  7. For exports of CIs and reports to work correctly across masters, the export/ folder should be a shared and read-write accessible volume for each master and worker.
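
As an illustration of point 7, the export/ folder can be placed on a shared volume such as an NFS export. The server name and paths below are assumptions; use whatever shared storage your environment provides:

    # Hypothetical example: mount a shared NFS export as the export/ folder
    # on every master and worker (server name and paths are placeholders)
    sudo mount -t nfs storage.example.com:/exports/xld-export /opt/xl-deploy-server/export

    # Verify that this node can write to the shared volume
    touch /opt/xl-deploy-server/export/.write-test && rm /opt/xl-deploy-server/export/.write-test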

Recommendations

Based on the planning phase considerations, the following settings are strongly recommended:

  1. All masters and workers are part of the same network segment.
  2. The network segment for the masters and workers is properly secured.
  3. The hostnames and IP addresses for all masters and workers are stored and maintained in a DNS server that is accessible to all masters and workers.
  4. The load balancer is configured to be highly available and can properly reach the masters.
  5. The load balancer terminates SSL and forwards unencrypted traffic to the masters.
  6. The load balancer is configured with session affinity (“sticky sessions”); see the illustrative sketch below.
  7. The database is configured for high availability and can be properly reached by masters and workers.
  8. Artifacts are stored in the database or, preferably, in one or more external systems.
  9. When XL Satellite is used, all communication between masters, workers and satellites is secured using SSL Certificates.

The configuration of the load balancer, the network, and the database is not covered in this document.
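
Although that configuration is environment-specific, a minimal HAProxy sketch can illustrate recommendations 5 and 6 (SSL termination and sticky sessions). The host names, certificate path, and ports are assumptions, not a prescribed setup:

    # Illustrative HAProxy fragment only: terminate SSL at the load balancer
    # and pin each user session to one master via a cookie
    frontend xld_frontend
        bind *:443 ssl crt /etc/haproxy/certs/xld.pem
        default_backend xld_masters

    backend xld_masters
        balance roundrobin
        cookie XLD_SESSION insert indirect nocache
        server xld1 xld1.example.com:4516 check cookie xld1
        server xld2 xld2.example.com:4516 check cookie xld2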

Setup and configuration

When setting up a new system, the setup procedure should be executed on a single master node and the resulting configuration files shared with other nodes (masters and workers).

When upgrading, the upgrade procedure should be executed on all masters and workers.

In both cases, the configuration files to be shared between the masters and the workers include:

  • The deployit.conf, deployit-defaults.properties and xl-deploy.conf files
  • The license (deployit-license.lic)
  • The repository keystore (repository-keystore.jceks or repository-keystore.p12)
  • The truststore (if applicable)
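
For example, after completing the setup on the first master, these files can be copied from its conf/ directory to the other nodes. The host name and installation path below are placeholders:

    # Hypothetical example, run on the first master: copy the shared
    # configuration files to another node (adjust host name and path)
    XL_DEPLOY_SERVER_HOME=/opt/xl-deploy-server
    scp "$XL_DEPLOY_SERVER_HOME"/conf/deployit.conf \
        "$XL_DEPLOY_SERVER_HOME"/conf/deployit-defaults.properties \
        "$XL_DEPLOY_SERVER_HOME"/conf/xl-deploy.conf \
        "$XL_DEPLOY_SERVER_HOME"/conf/deployit-license.lic \
        "$XL_DEPLOY_SERVER_HOME"/conf/repository-keystore.jceks \
        xld2.example.com:"$XL_DEPLOY_SERVER_HOME"/conf/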

Each master and worker should:

  1. define its fully qualified host name in the configuration property xl.server.hostname in XL_DEPLOY_SERVER_HOME/conf/xl-deploy.conf.
  2. configure the correct key-store and trust-store in the xl.server.ssl section (if SSL is enabled), including certificates for XL Satellites (if applicable). Follow these instructions to set it up.
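
For example, the relevant fragment of a node's xl-deploy.conf could look like this. The host name is a placeholder, and the exact keys inside the ssl section are omitted here; use the referenced SSL instructions for those:

    # XL_DEPLOY_SERVER_HOME/conf/xl-deploy.conf on this node (placeholder host name)
    xl {
      server {
        hostname = "xld1.example.com"   # fully qualified host name of this node
        # ssl { ... }                   # key-store/trust-store settings when SSL is enabled
      }
    }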

To start master and worker nodes:

  • Masters can be started with the normal procedure, e.g. invoking bin/run.sh or bin\run.cmd.
  • Workers are started by passing the literal 'worker' as the first argument to bin/run.sh or bin\run.cmd, followed by one -api flag that points to the load balancer and one or more -master flags, one for each fully qualified master name. For example:

    bin/run.sh worker -api http://xld-loadbalancer.example.com -master xld1.example.com:8180 -master xld2.example.com:8180

    See also Scalability for masters below. Further switches can be applied when starting workers; see the documentation on workers for more information.

NOTE: If no DNS server is used and hostname mapping is done using /etc/hosts or a similar local mechanism, the configuration setting xl.tasks.system.akka.io.dns.resolver must be set to inet-address in XL_DEPLOY_SERVER_HOME/conf/xl-deploy.conf on all masters and workers.
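
For example, in that scenario each node's xl-deploy.conf would contain:

    # Required on every master and worker when host name resolution relies on /etc/hosts
    xl.tasks.system.akka.io.dns.resolver = inet-address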

Scalability

A running active-active cluster for XL Deploy can be scaled for better performance if properly configured.

When using SSL for communication between masters, workers, and satellites, the certificates of new masters and workers must be trusted by the other nodes and satellites. In this case it is recommended to use a trusted root certificate to sign all certificates used by masters, workers, and satellites. A (self-signed) root certificate can be added to the trust store.
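
As a sketch of this approach (alias, file names, and password are placeholders), the root certificate can be imported into each node's trust store with the standard Java keytool:

    # Hypothetical example: import the (self-signed) root CA certificate into
    # the trust store used by a master, worker, or satellite
    keytool -importcert \
        -alias xld-root-ca \
        -file root-ca.crt \
        -keystore truststore.p12 -storetype PKCS12 \
        -storepass changeit -noprompt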

Scalability for workers

Additional workers can be started and directed to an existing cluster of workers without additional configuration.

It is important to note that scheduled or ongoing tasks are not re-balanced when workers are added. Tasks are assigned to workers in a round-robin fashion when a task is created on one of the masters. Once a task is assigned to a worker, it cannot be moved to another worker.

Scalability for masters

To enable workers to find masters that are added while the workers are running, the available masters should be registered in a DNS SRV record.

xld-masters      IN      SRV  1  0  0  xld-master-1
xld-masters      IN      SRV  1  0  0  xld-master-2
...

The workers can now be started with a single -master parameter that points to the SRV record: -master xld-masters:8180.
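
For example, reusing the placeholder names from the earlier example:

    # Start a worker against all masters registered in the SRV record
    bin/run.sh worker -api http://xld-loadbalancer.example.com -master xld-masters:8180

    # Optional sanity check: list the masters behind the SRV record
    # (assumes the record lives in the example.com zone)
    dig +short SRV xld-masters.example.com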

The port number for a master can be configured in the DNS SRV record or in the parameter value:

xld-masters      IN      SRV  1  0  9001  xld-master-1

defines 9001 as the port to be used for xld-master-1. If the port in the DNS SRV record is 0, it is ignored. A parameter value of -master xld-masters:9002 means that all masters found in the DNS SRV record will be contacted on port 9002, except where the SRV record specifies a non-zero port, which takes precedence.
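
Putting these rules together, a hypothetical zone fragment combined with a worker started with -master xld-masters:9002 would resolve as follows:

    ; illustrative SRV entries only
    xld-masters      IN      SRV  1  0  0     xld-master-1   ; port 0 is ignored, so 9002 from the parameter is used
    xld-masters      IN      SRV  1  0  9001  xld-master-2   ; non-zero port in the record wins, so 9001 is used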