This section describes how to troubleshoot some of the common issues you may face when installing the Deploy application using the Operator-based installer.
Error while processing YAML document at line 1 of XL YAML file
Symptom
When the deployment starts, the XL CLI script fails to run and displays an error message stating that an error occurred while processing the YAML document at line 1 of the XL YAML file.
Solution
Restart the deployment using XL CLI:
xl apply -v -f digital-ai.yaml
Note: The above command applies the digital-ai.yaml wrapper file, which bundles all the other files, such as the infrastructure file and the environment file.
Only the Operator control manager pods get deployed on the Kubernetes cluster
Symptom
The XL CLI script runs successfully, but only the Operator control manager pods are deployed on the Kubernetes cluster. No other pods are deployed.
Solution
Clear the Operator deployment as follows:
1. Run the following command to list the CRDs: kubectl get crd
2. Delete the CRD that corresponds to the Operator: kubectl delete crd digitalaideployocps.xldocp.digital.ai
3. Go to the /digital-ai/kubernetes/template path in the extracted ZIP file, and run the following command: kubectl delete -f
4. Restart the deployment using XL CLI.
Note: To troubleshoot the issue on an OpenShift AWS cluster, replace the kubectl command with oc.
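The first two steps above, together with the kubectl/oc substitution from the note, can be sketched as a small shell function. This is a minimal sketch and not part of the installer: the function name clear_operator_crd and the KUBE_CLI variable are hypothetical, and the default CRD name is the one shown in the steps above. The kubectl delete -f step is left out on purpose, because it has to be run from the template path of your extracted ZIP file.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with the installer.
# KUBE_CLI selects the client binary: set KUBE_CLI=oc on an OpenShift AWS
# cluster; it defaults to kubectl.
KUBE_CLI="${KUBE_CLI:-kubectl}"

clear_operator_crd() {
  # CRD name as given in the steps above; pass another name to override it.
  crd_name="${1:-digitalaideployocps.xldocp.digital.ai}"

  # Step 1: list the CRDs so you can confirm that the Operator CRD exists.
  "$KUBE_CLI" get crd || return 1

  # Step 2: delete the CRD that corresponds to the Operator.
  "$KUBE_CLI" delete crd "$crd_name"
}
```

After sourcing the file, run clear_operator_crd on a Kubernetes cluster, or set KUBE_CLI=oc first on OpenShift. Then continue with the remaining steps and restart the deployment using XL CLI.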
Deployment activation fails after deleting operator
Symptom
After deleting the operator custom resource definition (CRD) and the operator, the redeployment process fails to create pods when you attempt to activate the deployment process by running the following command:
xl apply -v -f digital-ai.yaml
Solution
Use the kubectl delete -f command to remove the Deploy instance only if you do not have a local Deploy instance. If you have a local Deploy instance with deployment details, use the make undeploy command to remove the Operator, and then retry the deployment process.
Upgrade to Operator-based solution fails
Symptom
The upgrade from the Helm Charts-based solution to the Operator-based solution fails.
Solution
1. Restore the database instance.
2. Clean the deployments. For more information, see Uninstall Deploy.
3. Update the daideploy_cr.yaml file for Deploy to use the external database as follows:
   a. Search for the UseExistingDB parameter in the daideploy_cr.yaml file.
   b. Set the Enabled parameter to True.
   c. Remove the comment tag from the following parameters:
      XL_DB_PASSWORD
      XL_DB_URL
      XL_DB_USERNAME
   d. Update the external database credentials.
4. Redeploy the Deploy instance.
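Before redeploying, it can help to confirm that the three database parameters are no longer commented out. The following is a minimal sketch of such a check, not an official tool: the function name check_external_db_params is hypothetical, and it assumes that each parameter sits on its own line in the form "KEY: value" and that the comment tag is a leading "#".

```shell
#!/bin/sh
# Hypothetical helper: report which external database parameters are still
# commented out (or missing) in the CR file passed as the first argument.
# Assumes one "KEY: value" entry per line and "#" as the comment tag.
check_external_db_params() {
  cr_file="$1"
  status=0
  for key in XL_DB_PASSWORD XL_DB_URL XL_DB_USERNAME; do
    if grep -Eq "^[[:space:]]*#[[:space:]]*${key}[[:space:]]*:" "$cr_file"; then
      # The line still starts with a comment tag.
      echo "still commented: ${key}"
      status=1
    elif ! grep -Eq "^[[:space:]]*${key}[[:space:]]*:" "$cr_file"; then
      # The key does not appear as an active entry at all.
      echo "not found: ${key}"
      status=1
    fi
  done
  return "$status"
}
```

For example, check_external_db_params daideploy_cr.yaml prints nothing and returns 0 when all three parameters are active. It does not validate the credentials themselves.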
Symptom
The Operator-to-Operator upgrade fails with the following error: “Fetching values from cluster… / Missing CRD and CR resources during Upgrade, Could not upgrade: exit status 1”
Solution
During the upgrade, the CRD and CR resources are backed up in the daideploy_cr_<version>.yaml file. To troubleshoot the issue:
1. Restore the CRD using the following command: kubectl apply -f daideploy_cr_<version>.yaml
2. Restart the upgrade.
If the CRD or CR files are not backed up, you can only perform a fresh installation after cleaning up the existing resources. To clean up the existing resources, run the following command:
xl op --clean
Alternatively, run the following cleanup script:
kubectl delete crd digitalaideploys.xld.digital.ai
kubectl delete role xld-operator-leader-election-role
kubectl delete clusterrole xld-operator-manager-role
kubectl delete clusterrole xld-operator-metrics-reader
kubectl delete clusterrole xld-operator-proxy-role
kubectl delete rolebinding xld-operator-leader-election-rolebinding
kubectl delete clusterrolebinding xld-operator-manager-rolebinding
kubectl delete clusterrolebinding xld-operator-proxy-rolebinding
kubectl delete service xld-operator-controller-manager-metrics-service
kubectl delete deployment xld-operator-controller-manager
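The choice between restoring from the backup and cleaning up for a fresh installation can be sketched as a small shell function. This is a minimal sketch under stated assumptions: the function name restore_or_clean is hypothetical, and the backup file path is passed in by the caller because the <version> part of the file name depends on your upgrade. It only uses the kubectl apply -f and xl op --clean commands given above.

```shell
#!/bin/sh
# Hypothetical helper: restore the backed-up CRD and CR resources if the
# backup file exists; otherwise clean up so that a fresh installation can
# be performed.
restore_or_clean() {
  backup_file="$1"
  if [ -f "$backup_file" ]; then
    echo "Backup found, restoring from ${backup_file}"
    kubectl apply -f "$backup_file"
  else
    echo "No backup found, cleaning up for a fresh installation"
    xl op --clean
  fi
}
```

Call it with the path to your daideploy_cr_<version>.yaml backup file. If the restore succeeds, restart the upgrade; if the cleanup branch runs instead, proceed with a fresh installation.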