Notes for contributors:
- This document is under development and open to input from all contributors.
- This document aims to define and cover all checks that can be run on a cloud platform to validate its state and health.
- Validation checks ensure that all cloud software components are healthy and configured as described in the PDF, which defines the desired cloud state.
- Taken together, the validation checks can be considered a kind of ping utility for a cloud platform.
- Functional tests and performance benchmarks are out of scope for this list.
Currently defined Cloud Deployment Types:
- OOK - OpenStack on Kubernetes
- OOO - OpenStack on OpenStack
- OAC - OpenStack as Containers (without Kubernetes)
- OAV - OpenStack as VMs
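
To make the deployment-type tags machine-readable, they could be modeled as an enumeration, with each check carrying the set of types it applies to. The sketch below is illustrative only: `DeploymentType`, `Check`, `checks_for`, and `SAMPLE_CHECKS` are hypothetical names, not part of any existing tool, and the two sample entries simply mirror two rows of the check list.

```python
from dataclasses import dataclass
from enum import Enum


class DeploymentType(Enum):
    OOK = "OpenStack on Kubernetes"
    OOO = "OpenStack on OpenStack"
    OAC = "OpenStack as Containers (without Kubernetes)"
    OAV = "OpenStack as VMs"


@dataclass(frozen=True)
class Check:
    suite: str              # test suite that groups the check
    name: str               # test/check name
    applies_to: frozenset   # deployment types the check is valid for


def checks_for(deployment, checks):
    """Return the names of the checks that apply to the given deployment type."""
    return [c.name for c in checks if deployment in c.applies_to]


# Hypothetical registry with two sample entries.
SAMPLE_CHECKS = [
    Check("storage", "ceph_health_check", frozenset(DeploymentType)),
    Check("observability", "grafana_check", frozenset({DeploymentType.OOK})),
]
```

With this layout, selecting the checks to run for a given cloud is a simple filter on `applies_to`.
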
Test Suites are used to group multiple checks.
Sl.No. | Test Suite | Test/Check Name | Cloud Deployment Types | Description/Details |
---|---|---|---|---|
1. | platform | pod_health_check | OOK, OAC | Checks the health of all overcloud components running as pods in the Kubernetes cluster (in an OOK deployment). Pass: all components are healthy. Fail: one or more components are unhealthy. |
2. | storage | ceph_health_check | OOK, OAC, OOO, OAV | Checks the health of all components of the Ceph cluster configured for OpenStack. |
3. | observability | prometheus_check | OOK | Checks the health endpoint ("/-/healthy") and the readiness endpoint ("/-/ready") of Prometheus over HTTP. Pass: both the health and the readiness check succeed. Fail: either the health or the readiness check fails. |
4. | observability | prometheus_alert_manager_check | OOK | Checks whether Alertmanager is healthy by sending HTTP requests to its "/-/healthy" and "/-/ready" endpoints. |
5. | observability | grafana_check | OOK | Checks whether Grafana is healthy by sending a request to the "/api/health" endpoint. |
6. | observability | elasticsearch_check | OOK | Checks the health of the Elasticsearch cluster by sending an HTTP request to its "/_cluster/health" endpoint. |
7. | observability | kibana_check | OOK | Checks the health of the Kibana dashboard using the status reported at the "/api/status" endpoint. |
8. | observability | nagios_check | OOK | Checks whether the Nagios API is reachable and returns HTTP 200 (OK). |
9. | observability | elasticsearch_exporter_check | OOK | Checks whether the Elasticsearch exporter is exporting Prometheus metrics at "/metrics". |
10. | observability | fluentd_exporter_check | OOK | Checks whether the Fluentd exporter is exporting Prometheus metrics at "/metrics". |
11. | network | physical_network_check | OOK, OAC, OOO, OAV | Checks the network mappings in ml2.conf against the PDF. |
12. | compute | reserved_vnf_cores_check | OOK, OAC, OOO, OAV | Checks the vcpu_pin_set configuration in Nova against the value required in the PDF for reserved cores. |
13. | compute | isolated_cores_check | OOK, OAC, OOO, OAV | Checks the isolcpus configuration against the required value in the PDF. |
14. | network | vswitch_pmd_cores_check | OOK, OAC, OOO, OAV | Evaluates pmd-cpu-mask in the vswitch against the required cores in the PDF. |
15. | network | vswitch_dpdk_lcores_check | OOK, OAC, OOO, OAV | Evaluates dpdk-lcore-mask in the vswitch against the required cores in the PDF. |
16. | compute | os_reserved_cores_check | OOK, OAC, OOO, OAV | Calculates os_reserved_cores using the formula os_reserved_cores = all_cores - (reserved_vnf_cores + vswitch_pmd_cores + vswitch_dpdk_lcores) and compares the result against the required OS-reserved cores in the PDF. |
17. | compute | nova_scheduler_filters_check | OOK, OAC, OOO, OAV | |
18. | compute | cpu_allocation_ratio_check | OOK, OAC, OOO, OAV | |
19. | platform | api_version_check | OOK, OAC, OOO, OAV | |
20. | network | mtu_check | OOK, OAC, OOO, OAV | |
21. | platform | ntp_check | OOK, OAC, OOO, OAV | |
22. | network | sriov_vfs_check | OOK, OAC, OOO, OAV | |
23. | security | pod_linux_capabilities_allowed_check | OOK | |
24. | security | privileged_pod_allowed_check | OOK | |
25. | security | pod_host_volume_mount_check | OOK | |
26. | security | pod_host_network_check | OOK | |
27. | security | mgmt_api_access_check | OOK | |
28. | compute | cpu_manager_policy_check | OOK | |
29. | compute | topology_manager_policy_check | OOK | |
30. | network | cni_check | OOK | |
31. | platform | device_plugin_check | OOK | |
32. | | service_mesh_check | OOK | |
33. | | ingress_egress_check | OOK | |
34. | platform | kubevirt_check | OOK | |
35. | | helm_check | OOK | |
36. | platform | readiness_probe_check | OOK | |
37. | platform | startup_probe_check | OOK | |
38. | platform | liveness_probe_check | OOK | |
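
The endpoint-based observability checks in the table (for example prometheus_check) share one pattern: request one or more paths and pass only if every request returns HTTP 200. A minimal sketch of that pattern, using only the Python standard library, might look like the following. The names `http_status` and `endpoint_check`, the `fetch` parameter, and the result layout are assumptions made for illustration; a real check would also need the authentication and TLS settings of the target deployment.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status code for a GET on url, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code   # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None       # DNS failure, refused connection, or timeout


def endpoint_check(base_url, paths, fetch=http_status):
    """Pass only if every path under base_url answers with HTTP 200."""
    results = {path: fetch(base_url.rstrip("/") + path) for path in paths}
    passed = all(code == 200 for code in results.values())
    return {"result": "pass" if passed else "fail", "details": results}
```

For a Prometheus instance this could be called as `endpoint_check(base_url, ["/-/healthy", "/-/ready"])`, where `base_url` is whatever address the deployment exposes. The `fetch` parameter exists only so that the logic can be exercised without a live service.
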
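
The formula given for os_reserved_cores_check can be worked through in a few lines. This sketch assumes that each term is a set of core IDs, so that the subtraction becomes a set difference, and that the vswitch masks are hexadecimal CPU masks; both are assumptions, since the table does not state the data format. The helper names `mask_to_cores` and `os_reserved_cores` are hypothetical.

```python
def mask_to_cores(mask):
    """Convert a hex CPU mask such as '0x6' to the set of core IDs it selects."""
    value = int(mask, 16)
    return {bit for bit in range(value.bit_length()) if (value >> bit) & 1}


def os_reserved_cores(all_cores, reserved_vnf_cores, vswitch_pmd_cores, vswitch_dpdk_lcores):
    """all_cores minus the union of VNF, PMD, and DPDK lcore sets."""
    used = set(reserved_vnf_cores) | set(vswitch_pmd_cores) | set(vswitch_dpdk_lcores)
    return set(all_cores) - used


# Example: an 8-core host, cores 4-7 reserved for VNFs,
# PMD mask 0xc (cores 2 and 3), DPDK lcore mask 0x2 (core 1).
remaining = os_reserved_cores(
    range(8), {4, 5, 6, 7}, mask_to_cores("0xc"), mask_to_cores("0x2")
)
print(remaining)  # {0}
```

The check would then pass when the computed set equals the OS-reserved cores required in the PDF.
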