...
Failure Type | Failure parameter | Failure Event | Infrastructure Metrics | Comments |
---|---|---|---|---|
Links | Link Down. Link removed | Virtual Switch Bonding link failure Reason: Hardware Failure | Network interface status, High packet drop, low throughput, excessive latency or jitter | |
VM | Deployment/Start Failures:
Post-Deployment/Start failures:
| nova-compute.log nova-api.log nova-scheduler.log libvirt.log qemu/$vm.log neutron-server.log glance/cinder - flavor Node and Core-mapping | cpu: per-core utilization memory Interfaces statistics - sent, recv, drops Disk Read/Write | If possible, Infrastructure metrics and syslogs from within the VM should be collected. Deployment/Start failures can be the first step. |
Container | Deployment/Start Failures:
Post-Deployment/Start failures:
| cpu: per-core utilization memory Interfaces statistics - sent, recv, drops Disk Read/Write | ||
Node | A node failure (hardware failure, OS crash, etc) A) node network connectivity failure B) nova service failure D) Failure of other OpenStack services | A) node network connectivity failure
B) nova service failure (e.g., process crashed) -- detected and restarted by a local watchdog process
D) Failure of other OpenStack services -- N/A, assuming redundant/highly available configuration
| Interfaces statistics - sent, recv, drops Hypervisor Metrics, Nova Server Metrics, Tenant Metrics, Message Queue Metrics Keystone metrics and Glance Metrics | |
Application | Crash/Connectivity/Non-Functional | Application Log i.e. If it is Apache then logs of Apache | Packet Drops, Latency, Throughput, Saturation, Resource Usage | Deploy Collectd within the application and collect both application logs and infrastructure metrics |
Middleware Services |
...