Anuket Project

Libvirt Plugin High Level Design Document

Requirement

| Ref | Requirement |
|---|---|
| 1.0 | Expose metrics from virtualized guests on a host system |
| 2.0 | Use libvirt API |
| 3.0 | Extend list of existing virt collectd plugin metrics for RDT monitoring |
| 4.0 | Should have a configurable interval |
| 5.0 | Provide SNMP support for added metrics |

 

 

Overview

Libvirt is an open source project licensed under the GNU LGPL. It aims to provide a simple and convenient way to manage virtual machines created by different hypervisors in a unified manner. It is an abstraction layer that provides a common application programming interface (API) for functionality implemented by hypervisors. Even if a particular feature is currently implemented by only a single hypervisor, it has to be exposed in a generic way so that other hypervisors can be supported if they later implement the same feature. This removes the need to learn hypervisor-specific tools, which is a desirable characteristic in cloud-based environments. Major libvirt features include:

  • Virtual machine management – domain lifecycle management, device hotplug operations
  • Remote machine support – support for multiple network transports for remote connection
  • Storage management – creation of file images, mounting NFS shares
  • Network management – physical and logical network interfaces management
  • Virtual NAT and route based networking – management and creation of virtual networks

Libvirt consists of 3 software components:

  • API library
  • Daemon – libvirtd
  • Command line utility – virsh

Libvirt is based on a client-server architecture, which requires the daemon to be installed only on the machine that hosts virtualized guests. To control domains remotely, libvirt uses a custom protocol to establish a connection with a remote libvirtd instance. Domains can also be controlled locally, that is, the client and server can run on the same physical host.
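The local and remote cases differ only in the connection URI handed to the client library. The sketch below builds such URIs following libvirt's general form driver[+transport]://[user@][host]/[path]. The helper name build_libvirt_uri and the host name compute-1.example.org are hypothetical and used only for illustration; the resulting string is what a client would pass to a libvirt connection-open call.

```python
from typing import Optional


def build_libvirt_uri(driver: str = "qemu",
                      transport: Optional[str] = None,
                      host: Optional[str] = None,
                      user: Optional[str] = None,
                      path: str = "system") -> str:
    """Build a libvirt connection URI (hypothetical helper).

    Follows the general form driver[+transport]://[user@][host]/[path].
    An empty host selects the libvirtd instance on the local machine.
    """
    if transport and not host and transport != "unix":
        raise ValueError("a remote transport requires a host name")
    if user and not host:
        raise ValueError("a user name requires a host name")
    scheme = f"{driver}+{transport}" if transport else driver
    authority = f"{user}@{host}" if user else (host or "")
    return f"{scheme}://{authority}/{path}"


# Local connection: client and libvirtd run on the same physical host.
local_uri = build_libvirt_uri()
# Remote connection tunnelled over SSH to a hypothetical host.
remote_uri = build_libvirt_uri(transport="ssh",
                               host="compute-1.example.org",
                               user="root")
print(local_uri)   # qemu:///system
print(remote_uri)  # qemu+ssh://root@compute-1.example.org/system
```

The virt plugin itself connects to the hypervisor on the host where collectd runs, so the local form is the relevant one for this design; the remote form is shown only to illustrate the client-server split.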

 

Design

virt plugin

The virt plugin collects statistics by using the virtualization API. Metrics are gathered directly from the hypervisor on the host system, which means that collectd does not have to be installed and configured on the guest system. The plugin has been extended to support the following metrics:

 

| Type | Type Instance | Description | Comment |
|---|---|---|---|
| cpu_affinity | vcpu_NR-cpu_NR | Pinning of domain VCPUs to host physical CPUs. | Value stored is a boolean. |
| job_stats | * | Information about the progress of a background or completed job on a domain. | The number of metrics depends on the job type. See the libvirt API documentation for virDomainGetJobStats for more information. |
| disk_error | DISK_NAME | Disk error code. | The metric is not dispatched for disks with no errors. |
| percent | virt_cpu_total | CPU utilization per domain, in percent. | Computing the CPU utilization percentage requires two samples of the CPU time used by the domain. Therefore, this metric is not available on the first dispatch of metrics after the plugin is loaded, or on the first dispatch after a domain restart. |
| perf | * | Performance metrics. | The number of metrics depends on the libvirt API version. The list of available perf metrics can be found at libvirt.org. |
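The two-sample requirement for the percent metric can be illustrated with a short Python sketch. It is not the plugin's code: the class name CpuUtilTracker, the normalization by read interval and number of host CPUs, and the use of a decreasing counter to detect a domain restart are all assumptions made for this example.

```python
from typing import Dict, Optional

NANOSEC_IN_SEC = 1_000_000_000


class CpuUtilTracker:
    """Turn cumulative per-domain CPU time (ns) into a utilization percentage.

    Illustrative only; the normalization used here is an assumption.
    """

    def __init__(self, interval_sec: float, host_cpus: int) -> None:
        if interval_sec <= 0 or host_cpus <= 0:
            raise ValueError("interval_sec and host_cpus must be positive")
        self.interval_sec = interval_sec
        self.host_cpus = host_cpus
        self._previous: Dict[str, int] = {}  # domain name -> last CPU time (ns)

    def update(self, domain: str, cpu_time_ns: int) -> Optional[float]:
        """Record a sample; return the percentage, or None if unavailable."""
        previous = self._previous.get(domain)
        self._previous[domain] = cpu_time_ns
        # First sample, or the counter went backwards (assumed to mean the
        # domain restarted): there is no usable pair of samples yet.
        if previous is None or cpu_time_ns < previous:
            return None
        delta_ns = cpu_time_ns - previous
        capacity_ns = self.interval_sec * self.host_cpus * NANOSEC_IN_SEC
        return 100.0 * delta_ns / capacity_ns


tracker = CpuUtilTracker(interval_sec=10.0, host_cpus=4)
first = tracker.update("vm-1", 5_000_000_000)      # no previous sample yet
second = tracker.update("vm-1", 15_000_000_000)    # 10 s of CPU time in 10 s
restarted = tracker.update("vm-1", 1_000_000_000)  # counter reset
print(first, second, restarted)  # None 25.0 None
```

Returning None rather than 0 for the first sample mirrors the behavior described in the table: the metric is simply not dispatched until two samples exist.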

 

 

Additionally, support for the following events has been added:

 

| Type | Severity | Description | Comment |
|---|---|---|---|
| domain_state | OKAY: VIR_DOMAIN_NOSTATE, VIR_DOMAIN_RUNNING, VIR_DOMAIN_SHUTDOWN, VIR_DOMAIN_SHUTOFF. WARNING: VIR_DOMAIN_BLOCKED, VIR_DOMAIN_PAUSED, VIR_DOMAIN_PMSUSPENDED. FAILURE: VIR_DOMAIN_CRASHED. | Domain state and reason in a human-readable format. | |
| file_system | OKAY | File system information: mount point, device name, file system type, number of aliases, disk aliases. | Information stored in metadata. Requires Guest Agent to be installed and configured in VM. |
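The state-to-severity mapping in the domain_state row can be expressed as a lookup table. The sketch below is illustrative Python rather than the plugin's C code. The numeric values follow libvirt's virDomainState enumeration; the names DomainState, SEVERITY and severity_for are hypothetical.

```python
from enum import IntEnum


class DomainState(IntEnum):
    """Numeric values of libvirt's virDomainState enumeration."""
    VIR_DOMAIN_NOSTATE = 0
    VIR_DOMAIN_RUNNING = 1
    VIR_DOMAIN_BLOCKED = 2
    VIR_DOMAIN_PAUSED = 3
    VIR_DOMAIN_SHUTDOWN = 4
    VIR_DOMAIN_SHUTOFF = 5
    VIR_DOMAIN_CRASHED = 6
    VIR_DOMAIN_PMSUSPENDED = 7


# Severity assigned to the domain_state event, as listed in the table above.
SEVERITY = {
    DomainState.VIR_DOMAIN_NOSTATE: "OKAY",
    DomainState.VIR_DOMAIN_RUNNING: "OKAY",
    DomainState.VIR_DOMAIN_SHUTDOWN: "OKAY",
    DomainState.VIR_DOMAIN_SHUTOFF: "OKAY",
    DomainState.VIR_DOMAIN_BLOCKED: "WARNING",
    DomainState.VIR_DOMAIN_PAUSED: "WARNING",
    DomainState.VIR_DOMAIN_PMSUSPENDED: "WARNING",
    DomainState.VIR_DOMAIN_CRASHED: "FAILURE",
}


def severity_for(state: int) -> str:
    """Return the notification severity for a numeric domain state.

    Raises ValueError for a state value outside the enumeration.
    """
    return SEVERITY[DomainState(state)]


print(severity_for(DomainState.VIR_DOMAIN_RUNNING))  # OKAY
print(severity_for(DomainState.VIR_DOMAIN_PAUSED))   # WARNING
print(severity_for(DomainState.VIR_DOMAIN_CRASHED))  # FAILURE
```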

 

 

Note:

Certain metrics and events require a minimum libvirt API version. For more information, see the collectd virt plugin documentation.

Plugin configuration

Extended metrics can be enabled with the following virt plugin configuration option:

| Name | Selectors | Description | Comment |
|---|---|---|---|
| ExtraStats | cpu_util, disk_err, domain_state, fs_info, job_stats_background, job_stats_completed, perf, vcpupin | Defines whether additional statistics should be collected. By default no extra statistics are gathered, preserving the previous behavior of the plugin. | The argument is a space-separated list of selectors. |

 

Here is an example of the plugin configuration section of the collectd.conf file:

<Plugin virt>
  RefreshInterval 60
  ExtraStats "cpu_util disk_err domain_state fs_info job_stats_background perf vcpupin"
</Plugin>

Note:

Detailed information about plugin configuration options can be found in collectd virt plugin documentation.
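Because the ExtraStats argument is a single space-separated string, a misspelled selector is easy to miss. The Python sketch below shows one way to check a value before deploying a configuration. It is a hypothetical helper, not part of collectd, and KNOWN_SELECTORS contains only the selectors named in this document (including cpu_util from the example above); the plugin documentation remains the authoritative list.

```python
from typing import List, Tuple

# Selectors named in this document only; the virt plugin may accept others.
KNOWN_SELECTORS = frozenset({
    "cpu_util",
    "disk_err",
    "domain_state",
    "fs_info",
    "job_stats_background",
    "job_stats_completed",
    "perf",
    "vcpupin",
})


def parse_extra_stats(value: str) -> Tuple[List[str], List[str]]:
    """Split an ExtraStats value and report selectors not in KNOWN_SELECTORS.

    Returns (all selectors in order, unknown selectors in order).
    """
    selectors = value.split()
    unknown = [name for name in selectors if name not in KNOWN_SELECTORS]
    return selectors, unknown


selectors, unknown = parse_extra_stats(
    "cpu_util disk_err domain_state fs_info job_stats_background perf vcpupin"
)
print(len(selectors), unknown)  # 7 []

# A misspelled selector ("vcpu_pin" is a deliberate typo) is reported.
_, typos = parse_extra_stats("perf vcpu_pin")
print(typos)  # ['vcpu_pin']
```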

SNMP Support

libvirt-snmp, a libvirt subproject, offers only limited SNMP functionality, which does not cover the extended metrics enabled by the SA project. If possible, the existing MIB should be used for the basic set of metrics and a new MIB should be defined for the extended metrics. Otherwise, a new libvirt MIB should contain all metrics gathered by the virt collectd plugin. SNMP support can be achieved by adding the proper configuration for the snmp_agent collectd plugin. See the description of the SNMP feature for more details on the snmp_agent plugin.

Considerations

Configuration Considerations

Deployment Considerations

It is not recommended to enable extended perf metrics in conjunction with the Intel RDT plugin, because both plugins use the same system resources.

Perf metrics must be supported by the platform.
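A deployment check for the first point could look like the Python sketch below. It is a simplified, hypothetical lint, not a collectd feature: it assumes the Intel RDT plugin is loaded under the name intel_rdt, handles only a single inline <Plugin virt> block, ignores Include directives, and treats every # as the start of a comment.

```python
import re


def perf_conflicts_with_rdt(conf_text: str) -> bool:
    """Return True if intel_rdt is loaded and virt ExtraStats includes perf."""
    # Strip comments so that commented-out directives are ignored.
    # Simplification: a '#' inside a quoted string is also treated as a comment.
    text = "\n".join(line.split("#", 1)[0] for line in conf_text.splitlines())

    # Matches both "LoadPlugin intel_rdt" and "<LoadPlugin intel_rdt>".
    rdt_loaded = re.search(
        r'^\s*<?LoadPlugin\s+"?intel_rdt"?\s*>?\s*$',
        text,
        re.MULTILINE | re.IGNORECASE,
    ) is not None

    virt_block = re.search(
        r'<Plugin\s+"?virt"?\s*>(.*?)</Plugin>',
        text,
        re.DOTALL | re.IGNORECASE,
    )
    if not (rdt_loaded and virt_block):
        return False

    extra = re.search(r'ExtraStats\s+"([^"]*)"', virt_block.group(1),
                      re.IGNORECASE)
    return extra is not None and "perf" in extra.group(1).split()


SAMPLE_CONF = """
LoadPlugin virt
LoadPlugin intel_rdt
<Plugin virt>
  RefreshInterval 60
  ExtraStats "domain_state perf"
</Plugin>
"""

print(perf_conflicts_with_rdt(SAMPLE_CONF))  # True
print(perf_conflicts_with_rdt(
    SAMPLE_CONF.replace("LoadPlugin intel_rdt", "#LoadPlugin intel_rdt")
))  # False
```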

API/GUI/CLI Considerations

Equivalence Considerations

SNMP support for the extended metrics will be enabled by a newly defined MIB.

Security Considerations

Alarms, events, statistics considerations

Certain hypervisors might not support all the metrics collected by the plugin. Unsupported metrics will not be reported.

Redundancy Considerations

Performance Considerations

Testing Considerations

Other Considerations

Impact

The following table outlines possible impact(s) the deployment of this deliverable may have on the current system.

 

| Ref | System Impact Description | Recommendation / Comments |
|---|---|---|
| 1 | | |

 

 

Key Assumptions

The following assumptions apply to the scope specified in this document.

 

| Ref | Assumption | Status |
|---|---|---|
| 1 | | |

 

 

Key Exclusions

The following exclusions apply to the scope discussed in this document.

 

| Ref | Exclusion | Status |
|---|---|---|
| 1 | | |

 

 

Key Dependencies

The following table outlines the key dependencies associated with this deliverable.

 

| Ref | Dependency | Status |
|---|---|---|
| 1 | libvirt | |
| 2 | libxml2 | |
| 3 | | |
| 4 | | |

 

 

Issues List

| Ref | Issue | Status |
|---|---|---|
| 1 | | |