Barometer Development Updates

Anuket Project

Collectd plugin development status

Plugin

Status

Description

Comments

OPNFV Release

OpenStack Release

Hugepages

upstreamed

Plugin provides the free and used hugepage numbers/bytes/percentage

Useful for intelligent workload placement for VMs backed by hugepages

D

P

Cache

upstreamed

Plugin provides the last-level cache utilization and memory bandwidth utilization

Based on Intel Resource Director Technology (RDT)

E

P

DPDK stats

upstreamed

Plugin provides the extended NIC stats for DPDK interfaces

 

D

P

DPDK events

upstreamed

Plugin provides the packet processing core status and the link status for DPDK interfaces

 

E

P

RAS Memory

upstreamed

Plugin uses mcelog client protocol to check for memory Machine Check Exceptions and sends the stats for reported exceptions.

 

D

P

BIOS

Reworked as a utility, waiting on the SNMP write plugin to be upstreamed

 

 

E

Q

Open vSwitch Stats

upstreamed

Plugin provides the OVS stats for interfaces. Plugin is DPDK agnostic and uses the OVS DB.

 

E

P

Open vSwitch Events

upstreamed

Plugin provides the OVS link status for interfaces. Plugin will also report vswitch liveness.

 

D

P

SNMP write

upstreamed

Plugin will act as an SNMP subagent and will map collectd metrics to relevant OIDs. It will only support the SNMP get, getnext and walk operations.

 

E

Q

Legacy/IPMI

upstreamed

Plugin will report platform thermals, voltages, fan speeds, etc.

 

E

Q

RAS other errors

Implementation

Parsing and filtering utility for logfile implemented (https://github.com/collectd/collectd/pull/2154): reviewed, reworked and pending further review

Plugin will parse the mcelog/syslog for exceptions that are not memory exceptions.

 

E

Q

Libvirt extensions

upstreamed

Extend the libvirt plugin to include all relevant stats and events that are available for a libvirt domain

 

E

Q

Python Notification

upstreamed

Extend the python language binding to pass the collectd metadata to write/notification plugins.

 

E

Q

PMU

upstreamed

Plugin will retrieve data from performance monitoring units (PMUs), which can count and sample a wide variety of events.

 

E

Q

PCIe AER

https://github.com/maryamtahhan/collectd/tree/feat_pcie_errors

Plugin will poll PCI config space for baseline and AER errors. It will also parse syslog for AER events. Any errors to be reported via notifications.

 

E

Q

VES Collectd Agent update schema 5.1

Merged to Barometer repo

 

 

E

 

OVS events: make the polling option configurable

upstreamed

 

 

E

 

mcelog updates

Add a persist option for events: implemented, pending manual test

Send corrected and uncorrected errors in separate notifications with different severity

Merged to master

 

 

E

 

dpdkstat bugfixing

Enable all ports - default config - Done

 

 

E

 

Collectd extension

Internal protocol extension - PUT to send commands to the daemon

New plugin registration function

Dynamic Plugin - Load/unload

Internal code review completed, manual testing to be scheduled prior to upstreaming

 

 

 

E

 

QAT

Looking to leverage existing PCIe AER plugin to report QAT errors

Error reporting from Accelerator

 

E

 

Collectd RPM build

Merged

Will provide a base for the Apex installer work

 

E

 

OpenStack Boston Demo

Complete

Noisy neighbour detection: a collaboration using Intel RDT and Vitrage

 

 

 

Libvirt Plugin bugfixes

In progress

Domain restart causing incorrect CPU utilization values to be reported

Inconsistency in memory metrics reported by libvirt and virsh

 

 

 

DPDK events bugfixes

Merged to collectd master

Reconnect to DPDK primary

Wrong plugin config issues

Respawn secondary after primary restart

 

 

 

Container/cAdvisor Support

Looking at integrating with Prometheus.

Investigating events support.

 

 

 

 

Process Monitoring

Design Ready

Add a method that accepts a userspace process name as a parameter and returns its PID

 

 

 

OVS with DPDK PMD stats

Merged to barometer repo

A script that pulls OVS-with-DPDK PMD stats and publishes them to collectd. These stats are not currently retrievable through the OVSDB, though they should be in the future, which is why this integration is standalone for now.

 

 F

OVS Multi Instance Support

Pull request created:

https://github.com/maryamtahhan/collectd/tree/feat_ovs_multi_instance

OvS DB multi-instance support for ovs_event/ovs_stats plugins

 

 

 

CI build

Automated; RPMs are generated nightly and pushed to http://artifacts.opnfv.org/barometer.html

 

 

 

Add PMU build to src/

Under Review

 

 

 

 

Create a docker image for barometer

Complete

 

 

Integrate with Apex

Integrated with 2 scenarios:
apex-os-nosdn-bar-ha-baremetal-master

apex-os-nosdn-bar-noha-baremetal-master

 

 

 F

 Q

VES Kafka integration

Separate out the VES collectd application from collectd so that it can be Apache licensed, and use a Kafka bus to send info from collectd to this VES collector...

 

 

 F

VES configuration through a schema

Under internal review

 

 

 

 

Connectivity. Event-based interface monitor.

Pull Request created: https://github.com/collectd/collectd/pull/2407

This plugin monitors an Ethernet interface or group of Ethernet interfaces and reports link status changes. The plugin does not poll for link status; it receives status events instead, which eliminates the polling overhead. Detection times vary by device driver, but typical detection times are below 10 ms.

 

 F

 Q

Procevent. Event-based process monitor.

https://github.com/collectd/collectd/pull/2623

This plugin monitors netlink process events and looks for status changes.



F

Q

Sysevent. Event-based syslog monitor

https://github.com/collectd/collectd/pull/2624

This plugin monitors syslog and uses regular expressions to determine when to send an event.



F

Q
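The Sysevent row describes matching syslog lines against regular expressions to decide when an event should be sent. The following is a minimal Python sketch of that idea only; it is not the plugin's code (the plugin itself lives in the linked collectd pull request). The pattern list and the function name `match_event` are hypothetical and chosen purely for illustration.

```python
import re

# Hypothetical patterns; a real deployment would configure its own.
PATTERNS = [
    re.compile(p)
    for p in (r"Out of memory", r"link (up|down)", r"segfault")
]


def match_event(line, patterns=PATTERNS):
    """Return the source of the first pattern matching the line, else None."""
    for pattern in patterns:
        if pattern.search(line):
            return pattern.pattern
    return None
```

A caller would raise a notification only when `match_event` returns something other than `None`, so lines that match no pattern generate no traffic.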

 
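The Process Monitoring row calls for a method that takes a userspace process name and returns its PID. One possible approach on Linux, sketched here under stated assumptions, is to scan the `comm` file of each numeric directory under `/proc`. The function name `pids_for_name` and the `proc_root` parameter are hypothetical, and the sketch returns a list rather than a single PID because several processes can share a name. Note that the kernel truncates `comm` to 15 characters, so longer names would need to be compared in truncated form.

```python
import os


def pids_for_name(name, proc_root="/proc"):
    """Return a sorted list of PIDs whose comm value equals name.

    Assumes a Linux-style proc filesystem; proc_root is a parameter
    only so the function can be pointed at a test directory.
    """
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self"
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            continue  # process exited or is not readable
    return sorted(pids)
```

An empty list means no running process currently carries that name.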

Open Pull Requests