...
Note:
1. For some parameters the answer can simply be YES/NO.
2. For some, a description or details may be needed.
3. For some, choose from the list in []; for others, a value may be appended to the list.
4. For some parameters, provide the number of 'actual metrics' offered under that category. For example, collectd would provide 12 metrics for the Processes category.

Use NA if not applicable. Use NK if not known.

Parameters\Tools | Collectd | Ceilometer polling agent | Monasca | SNAP | node-exporter and other exporters | sensu client: metric collection plugins | munin | telegraf | NRPE + Plugins | diamond | Riemann | Elastic Beats | Centreon | NSClient++ (same as NRPE, Icinga, OpenNMS) | Icinga (same as NRPE) | OpenNMS (same as NRPE) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
CPU metrics | idle, system, wait, stolen, user (% & time), util, vcpus | idle, system, wait, stolen, user (% & time), util, vcpus | idle, system, wait, stolen, user (% & time) | idle, system, wait, stolen, user, guest, irq, nice (% & jiffies) | idle, system, wait, stolen, user (% & time), util, vcpus | idle, system, wait, stolen, user (% & time), util, vcpus | Freq, usage - idle, system, wait, user, util and vcpus. | Same as ceilometer or monasca | user, system, iowait, idle in (% and time). average-load | idle, system, wait, user, nice. | idle, system, wait, user, nice, stolen, irq | idle, system, wait, user, nice, stolen, irq | ||||||||||||||||||||||||||
Disk IO metrics | Read and write (bytes, rate, time, sectors) disk-free | read and write (bytes, rate, req) | read and write (bytes, rate, req) | read and write (ops, octets, merged, time) disk-free | read and write (bytes, rate, req) | Read and write (bytes, rate, time, sectors) | read and write (bytes, rate, req) | Same as ceilometer or monasca | read and write (ops, octets, merged, time) disk-free | read and write (bytes, rate, req) | read and write (merged, sector, time, req) io- reqs, time, weighted | read and write (count, time and bytes) | ||||||||||||||||||||||||||
Memory metrics | free, swap, total, used (bytes and percentages) | usage, bandwidth | free, swap, total, used | free, available, total, used. | free, swap, total, used | free, swap, total, used (MB and percentages) | free, swap, total, used, slab. | Same as ceilometer or monasca | free, available, total, used. (bytes, %ges) | free, total, swap, active, dirty, inactive, buffers. | free, used, (bytes and %ges) actual-used. | free, used, (bytes and %ges) actual-used. | ||||||||||||||||||||||||||
Process metrics | I/O, memory, CPU-Usage, read-write (bytes and count) | NO | NO | I/O, memory, CPU-Usage, (bytes and count). | Same as collectd. | status, thread-count, uptime. IO, memory, cpu-usage. connections. | Cpu and memory, read-write (bytes, count), and various other fields | Cpu and memory, read-write (bytes, count) | CPU, memory, uptime, | btime, ctxt, processes, blocked, running | I/O, memory, CPU-Usage, read-write (bytes and count) | I/O, memory, CPU-Usage, read-write (bytes and count) | ||||||||||||||||||||||||||
Network Interface Metrics | Interface plugin: Standard 4 fields of rx/tx (octets, packets, errors, dropped). Netlink plugin: uses netlink sockets and covers others | Standard 4 fields of rx/tx (octets, packets, errors, dropped). | Standard 4 fields of rx/tx (octets, packets, errors, dropped). | sent and recv : bytes, compressed, drops, errors, fifo, frame, multicast, packets | Standard 4 fields of rx/tx (octets, packets, errors, dropped). | Standard 4 fields of rx/tx (octets, packets, errors, dropped). Also includes fifo, compressed, and frame stats. | rx/tx (octets, packets, errors, dropped). | Same as ceilometer or monasca | rx/tx (octets, packets, errors, dropped). SNMP (3) | Rx and Tx. MBs | Standard 4 fields of rx/tx (octets, packets, errors, dropped) | Standard 4 fields of rx/tx (octets, packets, errors, dropped). | ||||||||||||||||||||||||||
Libvirt Metrics | YES - | YES | YES | YES | YES | NO | NO | NO | YES | YES | NO | NO | ||||||||||||||||||||||||||
Container resource usage Monitoring (memory, restarts, status, uptime, etc) | YES | NO | NO | Docker | Docker | Docker | NO | Docker | YES (Docker, LXC) | Docker | YES (Docker) | YES (4) | ||||||||||||||||||||||||||
Databases Monitoring : [Influxdb, MongoDb, MySql, PostgreSql, Carbon(graphite), Prometheus, RRDCache,Redis, TSDB] | YES for all | MySql, PostgreSql, MongoDb | Influxdb, Vertica, MySql, PostgreSql, Cassandra | Influxdb, mysql, mongodb, Cassandra | ALL (4) | All | NO | All. | YES for all | MongoDb, mysql, postgresql, and Redis | YES for all | YES for all (4) | ||||||||||||||||||||||||||
Publish metrics to databases - (influxdb, mysql, TSDB, Postgresql, MongoDb, Carbon, Elasticsearch) | YES for all | NO | NO | YES for all. | NO | NO (1) | NO | Yes for all | NO | Yes for All | YES for all. | YES (4) | ||||||||||||||||||||||||||
Encryption Support | YES | NO | NO | YES | NO | NO | NO | NO | YES | YES | YES | YES | ||||||||||||||||||||||||||
Language (written) | C | Python | Python | Go | Go | Ruby | Perl | Go | perl, shell, c, (varies) | Python | Varies - ruby, c, c++, etc. | Go | ||||||||||||||||||||||||||
Extensibility - multilanguage support [Python, Java, Golang, C/C++, Lua] | YES for all | Java | Java | Python, C++ | Java, Python, Ruby | Go, Python. | Python, Ruby | None. | Perl, shell, C. | None | Multiple | NO? | ||||||||||||||||||||||||||
Interoperability [with other monitoring solutions] | Sensu, statsd, telegraf? | Nagios, Zabbix | ceilometer | Ceilometer, Facter, Riemann, Prometheus | Collectd | Nagios, Zabbix. | NO | Riemann | NSClient, Icinga. | Nagios | Collectd | Collectd? | ||||||||||||||||||||||||||
Write to Message Queues and protocols (AMQP, Kafka, MQTT, NSQ) | YES for ALL | AMQP | Kafka | AMQP, Kafka. | NO | AMQP | NO | kafka, MQTT, NSQ | NO | Yes for ALL | YES for all | YES for all (4) | ||||||||||||||||||||||||||
Metrics Pub/sub Mode Support (Metrics push/pull mode support ?) | YES | YES | YES | YES | YES | YES | NO | YES | NO | YES | YES | YES | ||||||||||||||||||||||||||
Metrics Req/Resp Mode Support | NO | NO | NO | YES | NO | YES | YES | NO | YES | NO | YES | YES | ||||||||||||||||||||||||||
Support for Events (polling, Pushing) | Yes | NO (1) | NO (1) | NO | NO | YES | NO | YES | YES | NO | YES | YES | ||||||||||||||||||||||||||
Notification Support | YES | NO (1) | NO (1) | NO | NO (1) | YES | NO | NO | YES | NO | YES | YES | ||||||||||||||||||||||||||
Logging Support | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | ||||||||||||||||||||||||
Hypervisor metrics | YES | NO | NO | YES (KVM) | YES | YES (XenTop) | NO | NO | YES | XEN, KVM. | NO | NO | ||||||||||||||||||||||||||
Log-File Analysis | YES | NO | NO | YES | YES (mtail) | NO | NO | YES | YES | NO | YES | YES | ||||||||||||||||||||||||||
Other Writing (output) Support: [CSV, HTTP, RRD, UnixSocket, Multicast] | ALL that are listed. | NO | NO | NO | HTTP | NO | RRD | Socket, | NO | HTTP | NO | YES? | ||||||||||||||||||||||||||
Transport Protocol | Depends on the end point it's communicating with. | TCP* | TCP* | TCP | TCP, UDP. (5) | TCP | TCP | TCP, UDP | TCP | TCP | TCP, UDP | TCP, UDP | ||||||||||||||||||||||||||
Data-Format [XML, JSON, etc] | JSON, Custom, XML | JSON XML | JSON | JSON | JSON ? | JSON | Custom | Custom | Custom | JSON | Custom | Custom, JSON | ||||||||||||||||||||||||||
Data-model | Custom | KVP | KVP | KVP | KVP | KVP | Custom | Custom | Custom | KVP | KVP | KVP | ||||||||||||||||||||||||||
Hardware: IPMI, Battery, Sensors | YES for all | IPMI | IPMI | IPMI | YES for all | YES - IPMI | YES (3) | IPMI sensors | YES | NO | NO? | YES for all | ||||||||||||||||||||||||||
Metric Types: Gauge, Derive, Counter, Absolute | YES for all | Gauge, cumulative, delta | Gauge, rate, counter. | gauge, derive, counter. | Gauge, Counter, Histogram, summary | Gauge, Counter, derive. | Gauge, Counter, derive. | Gauge, Counter. | Gauge, Derivative, delta | Gauge, sum, counter, derive | Gauge, sum, counter, derive | |||||||||||||||||||||||||||
Last-Updated | 2017 | 2017 | 2017 | Varies(5) | Varies (5) | Varies (5) | Varies (5) | 2017 | varies(5) | Varies (5) | Varies(5) | Varies(5) | ||||||||||||||||||||||||||
Commercial Versions? | NO | NO | ? | NO | NO | YES | NO | NO | YES | YES? | YES? | YES? | ||||||||||||||||||||||||||
Run-Time Analysis [^]: resource consumption by the agent | CPU: 14.8% | CPU: 17.5% | ||||||||||||||||||||||||||||||||||||
License | MIT/GPL v2 or later | Apache License, Version 2.0 | Apache License, Version 2.0 | Apache License, Version 2.0 | Multiple (5) | MIT | GPL V2. | MIT | GPL V3 | MIT | MIT | Apache License, Version 2.0 | ||||||||||||||||||||||||||
Webserver monitoring [Nginx, Apache] | YES for all | Apache | Apache | YES for all. | Nginx, Apache, Passenger, Varnish | Apache, Nginx, Unicorn. | NO | Yes for all | YES for all | NO | YES for all. | Yes for all | ||||||||||||||||||||||||||
Platforms - OS? Linux (unix'es), Windows. | Supports windows, linux, freebsd, etc. | Linux | Linux | Linux, MAC, Windows (soon) | Linux Windows(3) | Linux, Windows, | Linux, Windows | Linux | ALL | Linux | ALL | ALL | ||||||||||||||||||||||||||
Configuration Tool support [Puppet, Chef, Ansible, Salt] | YES for all | Puppet Chef | Puppet, Chef, Ansible, | Yes for all. | Yes for all. | YES for all | NO | Yes for All. | Yes for all | Puppet | ALL | ALL | ||||||||||||||||||||||||||
Deployments: servers, VMs, containers | ALL | ALL | ALL | ALL | ALL | ALL. | ALL | All | ALL | ALL | ALL | ALL | ||||||||||||||||||||||||||
Openstack Modules | NO (2) | NO | ALL. | CEPH, Cinder, Glance, Keystone, Neutron, Nova | NO | NO | NO | NO | YES (All) | NO | NO | NO | ||||||||||||||||||||||||||
Intel PCM and SSDs SMART metrics | NO | NO | NO | YES | NO | NO | NO | NO | NO | NO | NO | NO | ||||||||||||||||||||||||||
Cluster Mgmt. (Kubernetes, Mesos, Swarm) | NO | NO | NO | Kubernetes and Mesos | Kubernetes and mesos | Kubernetes and mesos | NO | Kubernetes and Mesos | YES | NO | YES | YES | ||||||||||||||||||||||||||
Modifiers - (filtering, threshold, tags, contexts) | Filtering and threshold - yes. Tags - YES. Contexts - No. (1) | NO | YES | YES for all. | Tags, Filtering and threshold. | NO (1) | NO | Tagging | YES | Tags | YES | YES | ||||||||||||||||||||||||||
Dynamic Loading of plugins. | NO | NO | NO | YES | YES | YES. | YES? | NO | YES | NO | YES | YES | ||||||||||||||||||||||||||
Intervals: Lowest Sampling Interval (LSI) - how frequently the plugins can read values from the source(s) of truth. Network Transmit Interval (NTI) - interval for transmitting over the network. | LSI: can go down to nanosecond resolution. NTI: cannot be specified; depends on the size of the buffer and the reading interval. | ||||||||||||||||||||||||||||||||||||
Other Services monitoring: (DHCP, DNS, FTP, NTP, HAProxy, Consul) | HAProxy, DNS, NTP | NO | HAProxy, NTP. | HAProxy | DHCP, HAProxy, NTP, Consul. | YES for all. | NO | HAProxy, NTP, Consul, DNS | YES | NO | YES (4) | YES (4) |
Legends
...
(1) This aspect is realized either as a server-side component or by a 'customized' agent.
...
(5) A single value cannot be entered because logically independent modules are developed by different community groups.
[^]: Runtime analysis process and considerations:
- Isolate the CPUs on the monitoring node. [Add the isolcpus option in GRUB: CPU0]
- Run the agent on the isolated CPU (CPU0). [Use the taskset command to run the agent processes with the appropriate CPU mask: 0x01]
- Plugins: Configure the agent to monitor the following metrics: CPU, memory, disk, interface, IPMI, processes, libvirt, caches, OVS, hugepages.
- Output: Make the agent send metrics over the network (e.g., to InfluxDB running on a separate node).
- Workload: stress-ng + iperf.
- Monitoring duration: 5 minutes.
- Frequency: 1 second (sampling interval).
- Collect metrics (using any other tool) to analyze the agent's runtime performance. [Ex: Snap was used to collect 'collectd-process' metrics along with CPU and memory data]
- Note the iperf performance (to study any effect collectd has on it).
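
The CPU-pinning step of the procedure above can be sketched as follows. This is only an illustration, not the script used for the measurements: the helper names `cpu_mask` and `pinned_command` are hypothetical, and the agent command line (`collectd -f`) is an assumed placeholder to be replaced with the actual agent under test. The sketch derives the affinity mask that `taskset` expects from a list of CPU indices and prefixes the agent command with it.

```python
import shlex


def cpu_mask(cpus):
    """Return the hexadecimal affinity mask for the given CPU indices."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << cpu  # bit N set means CPU N is allowed
    return hex(mask)


def pinned_command(agent_cmd, cpus):
    """Prefix an agent command with taskset so it runs only on the given CPUs."""
    return ["taskset", cpu_mask(cpus)] + shlex.split(agent_cmd)


# Assumed placeholder; substitute the real command line of the agent under test.
AGENT_CMD = "collectd -f"

if __name__ == "__main__":
    # Build the command that pins the agent to the isolated CPU0.
    print(" ".join(pinned_command(AGENT_CMD, [0])))
```

The resulting list can be passed to `subprocess.run` on the monitoring node. Building the mask from CPU indices rather than hard-coding it keeps the sketch usable if a different CPU is isolated.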
Inference Questions
...