Introduction
What is the purpose of this paper? Who is the audience? What should the reader get from reading?
Network Function Virtualization (NFV) is the process of moving from the traditional, vertically integrated Telco stack built on dedicated hardware to
an infrastructure based on generic hardware and composed of diverse software elements provided by different vendors. A major goal of NFV is to be able to
integrate hardware and software from multiple sources yet achieve the reliability, availability and performance of the traditional approach.
Vertically integrated systems achieve reliability and availability through well-coordinated monitoring and fault-recovery processes. A challenge for
NFV is to provide an equivalent level of reliability and availability while tying together hardware and software that have been developed separately. The
Network Function Virtualization Infrastructure (NFVI), which consists of hardware from multiple sources, must be monitored as effectively as traditional systems are,
and that monitoring information must be delivered to the Virtualized Infrastructure Manager (VIM) with sufficient quality and low enough latency that traditional levels of availability
and reliability can be achieved.
....
NFVI/VIM and NFV
A succinct definition of the NFVI/VIM (in the initial paper, the VIM is based on OpenStack)
What is important in the NFVI/VIM?
General concepts that matter for maintaining a healthy NFVI/VIM. Not just compute; networks are key. Services? Correlation across metrics, nodes, logs.
What is 50ms and is it important?
Where did the 50ms requirement come from and is it still relevant? Correlation across nodes...
Monitoring Specifics
Polling frequencies
1 second, 15 seconds, etc… A mix of polling frequencies. Too much data?
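The "too much data?" question can be approached with a back-of-envelope estimate. The sketch below is only illustrative: the node count, metric counts, polling intervals, and bytes per sample are hypothetical placeholders, not measurements from any deployment.

```python
# Rough estimate of daily monitoring data volume for mixed polling intervals.
# Every figure used here is a hypothetical placeholder, not a measurement.

SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_day(nodes: int, metrics_per_node: int, interval_s: float) -> int:
    """Number of samples one polling tier produces per day."""
    return int(nodes * metrics_per_node * SECONDS_PER_DAY / interval_s)


def daily_volume_bytes(tiers, nodes: int, bytes_per_sample: int) -> int:
    """Total bytes per day across tiers given as (metrics_per_node, interval_s)."""
    return sum(
        samples_per_day(nodes, metrics, interval) * bytes_per_sample
        for metrics, interval in tiers
    )


# Assumed deployment: 100 nodes, 20 metrics polled every 1 s,
# 200 metrics polled every 15 s, 16 bytes per stored sample.
TIERS = [(20, 1), (200, 15)]
total = daily_volume_bytes(TIERS, nodes=100, bytes_per_sample=16)
print(f"{total / 1e9:.2f} GB/day")
```

Under these assumed numbers the small set of 1-second metrics produces more samples than the ten-times-larger set polled every 15 seconds, which is one reason to reserve the fastest interval for the few metrics that need it.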
What to Monitor?
Metrics
Specific metrics to monitor (CPU, ...)
Metrics, plugins and TST008
Events
Specific infrastructure events to monitor (interface link status, etc...)
Services
Specific VIM (OpenStack, SDN, etc.) services to monitor
How to Monitor
Getting the Metrics (collectd)
Getting the Events (collectd + logs + others)
An Example Setup
Monitoring setup, tools, etc…