Anuket Project

2021-06-18 AI/ML for NFV Meeting Minutes

Attendees

Sridhar Rao

Rohit Singh Rathaur

Girish

Al Morton


Summary to Date:

  1. Survey is completed.
  2. Testbed is assigned - Pod12-Jump
  3. Framework : Acumos (too many issues).
  4. Problem Domain - Failure Prediction
  5. Clear Definition of Failure Prediction - Ongoing.
  6. Existing FP models - ARIMA or RNN - used for deployment and testing.
  7. Enhancement to Existing works on FP - Not yet started
  8. Data Gathering: (Important*)
    1. Publicly Available: Searching...
    2. Collecting from existing testbeds: WIP


Sl. No. | Topic | Presenter | Notes
1. Framework Deployment Status

Acumos - Container/K8S based approach.

Vanilla deployment - Failed to deploy with both approaches (with and without cluster deployment).

  1. Work on Acumos on Pod18 - Existing Cluster - Girish
  2. Work on Other framework on Pod12-Jump - Rohit. Decision on 'other' framework by EoW.
2. Survey - Implementation details - Status

Completed 

https://docs.google.com/spreadsheets/d/15XRdrWvbSCPsg1zZ9PfT9yvnElq21AvB/edit#gid=971676644

3. Model Deployment Status

Waiting for the Framework to be UP - to run on the testbed.

Currently running locally - Google Colab (Jupyter notebooks).

Data: CPU consumption.

Failure: VM.
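
As a rough illustration of the setup noted above (CPU consumption as input, VM failure as the target), the sketch below fits a least-squares linear trend to recent CPU samples and flags a predicted failure when the extrapolated value reaches a threshold. It is a deliberately simple stand-in for the ARIMA or RNN models, not the team's model. The function names, the 95% threshold, and the sample values are all hypothetical.

```python
from statistics import mean


def fit_linear_trend(samples):
    """Least-squares fit of y = slope * t + intercept over t = 0..n-1."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    t_mean = (n - 1) / 2
    y_mean = mean(samples)
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(samples))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    intercept = y_mean - slope * t_mean
    return slope, intercept


def predict_failure(samples, horizon, threshold=95.0):
    """Extrapolate CPU usage `horizon` steps past the last sample.

    Returns (forecast, flag); flag is True when the forecast reaches
    the (assumed) failure threshold.
    """
    slope, intercept = fit_linear_trend(samples)
    forecast = slope * (len(samples) - 1 + horizon) + intercept
    return forecast, forecast >= threshold


# Hypothetical CPU utilisation samples (percent), one per interval.
cpu_percent = [40.0, 50.0, 60.0, 70.0, 80.0]
forecast, at_risk = predict_failure(cpu_percent, horizon=2)
```

A real model would need labelled failure events to validate the threshold; here it is only an assumption.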

4. Publicly Available Data

To be added by Girish/Rohit:

5. Failure Prediction Definition - Status

Existing works:

  1. Mostly VM and Application Failures.
  2. Failure types - Crash and Connectivity

Gaps:

  1. Hardware, Containers
  2. Other failure types aren't considered

How to collect Data:

Take advantage of chaos engineering projects - Litmus, Pumba, Blockade, etc.
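
One way fault injection might feed the failure-prediction dataset is to record when each fault was injected and then label the telemetry samples that fall inside those windows. The sketch below assumes a hypothetical record format: timestamps in seconds and (start, end, fault_type) windows. It does not call Litmus, Pumba, or Blockade, and makes no claim about their output formats.

```python
def label_samples(samples, fault_windows):
    """Attach a fault label to each (timestamp, value) sample.

    fault_windows is a list of (start, end, fault_type) tuples with the
    end exclusive. Samples outside every window are labeled "normal".
    """
    labeled = []
    for ts, value in samples:
        label = "normal"
        for start, end, fault_type in fault_windows:
            if start <= ts < end:
                label = fault_type
                break
        labeled.append((ts, value, label))
    return labeled


# Hypothetical data: CPU % sampled every 10 s, with one injected
# container kill covering t = 20 s up to (not including) t = 40 s.
samples = [(0, 35.0), (10, 36.5), (20, 0.0), (30, 0.0), (40, 34.0)]
windows = [(20, 40, "container_kill")]
labeled = label_samples(samples, windows)
```

The labeled tuples could then serve as training rows for whichever model is chosen.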