2021-06-18 AI/ML for NFV Meeting Minutes

Anuket Project

Attendees

@Sridhar Rao

@Rohit Singh Rathaur

@Girish

@Al Morton

 

Summary to date:

  1. Survey is completed.

  2. Testbed is assigned - Pod12-Jump

  3. Framework: Acumos (too many issues).

  4. Problem Domain - Failure Prediction

  5. Clear Definition of Failure Prediction - Ongoing.

  6. Existing models for FP - ARIMA or RNN - used for deployment and testing.

  7. Enhancement to Existing works on FP - Not yet started

  8. Data Gathering: (Important*)

    1. Publicly Available: Searching...

    2. Collecting from existing testbeds: WIP

 

Sl. No.

Topic

Presenter

Notes

1

Framework Deployment Status

@Girish

@Rohit Singh Rathaur

Acumos - Container/K8S based approach.

Vanilla deployment - Failed to deploy with both approaches (with and without cluster deployment).

  1. Work on Acumos on Pod18 - Existing Cluster - Girish

  2. Work on another framework on Pod12-Jump - Rohit. Decision on the 'other' framework by end of week (EoW).

2

Survey - Implementation details - Status

@Girish

@Rohit Singh Rathaur

Completed 

https://docs.google.com/spreadsheets/d/15XRdrWvbSCPsg1zZ9PfT9yvnElq21AvB/edit#gid=971676644

3

Model Deployment Status

@Girish

@Rohit Singh Rathaur

Waiting for the framework to be up before running on the testbed.

Currently running locally - Google Colab (Jupyter notebooks).

Data: CPU consumption.

Failure: VM.
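
For illustration only, the kind of experiment described above (forecast CPU consumption, then flag a likely VM failure) could be sketched as follows. This is not the team's notebook: it swaps in a plain first-order autoregressive fit, AR(1), as a simplified stand-in for the ARIMA/RNN models mentioned in the summary. It also assumes, hypothetically, that a VM "failure" means forecast CPU reaching a saturation threshold. The function names, the 95% threshold, and the sample values are all made up for the sketch.

```python
from statistics import mean


def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1] (AR(1), a stand-in for ARIMA/RNN)."""
    if len(series) < 2:
        raise ValueError("need at least two samples")
    x_prev, x_next = series[:-1], series[1:]
    mean_prev, mean_next = mean(x_prev), mean(x_next)
    var = sum((p - mean_prev) ** 2 for p in x_prev)
    if var == 0:
        # Constant history: forecast the observed level.
        return float(mean_next), 0.0
    cov = sum((p - mean_prev) * (n - mean_next) for p, n in zip(x_prev, x_next))
    phi = cov / var
    c = mean_next - phi * mean_prev
    return float(c), float(phi)


def forecast(series, steps):
    """Iterate the fitted AR(1) recurrence `steps` samples past the end of `series`."""
    c, phi = fit_ar1(series)
    last = series[-1]
    out = []
    for _ in range(steps):
        last = c + phi * last
        out.append(last)
    return out


def predicts_failure(series, steps=5, threshold=95.0):
    """Assumed failure definition: forecast CPU % reaches `threshold` within `steps`."""
    return any(value >= threshold for value in forecast(series, steps))


# Hypothetical CPU-consumption samples (percent), not measured data.
rising_cpu = [50, 55, 60, 65, 70, 75, 80, 85]
idle_cpu = [20.0] * 8
print(predicts_failure(rising_cpu), predicts_failure(idle_cpu))
```

A threshold on forecast CPU is only one possible failure definition. The definition of failure prediction is still listed as ongoing, so the labelling rule here would need to change once that is settled.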

4

Publicly Available Data

@Girish

@Rohit Singh Rathaur

To be added by Girish/Rohit:

5

Failure Prediction Definition - Status

@Girish

@Rohit Singh Rathaur

Existing works:

  1. Mostly VM and Application Failures.

  2. Failure - Crash and Connectivity

Gaps:

  1. Hardware, Containers

  2. Other failure types aren't considered

How to collect Data:

Take advantage of chaos engineering projects - Litmus, Pumba, Blockade, etc.