This page contains an analysis of the test cases listed in the CNCF CNF Testsuite to determine whether RA2 should contain related workload requirements.

...

Test that the CNF does not crash when disk fill occurs)


Issues raised to CNCF CNF Testsuite during this work

The analysis


Test | Note | Verdict
To test the increasing and decreasing of capacity

Do we require horizontal scaling from all CNFs?

Most (data plane, signalling, etc.), but not all (e.g. OSS).

Should be optional, or should fail only if a CNF that does scale scales incorrectly.
Test if the Helm chart is published
At the moment RA2 does not mandate the usage of Helm.
We should first decide on CNF packaging. RA2 can stay neutral, follow the O-RAN/ONAP ASD path, or propose its own solution.
Should be fine: there are no Helm specifications in RA2 today, unless some incompatible CNF packaging specification is adopted (unlikely).


Test if the Helm chart is valid
At the moment RA2 does not mandate the usage of Helm.
Test if the Helm deploys
At the moment RA2 does not mandate the usage of Helm.
This should be more generic, like testing if the CNF deploys.
Test if the install script uses Helm v3
At the moment RA2 does not mandate the usage of Helm.
To test if the CNF can perform a rolling update
Some CNFs perform a rolling update without keeping the service alive (because they require some post-configuration), so the test should make sure that there is service continuity. This might just be a health probe, a test of the Kubernetes Service, or something similarly straightforward. In other words, CNF service/traffic should work during the whole process (before, during and after a rolling upgrade).
Needed
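The service-continuity idea in the note above could be sketched roughly as follows. This is only an illustration: `service_stays_up` and the `probe` callable are hypothetical names, and the real check (health probe, request against the Kubernetes Service, etc.) is left to whoever supplies `probe`.

```python
import time


def service_stays_up(probe, checks, interval_s=1.0):
    """Call probe() repeatedly while a rolling update is in progress.

    probe is a hypothetical callable returning True when the CNF service
    answers correctly. Returns False on the first failed probe, i.e. as
    soon as service continuity is broken.
    """
    for _ in range(checks):
        if not probe():
            return False
        time.sleep(interval_s)
    return True
```

The polling would need to start before the update is triggered and end after it completes, so that all three phases (before, during, after) are covered.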
To check if a CNF version can be downgraded through a rolling_version_change

It is not clear what the difference is between a rolling upgrade/downgrade and a rolling version change.

Maybe when you request an arbitrary version?


To check if a CNF version can be downgraded through a rolling_downgrade
Same as above?
Needed
To check if a CNF version can be rolled back rollback
It is not clear what the difference is between a rolling downgrade and a rollback.
To check if the CNF is compatible with different CNIs

This covers only the default CNI; it does not cover the meta-plugin part.

Need additional tests for cases with multiple interfaces.

Ok but needs additional tests for multiple interfaces
(PoC) To check if a CNF uses Kubernetes alpha APIs
Alpha APIs are not recommended by ra2.k8s.012. It is not clear what the OK criterion of this test is.
Ok if fails with alpha
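One possible reading of the "Ok if fails with alpha" verdict is sketched below. It assumes the check is done on the `apiVersion` field of each manifest; the function names are hypothetical and this is not how the CNF Testsuite necessarily implements the test.

```python
import re

# Kubernetes API versions look like "v1", "v1beta1" or "v1alpha1",
# optionally prefixed by an API group, e.g. "apps/v1".
ALPHA_VERSION = re.compile(r"^(?:[^/]+/)?v\d+alpha\d+$")


def uses_alpha_api(manifest):
    """Return True if the manifest's apiVersion is an alpha version."""
    return bool(ALPHA_VERSION.match(manifest.get("apiVersion", "")))


def alpha_api_test_passes(manifests):
    """Assumed pass criterion: no manifest of the CNF uses an alpha API."""
    return not any(uses_alpha_api(m) for m in manifests)
```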
To check if the CNF has a reasonable image size
It passes if the image size is smaller than 5GB.
Ok but should be documented or configurable?
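A configurable variant of this check could look roughly like the sketch below. The names are hypothetical, and treating "5GB" as 5 * 10^9 bytes (rather than 5 GiB) is an assumption.

```python
# Assumption: "5GB" is read as 5 * 10**9 bytes; the test suite may mean GiB.
DEFAULT_MAX_IMAGE_BYTES = 5 * 10**9


def image_size_ok(size_bytes, max_bytes=DEFAULT_MAX_IMAGE_BYTES):
    """Pass if the image is strictly smaller than the (configurable) limit."""
    return size_bytes < max_bytes
```

With a parameter like `max_bytes`, the 5GB default can be documented once and overridden per CNF where a larger image is justified.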
To check if the CNF has a reasonable startup time
It is not clear what a reasonable startup time is.
Ok but should be documented or configurable?
To check if the CNF has multiple process types within one container
Containers in the CNF should have only one process type.
What's the rationale?

To check if the CNF exposes any of its containers as a service

Which Service type?

RA2 mandates that clusters must support LoadBalancer and ClusterIP, and should support NodePort and ExternalName.

Should there be a test for the CNF to use Ingress or Gateway objects as well?

May need tweaking to add Ingress?
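If the test were to report the Service type, it could be mapped onto the RA2 support levels mentioned in the note. The following is a minimal sketch with hypothetical names; it only relies on the fact that Kubernetes defaults `spec.type` to ClusterIP when the field is omitted.

```python
# Support levels as stated in the note above.
MUST_SUPPORT = {"LoadBalancer", "ClusterIP"}
SHOULD_SUPPORT = {"NodePort", "ExternalName"}


def service_support_level(service):
    """Classify a Service manifest by the RA2 support level of its type."""
    # Kubernetes uses ClusterIP when spec.type is not set.
    svc_type = service.get("spec", {}).get("type", "ClusterIP")
    if svc_type in MUST_SUPPORT:
        return "must"
    if svc_type in SHOULD_SUPPORT:
        return "should"
    return "unknown"
```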

To check if the CNF has multiple microservices that share a database

Clarify rationale? In some cases it is good for multiple microservices to share a DB, e.g. when restoring the state of a transaction from a failed service.

Also good to have a shared DB across multiple services for things like HSS etc.

Clarify
Test if the CNF crashes when node drain and rescheduling occurs. All configuration should be stateless

The CNF should react gracefully (no loss of context/sessions/data/logs & service continues to run) to eviction and node draining.

The statelessness test should be made independent and should be skipped for stateful pods, e.g. DNS.

Needed - but replace "crash" with "react gracefully" (no loss of context/sessions/data/logs & service continues to run)

Statelessness test should be separate

To test if the CNF uses a volume host path

Should pass if the CNF doesn't have a hostPath volume.

What's the rationale?


To test if the CNF uses local storage

Should fail if a local storage configuration is found.

What's the rationale?


To test if the CNF uses elastic volumes

Should pass if the CNF uses an elastic volume.

What's an elastic volume? Does this mean Ephemeral? Or is this an AWS-specific test?

There should be a definition of what an elastic volume is (besides ELASTIC_PROVISIONING_DRIVERS_REGEX)

What's an elastic volume? Does this mean Ephemeral? Or is this an AWS-specific test?

To test if the CNF uses a database with either statefulsets, elastic volumes, or both

A database may use statefulsets along with elastic volumes to achieve a high level of resiliency. Any database in K8s should at least use elastic volumes to achieve a minimum level of resilience, regardless of whether a statefulset is used. Using statefulsets without elastic volumes is not recommended, especially if the database explicitly uses local storage. The least optimal storage configuration for a database managed by K8s is local storage and no statefulsets, as this is not tolerant to node failure.

There should be a definition of what an elastic volume is (besides ELASTIC_PROVISIONING_DRIVERS_REGEX)

What's an elastic volume? Does this mean Ephemeral? Or is this an AWS-specific test?
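The four storage configurations described in the note can be summarised as a small decision function. This is only a sketch of the ranking as worded above: the function name and the returned labels are hypothetical, and "no elastic volume" is assumed to mean local storage.

```python
def database_storage_rating(uses_statefulset, uses_elastic_volume):
    """Rank a database storage setup following the note's ordering."""
    if uses_statefulset and uses_elastic_volume:
        return "high resilience"
    if uses_elastic_volume:
        return "minimum resilience"
    if uses_statefulset:
        # Statefulset without elastic volumes (assumed: local storage).
        return "not recommended"
    # Local storage and no statefulset: not tolerant to node failure.
    return "least optimal"
```

Such a ranking still depends on an agreed definition of "elastic volume", which is the open question raised for this test.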

Test if the CNF crashes when network latency occurs

How is this tested? Where is the test running? Some traffic against a service? Latency should be configurable (default is 2s)?

What should happen if latency is exceeded? Should this be more stringent than "not crashing"?

What is the expectation?

Needed but needs clarification
Test if the CNF crashes when disk fill occurs

Needed
Test if the CNF crashes when pod delete occurs

Needed
Test if the CNF crashes when pod memory hog occurs


Test if the CNF crashes when pod io stress occurs

Needed
Test if the CNF crashes when pod network corruption occurs
It is not clear what network corruption is in this context.
Test if the CNF crashes when pod network duplication occurs
It is not clear what network duplication is in this context.
To test if there is a liveness entry in the Helm chart
Liveness probe should be mandatory, but RA2 does not mandate Helm at the moment.
To test if there is a readiness entry in the Helm chart
Readiness probe should be mandatory, but RA2 does not mandate Helm at the moment.
To check if logs are being sent to stdout/stderr


To check if prometheus is installed and configured for the cnf
There is a chapter for Additional required components (4.10), but without any content.
To check if logs and data are being routed through fluentd
There is a chapter for Additional required components (4.10), but without any content.
To check if Open Metrics is being used and/or compatible.
There is a chapter for Additional required components (4.10), but without any content.
To check if tracing is being used with Jaeger
There is a chapter for Additional required components (4.10), but without any content.
To check if a CNF is using container socket mounts
Make sure not to mount /var/run/docker.sock, /var/run/containerd.sock or /var/run/crio.sock in the containers.
Needed
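A check along these lines could inspect the `hostPath` volumes of a pod spec, as in the sketch below. The function name is hypothetical, and the sketch only catches the three exact socket paths named above; mounting a parent directory such as /var/run would not be detected.

```python
RUNTIME_SOCKETS = {
    "/var/run/docker.sock",
    "/var/run/containerd.sock",
    "/var/run/crio.sock",
}


def mounted_runtime_sockets(pod_spec):
    """Return the container runtime sockets exposed via hostPath volumes."""
    found = []
    for volume in pod_spec.get("volumes", []):
        # hostPath may be absent or null for other volume types.
        path = (volume.get("hostPath") or {}).get("path")
        if path in RUNTIME_SOCKETS:
            found.append(path)
    return found
```

An empty result would correspond to a pass for this test.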
To check if containers are using any tiller images


To check if any containers are running in privileged mode


To check if a CNF is running services with external IPs


To check if any containers are running as a root user


To check if any containers allow for privilege escalation


To check if an attacker can use a symlink for arbitrary host file system access


To check if there are service accounts that are automatically mapped


To check if there is a host network attached to a pod


To check if there are service accounts that are automatically mapped


To check if there is an ingress and egress policy defined


To check if there are any privileged containers


To check for insecure capabilities


To check for dangerous capabilities


To check if namespaces have network policies defined


To check if containers are running with non-root user with non-root membership


To check if containers are running with hostPID or hostIPC privileges


To check if security services are being used to harden containers


To check if containers have resource limits defined


To check if containers have immutable file systems


To check if containers have hostPath mounts


To check if containers are using labels


To test if there are versioned tags on all images using OPA Gatekeeper


To test if there are any (non-declarative) hardcoded IP addresses or subnet masks


To test if there are node ports used in the service configuration


To test if there are host ports used in the service configuration


To test if there are any (non-declarative) hardcoded IP addresses or subnet masks in the K8s runtime configuration


To check if a CNF version uses immutable configmaps


Test if the CNF crashes when pod dns error occurs






...