Anuket Project



Introduction:

Cross-NUMA tests run as part of the OPNFV Plugfest (Gambia), January 2019.

  1. VSPERF-Scenarios: P2P and PVP.
  2. Workloads: vSwitchd, PMDs and VNF.
  3. VNF: L2 Forwarding
  4. vSwitch: OVS and VPP.


Testcases Run:

Framesizes: 64, 128, 256, 512, 1024, 1280, 1518

  1. RFC2544 Throughput Test - NDR.
  2. Continuous Traffic Test - offered load at 100% of line rate.

Testbed:

 Intel POD12 

Node-4 (DUT), Node-5 (Software Traffic Generators) and H/W Traffic Generator.


CPU Topology on DUT

P2P Scenarios

Summary of P2P Scenarios:

Assumptions: Numa-0 (cores 0-21), Numa-1 (cores 22-43). vSwitch core #: 02.

Scenario  PMD cores (mask)    DUT ports            TGen (hardware) ports
1         4, 5 (0x30)         eno5, eno6           5, 6
2         22, 23 (0xC00000)   eno5, eno6           5, 6
3         4, 22 (0x400010)    eno5, eno6           5, 6
4         4, 5 (0x30)         eno5, ens801f2       5, 7
5         22, 23 (0xC00000)   eno5, ens801f2       5, 7
6         4, 22 (0x400010)    eno5, ens801f2       5, 7
7         4, 5 (0x30)         ens801f2, ens802f3   7, 8
8         22, 23 (0xC00000)   ens801f2, ens802f3   7, 8
9         4, 22 (0x400010)    ens801f2, ens802f3   7, 8
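
The PMD masks listed for the P2P scenarios follow from setting one bit per core ID. A minimal sketch of that arithmetic is shown here; the function names are illustrative and not part of VSPERF or OVS.

```python
def cores_to_mask(cores):
    """Return the CPU bitmask with one bit set per core ID."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask


def mask_to_cores(mask):
    """Return the sorted core IDs whose bits are set in the mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


# Core sets used in the P2P scenarios above.
for cores in ([4, 5], [22, 23], [4, 22]):
    print(cores, "->", f"{cores_to_mask(cores):#x}")
```

The reverse helper is handy when checking a mask taken from a configuration file against the intended core list.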

PVP Scenarios

Summary of PVP Scenarios:

Assumptions: Numa-0 (cores 0-21), Numa-1 (cores 22-43). vSwitch core #: 02.

Scenario  PMD cores (mask)             VNF cores  DUT ports            TGen (hardware) ports
1         4, 5, 6, 7 (0xF0)            8, 9       eno5, eno6           5, 6
2         4, 5, 6, 7 (0xF0)            22, 23     eno5, eno6           5, 6
3         4, 5, 6, 7 (0xF0)            8, 22      eno5, eno6           5, 6
4         4, 5, 22, 23 (0xC00030)      8, 9       eno5, ens801f2       5, 7
5         4, 5, 22, 23 (0xC00030)      24, 25     eno5, ens801f2       5, 7
6         4, 5, 22, 23 (0xC00030)      8, 24      eno5, ens801f2       5, 7
7         22, 23, 24, 25 (0x3C00000)   26, 27     ens801f2, ens802f3   7, 8
8         22, 23, 24, 25 (0x3C00000)   4, 5       ens801f2, ens802f3   7, 8
9         22, 23, 24, 25 (0x3C00000)   4, 26      ens801f2, ens802f3   7, 8


Results: P2P

RFC2544 Throughput Test Results


Continuous Throughput Test Results (Max Received Frame Rate at 100% of Line rate offered load)

Results: PVP

RFC2544 Throughput Test Results


Continuous Throughput Test Results (Max Received Frame Rate at 100% of Line rate offered load)

PVP Latency Results

Inferences

Theme: what is expected and what is unexpected.

P2P:

  1. Only the smaller frame sizes (64 and 128 bytes) matter. For frame sizes above 128 bytes, throughput remains similar.
  2. Scenarios 2 and 7 are the worst cases: both PMD cores run on a different NUMA node than the NICs. As expected, performance is consistently low in both.
  3. Scenarios 3 and 9 are the interesting cases. Here a single PMD core ends up serving both NICs, which results in poorer performance than Scenarios 2 and 7.
  4. Scenarios 1, 6 and 8 are the good cases: each NIC is served by its own PMD core.
  5. Scenarios 4 and 5, where one NIC is served by a PMD core on the same NUMA node and the other by a PMD core on a different NUMA node, are average cases. Their performance is lower than in Scenarios 1, 6 and 8, but not as low as in Scenarios 3, 9, 2 and 7.
  6. There is no difference in performance between the continuous and the RFC2544 throughput traffic tests.
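
The P2P grouping above can be reproduced with a simplified model of receive-queue placement: each NIC is given a PMD core on its own NUMA node when one exists, and otherwise falls back to any available PMD core. This is only a sketch. The NIC-to-NUMA placement (eno5 and eno6 on Numa-0, ens801f2 and ens802f3 on Numa-1) is inferred from the inferences rather than stated on this page, and the assignment rule is an approximation of how OVS distributes queues, not its actual implementation.

```python
NUMA0_CORES = range(0, 22)  # from the scenario tables: Numa-0 holds cores 0-21
# Assumed NIC placement, inferred from the inferences, not stated in the source.
NIC_NUMA = {"eno5": 0, "eno6": 0, "ens801f2": 1, "ens802f3": 1}

P2P_SCENARIOS = {
    1: ([4, 5], ["eno5", "eno6"]),
    2: ([22, 23], ["eno5", "eno6"]),
    3: ([4, 22], ["eno5", "eno6"]),
    4: ([4, 5], ["eno5", "ens801f2"]),
    5: ([22, 23], ["eno5", "ens801f2"]),
    6: ([4, 22], ["eno5", "ens801f2"]),
    7: ([4, 5], ["ens801f2", "ens802f3"]),
    8: ([22, 23], ["ens801f2", "ens802f3"]),
    9: ([4, 22], ["ens801f2", "ens802f3"]),
}


def numa_of(core):
    """Return the NUMA node of a core under the page's assumption."""
    return 0 if core in NUMA0_CORES else 1


def assign(pmds, nics):
    """Give each NIC one PMD core, preferring cores on the NIC's NUMA node."""
    used = {core: 0 for core in pmds}
    mapping = {}
    for nic in nics:
        local = [c for c in pmds if numa_of(c) == NIC_NUMA[nic]]
        pool = local or list(pmds)  # fall back to remote cores if none is local
        core = min(pool, key=lambda c: used[c])  # least-loaded core in the pool
        used[core] += 1
        mapping[nic] = core
    return mapping


def label(pmds, nics):
    """Classify a scenario by how its NICs end up being served."""
    mapping = assign(pmds, nics)
    if len(set(mapping.values())) < len(nics):
        return "one PMD serves both NICs"
    remote = sum(numa_of(core) != NIC_NUMA[nic] for nic, core in mapping.items())
    return ["all local", "one NIC remote", "both NICs remote"][remote]


for number, (pmds, nics) in P2P_SCENARIOS.items():
    print(number, label(pmds, nics))
```

Under these assumptions the model puts Scenarios 1, 6 and 8 in the all-local group, 4 and 5 in the one-remote group, 2 and 7 in the both-remote group, and 3 and 9 in the shared-PMD group, which matches the grouping described in the list.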


PVP:

Note: In these scenarios, there is always at least one PMD on each NUMA node that hosts a physical NIC in use. That is, the cases of P2P Scenarios 2 and 7 do not occur here.

  1. Continuous traffic results are more consistent across runs than the RFC2544 throughput results.
  2. The inconsistency across runs in the RFC2544 cases can be explained by the way the binary-search algorithm works. This supports the case for an adaptive RFC2544 binary-search algorithm in virtualized environments.
  3. Due to cross-NUMA traffic flow, Scenarios 2, 3 and 8 perform, as expected, worse than the other scenarios.
  4. When the NICs are spread across both NUMA nodes, with PMD cores present on each, performance is similar wherever the VNF cores are placed. Scenarios 4, 5 and 6 represent these cases. Among them, Scenario 6 is relatively poorer: its VNF cores are split across NUMA nodes, and it is likely that only one of them is used effectively.
  5. Scenarios 1, 7 and 9 are the best cases, with minimal or no cross-NUMA effects.
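
The sensitivity of the RFC2544 result to a single noisy trial can be illustrated with a plain binary search. This is a simplified sketch, not the search implemented by any particular traffic generator: measure_loss is a hypothetical callback that runs one trial at the given offered load (in percent of line rate) and returns the number of lost frames, and make_dut is an invented stand-in for a device under test.

```python
def rfc2544_binary_search(measure_loss, low=0.0, high=100.0, resolution=0.5):
    """Return the highest offered load (% of line rate) seen with zero loss."""
    best = low
    while high - low > resolution:
        rate = (low + high) / 2
        if measure_loss(rate) == 0:
            best = rate
            low = rate   # no loss: search the upper half
        else:
            high = rate  # loss: search the lower half and never look back
    return best


def make_dut(capacity, glitch_trial=None):
    """Hypothetical DUT: loses frames above capacity, plus one optional glitch."""
    state = {"trial": 0}

    def measure_loss(rate):
        state["trial"] += 1
        if state["trial"] == glitch_trial:
            return 1  # transient loss unrelated to the offered load
        return 0 if rate <= capacity else 1

    return measure_loss


steady = rfc2544_binary_search(make_dut(capacity=80.0))
glitchy = rfc2544_binary_search(make_dut(capacity=80.0, glitch_trial=1))
print(steady, glitchy)
```

Because a failed trial permanently discards the upper half of the range, one transient loss in the first trial keeps the reported rate below 50% even though the modelled capacity is 80%. A continuous test at a fixed offered load has no such dependence on trial order. An adaptive variant could, for example, repeat a failed trial before narrowing the range; that is one possible design, not something specified on this page.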


Generic:

  1. X-NUMA instantiation is a very realistic scenario. For more realism, a stressor load could be added to a few of the interesting scenarios. This might amplify the effects of X-NUMA deployment.


Observations

PVP scenarios: OVS PMD to interface (physical and virtual) mappings

Scenario    Mappings
1/2/3

pmd thread numa_id 0 core_id 4:
isolated : false
port: dpdkvhostuser1 queue-id: 0
pmd thread numa_id 0 core_id 5:
isolated : false
port: dpdk1 queue-id: 0
pmd thread numa_id 0 core_id 6:
isolated : false
port: dpdk0 queue-id: 0
pmd thread numa_id 0 core_id 7:
isolated : false
port: dpdkvhostuser0 queue-id: 0

4

pmd thread numa_id 0 core_id 4:
isolated : false
port: dpdkvhostuser1 queue-id: 0
pmd thread numa_id 0 core_id 5:
isolated : false
port: dpdk0 queue-id: 0
port: dpdkvhostuser0 queue-id: 0
pmd thread numa_id 1 core_id 22:
isolated : false

pmd thread numa_id 1 core_id 23:
isolated : false
port: dpdk1 queue-id: 0

5

pmd thread numa_id 0 core_id 4:
isolated : false
port: dpdk0 queue-id: 0
pmd thread numa_id 0 core_id 5:
isolated : false

pmd thread numa_id 1 core_id 22:
isolated : false
port: dpdkvhostuser1 queue-id: 0
pmd thread numa_id 1 core_id 23:
isolated : false
port: dpdk1 queue-id: 0
port: dpdkvhostuser0 queue-id: 0

6

pmd thread numa_id 0 core_id 4:
isolated : false
port: dpdkvhostuser1 queue-id: 0
pmd thread numa_id 0 core_id 5:
isolated : false
port: dpdk0 queue-id: 0
port: dpdkvhostuser0 queue-id: 0
pmd thread numa_id 1 core_id 22:
isolated : false

pmd thread numa_id 1 core_id 23:
isolated : false
port: dpdk1 queue-id: 0

7/8/9

pmd thread numa_id 1 core_id 22:
isolated : false
port: dpdkvhostuser1 queue-id: 0
pmd thread numa_id 1 core_id 23:
isolated : false
port: dpdk0 queue-id: 0
pmd thread numa_id 1 core_id 24:
isolated : false
port: dpdkvhostuser0 queue-id: 0
pmd thread numa_id 1 core_id 25:
isolated : false
port: dpdk1 queue-id: 0
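
The mappings above have the shape of the output of ovs-appctl dpif-netdev/pmd-rxq-show (the page does not name the command, so this is a presumption). A small parser such as the sketch below makes it easier to compare the mappings across scenarios and to spot idle PMD cores. The function name and the returned structure are illustrative.

```python
import re

PMD_RE = re.compile(r"pmd thread numa_id (\d+) core_id (\d+):")
PORT_RE = re.compile(r"port: (\S+)\s+queue-id: (\d+)")

# Scenario 4 mapping, copied from the listing above.
SAMPLE = """\
pmd thread numa_id 0 core_id 4:
isolated : false
port: dpdkvhostuser1 queue-id: 0
pmd thread numa_id 0 core_id 5:
isolated : false
port: dpdk0 queue-id: 0
port: dpdkvhostuser0 queue-id: 0
pmd thread numa_id 1 core_id 22:
isolated : false

pmd thread numa_id 1 core_id 23:
isolated : false
port: dpdk1 queue-id: 0
"""


def parse_pmd_rxq_show(text):
    """Return {core_id: {"numa": node, "ports": [(port, queue_id), ...]}}."""
    pmds = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = PMD_RE.match(line)
        if header:
            current = int(header.group(2))
            pmds[current] = {"numa": int(header.group(1)), "ports": []}
            continue
        port = PORT_RE.match(line)
        if port and current is not None:
            pmds[current]["ports"].append((port.group(1), int(port.group(2))))
    return pmds


pmds = parse_pmd_rxq_show(SAMPLE)
idle = [core for core, info in pmds.items() if not info["ports"]]
print("idle PMD cores:", idle)
```

For the Scenario 4 sample, this reports core 22 as idle while core 5 carries two ports, which is the kind of imbalance the listing shows.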




Possible Variations

  1. Increase the Number of CPUs to 4 for the VNF.
  2. Phy2phy case (no VNF).
  3. Try a different forwarding VNF.
  4. Different Virtual Switch (VPP)
  5. RxQ Affinity.
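
For the RxQ affinity variation, OVS allows receive queues to be pinned to specific PMD cores through the pmd-rxq-affinity setting of an interface. The sketch below only builds the command strings. The port names and the queue-to-core pairs are examples taken from the mappings on this page, not a tested configuration, and the helper name is illustrative.

```python
def rxq_affinity_cmd(interface, queue_to_core):
    """Build an ovs-vsctl command pinning each rx queue to a PMD core."""
    pairs = ",".join(f"{queue}:{core}" for queue, core in sorted(queue_to_core.items()))
    return (
        f"ovs-vsctl set Interface {interface} "
        f'other_config:pmd-rxq-affinity="{pairs}"'
    )


# Example: keep each physical port on a PMD core of a chosen NUMA node.
print(rxq_affinity_cmd("dpdk0", {0: 4}))
print(rxq_affinity_cmd("dpdk1", {0: 23}))
```

Pinned queues would make the PMD-to-port mapping deterministic, which removes one source of variation between runs.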


Notes on Documentation

  1. Must view the log files: the qemu threads need to match the intended scenario for the VM.
  2. Christian created the qemu command (and documentation); check this for the VM mapping.
  3. SR: CT's command covers only the host.
  4. The qemu command-line option -smp 2 should do this (simulates two NUMA nodes); need to see how the VM sees its architecture: numactl -H
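
One way to check that the qemu threads match the intended scenario is to read the Cpus_allowed_list field that Linux exposes in /proc/<pid>/task/<tid>/status for each thread. The sketch below assumes a Linux host; the helper names are illustrative, and the qemu process ID has to be supplied by the user.

```python
import os


def parse_cpu_list(text):
    """Expand a kernel CPU list such as '8-9,22' into a set of core IDs."""
    cores = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cores.update(range(int(first), int(last) + 1))
        else:
            cores.add(int(part))
    return cores


def thread_affinity(pid):
    """Return {thread_id: allowed_cores} for every thread of a process (Linux)."""
    result = {}
    for tid in os.listdir(f"/proc/{pid}/task"):
        with open(f"/proc/{pid}/task/{tid}/status") as status:
            for line in status:
                if line.startswith("Cpus_allowed_list:"):
                    result[int(tid)] = parse_cpu_list(line.split(":", 1)[1])
    return result


def outside_intended(affinity, intended):
    """Return the threads allowed to run on cores outside the intended set."""
    return {tid: cores for tid, cores in affinity.items() if not cores <= intended}
```

For example, for PVP Scenario 3 the intended VNF cores are 8 and 22, so outside_intended(thread_affinity(qemu_pid), {8, 22}) would list any qemu thread that is allowed to run elsewhere. Note that this also flags qemu helper threads that are deliberately left unpinned.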