Anuket Project

Hugepages Plugin High Level Design

Requirements

 

1.0

Report the number of used and free hugepages on a system per socket and per node.

 

2.0

Create a MIB for huge pages statistics.

 

Overview

Virtual memory makes it easy for several processes to share memory [0].

“Each process has its own virtual address space, which is mapped to physical memory by the operating system” [1]. The process views its virtual address space as a contiguous, linear address space, but in reality the virtual addresses must be mapped to physical addresses; this translation is typically performed by the Memory Management Unit (MMU) on the CPU.

“There are two ways to enable the system to manage large amounts of memory:

  • Increase the number of page table entries in the hardware memory management unit
  • Increase the page size (use huge pages) to reduce the number of lookups

The first method is expensive, since the hardware memory management unit in a modern processor only supports hundreds or thousands of page table entries. Additionally, hardware and memory management algorithms that work well with thousands of pages (megabytes of memory) may have difficulty performing well with millions (or even billions) of pages. This results in performance issues: when an application needs to use more memory pages than the memory management unit supports, the system falls back to slower, software-based memory management, which causes the entire system to run more slowly.

Huge pages are blocks of memory that come in 2MB and 1GB sizes. The page tables used by the 2MB pages are suitable for managing multiple gigabytes of memory, whereas the page tables of 1GB pages are best for scaling to terabytes of memory” [2].
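As a back-of-the-envelope illustration (the numbers below are not from the source), the reduction in page table entries can be seen by dividing a fixed amount of memory by each page size:

```python
# Illustrative arithmetic: page table entries needed to map 16 GiB of
# memory at each of the page sizes discussed above.
GiB = 1024 ** 3

def pages_needed(mem_bytes, page_bytes):
    """Number of pages required to map mem_bytes of memory."""
    return mem_bytes // page_bytes

mem = 16 * GiB
print(pages_needed(mem, 4 * 1024))      # 4 KB pages  -> 4194304 entries
print(pages_needed(mem, 2 * 1024**2))   # 2 MB pages  -> 8192 entries
print(pages_needed(mem, GiB))           # 1 GB pages  -> 16 entries
```

Going from 4KB to 2MB pages cuts the number of entries by a factor of 512, which is why huge pages reduce TLB pressure so dramatically.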

More information on virtual memory and TLB lookups can be found at:

  1. http://www.tldp.org/LDP/tlk/mm/memory.html
  2. https://lwn.net/Articles/253361/

 

Design

Hugepages plugin

The purpose of this feature is to report the number of free and used huge pages on a system, to enable intelligent workload placement. Applications can use huge pages to improve performance, for example as the memory backing for virtual machines. Huge pages are allocated per socket across a platform through configuration files or sysctl. They improve application performance by reducing TLB lookups, as the page size is increased from 4KB to 2MB or 1GB. Please note, 1GB huge pages must be configured through grub so that they are allocated at boot time.
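For reference, 1GB huge pages are typically reserved via kernel boot parameters; a fragment such as the following could be added to the kernel command line (e.g. via GRUB_CMDLINE_LINUX in /etc/default/grub). The page counts shown are illustrative only:

```
# Reserve four 1GB huge pages and five hundred twelve 2MB huge pages at boot
default_hugepagesz=1G hugepagesz=1G hugepages=4 hugepagesz=2M hugepages=512
```

After updating the grub configuration and rebooting, the reservations appear under the sysfs paths described below.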

The collectd plugin itself descends into each of the huge page directories:

  1. System-wide huge page counters can be found in the files under: /sys/kernel/mm/hugepages/hugepages-<size_in_KB>kB/
  2. Per-NUMA-node huge page counters can be found in the files under: /sys/devices/system/node/node<NUMA_node_number>/hugepages/hugepages-<size_in_KB>kB/

Inside each of these directories, the same set of files will exist:

  • nr_hugepages
  • nr_hugepages_mempolicy
  • nr_overcommit_hugepages
  • free_hugepages
  • resv_hugepages
  • surplus_hugepages

The plugin reads the free_hugepages, surplus_hugepages and nr_hugepages files under the directories listed above, determines the number of used pages (nr_hugepages + surplus_hugepages - free_hugepages), and reports the free and used huge pages per NUMA node or across the system (per huge page size), in terms of:

  1. The number
  2. The size in bytes
  3. The percentage
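The actual plugin is written in C; the following Python sketch (helper names are hypothetical, not from the plugin source) shows the same derivation from the sysfs counter files:

```python
# Sketch of the derivation described above: read the per-size counter
# files and compute used pages as nr + surplus - free, reported as a
# number, in bytes, and as a percentage.
import os

SYSFS_ROOT = "/sys/kernel/mm/hugepages"  # system-wide counters

def read_counter(dirpath, name):
    """Read a single integer counter file such as free_hugepages."""
    with open(os.path.join(dirpath, name)) as f:
        return int(f.read())

def hugepage_stats(nr, surplus, free, page_size_kb):
    """Derive free/used huge page stats in pages, bytes and percent."""
    used = nr + surplus - free
    total = nr + surplus
    return {
        "free_pages": free,
        "used_pages": used,
        "free_bytes": free * page_size_kb * 1024,
        "used_bytes": used * page_size_kb * 1024,
        "used_pct": 100.0 * used / total if total else 0.0,
    }
```

For example, a node with 1024 reserved 2MB pages, no surplus pages, and 256 free pages would report 768 used pages (75%).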

 

Considerations

Configuration Considerations

To collect huge pages information, the collectd hugepages plugin reads the directories "/sys/devices/system/node/*/hugepages" and "/sys/kernel/mm/hugepages" to retrieve the number of free, surplus and used huge pages. Values can be reported in pages, bytes or percent. Configuration options include:

ReportPerNodeHP true|false

If enabled, information will be collected from the huge page counters in "/sys/devices/system/node/*/hugepages". This is used to check the per-node huge page statistics on a NUMA system.

ReportRootHP true|false

If enabled, information will be collected from the huge page counters in "/sys/kernel/mm/hugepages". This can be used on both NUMA and non-NUMA systems to check the overall huge page statistics.

ValuesPages true|false

Whether to report huge pages metrics in number of pages. Defaults to true.

ValuesBytes true|false

Whether to report huge pages metrics in bytes. Defaults to false.

ValuesPercentage true|false

Whether to report huge pages metrics as percentage. Defaults to false.
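Putting the options above together, a collectd.conf fragment enabling the plugin might look as follows (the particular option values chosen here are illustrative):

```
# Example collectd configuration for the hugepages plugin
LoadPlugin hugepages

<Plugin hugepages>
    ReportPerNodeHP  true
    ReportRootHP     true
    ValuesPages      true
    ValuesBytes      false
    ValuesPercentage true
</Plugin>
```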

Deployment Considerations

Huge pages will be configured by the VIM in an NFV environment.

API/GUI/CLI Considerations

Equivalence Considerations

A new MIB will need to be created to support the reporting of huge pages statistics through SNMP.

Security Considerations

Alarms, events, statistics considerations

Redundancy Considerations

Performance Considerations

Testing Considerations

Tests should configure and exercise 2MB huge pages, 1GB huge pages, and a mix of the two.

Tests should be carried out both on a system under load from applications that regularly allocate and free huge pages, and on a relatively idle system.

Other Considerations

Impact

The following table outlines possible impact(s) the deployment of this deliverable may have on the system.

 

Ref: 1

System Impact Description: The ability to monitor and reallocate huge pages for intelligent workload placement will be important in an NFV environment, for DPDK-based applications and for VNFs.

Recommendation / Comments: Reallocation is a function of the VIM/MANO at the policy level, not of the monitoring application.

Key Assumptions

None.

Key Exclusions

None.

Key Dependencies

None.

Issues List

None.

References

[0] http://www.tldp.org/LDP/tlk/mm/memory.html

[1] https://www.kernel.org/doc/gorman/html/understand/understand007.html

[2] https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Performance_Tuning_Guide/s-memory-transhuge.html