
...

support for multi PMU uncore events

1.0	Use Linux perf interface to collect data about performance events on a per core basis
2.0	Use jevents library (PMU tools)
3.0	Report hardware cache events, kernel PMU events, software events, hardware specific events
4.0	Should have a configurable interval
5.0	Should have configurable hardware specific events list
6.0	Provide SNMP support for any collectd values, through a PMU MIB
7.0	Provide support for multi PMU uncore events
8.0	Provide option to choose all the events from json event list file

Overview

Performance counters are CPU hardware registers that count hardware events such as instructions executed, cache misses suffered, or branches mispredicted. They form a basis for profiling applications to trace dynamic control flow and identify hotspots. The Linux perf interface provides rich, generalized abstractions over hardware specific capabilities.

...

The intel_pmu plugin collects information provided by the Linux perf interface. It does not access perf directly, but goes through the jevents API. Using this interface, the intel_pmu plugin collects the following metrics:

...

Name	Type	Type Instance	Description
cpu-cycles	counter	cpu-cycles	...
instructions	counter	instructions	...
cache-references	counter	cache-references	...
cache-misses	counter	cache-misses	...
branches	counter	branches	...
branch-misses	counter	branch-misses	...
bus-cycles	counter	bus-cycles	...
L1-dcache-loads	counter	L1-dcache-loads	...
L1-dcache-load-misses	counter	L1-dcache-load-misses	...
L1-dcache-stores	counter	L1-dcache-stores	...
L1-dcache-store-misses	counter	L1-dcache-store-misses	...
L1-dcache-prefetches	counter	L1-dcache-prefetches	...
L1-dcache-prefetch-misses	counter	L1-dcache-prefetch-misses	...
L1-icache-loads	counter	L1-icache-loads	...
L1-icache-load-misses	counter	L1-icache-load-misses	...
L1-icache-prefetches	counter	L1-icache-prefetches	...
L1-icache-prefetch-misses	counter	L1-icache-prefetch-misses	...
LLC-loads	counter	LLC-loads	...
LLC-load-misses	counter	LLC-load-misses	...
LLC-stores	counter	LLC-stores	...
LLC-store-misses	counter	LLC-store-misses	...
LLC-prefetches	counter	LLC-prefetches	...
LLC-prefetch-misses	counter	LLC-prefetch-misses	...
dTLB-loads	counter	dTLB-loads	...
dTLB-load-misses	counter	dTLB-load-misses	...
dTLB-stores	counter	dTLB-stores	...
dTLB-store-misses	counter	dTLB-store-misses	...
dTLB-prefetches	counter	dTLB-prefetches	...
dTLB-prefetch-misses	counter	dTLB-prefetch-misses	...
iTLB-loads	counter	iTLB-loads	...
iTLB-load-misses	counter	iTLB-load-misses	...
branch-loads	counter	branch-loads	...
branch-load-misses	counter	branch-load-misses	...
cpu-clock	counter	cpu-clock	...
task-clock	counter	task-clock	...
context-switches	counter	context-switches	...
cpu-migrations	counter	cpu-migrations	...
page-faults	counter	page-faults	...
minor-faults	counter	minor-faults	...
major-faults	counter	major-faults	...
alignment-faults	counter	alignment-faults	...
emulation-faults	counter	emulation-faults	

The plugin also collects hardware specific metrics defined in the event list file, which should contain definitions of PMU events. The list of events to monitor is configurable.


Plugin configuration

The following configuration options should be supported by the intel_pmu collectd plugin:

Name	Description	Comment
Interval	The interval, in seconds, at which statistics on monitored events are retrieved.	The Interval option is supported by collectd and is defined in the <LoadPlugin> block. No additional functionality needs to be developed in the intel_pmu plugin to support this option.
ReportHardwareCacheEvents	Enable/disable monitoring of hardware cache events.	
ReportKernelPMUEvents	Enable/disable monitoring of kernel PMU events.	
ReportSoftwareEvents	Enable/disable monitoring of software events.	
EventList	Path to the hardware events list file for the current CPU.	The file can be downloaded with the event_download.py script, which is part of the pmu-tools package.
HardwareEvents	String containing a comma-separated list of hardware specific events to monitor.	"All" can be used to select all events from the event list.
Cores	Core groups definition. Monitored metrics are reported only for the configured cores. If this option is omitted, all available cores are monitored. If a group is enclosed in square brackets, each core is added individually to a separate group (that is, statistics are not aggregated).	Allowed formats: "0,1,2,3" "0-3" "[0-3]"
DispatchMultiPmu	Enable/disable dispatching of cloned multi PMU counters for uncore events. If disabled, only the total sum is dispatched as a single event. If enabled, a separate metric is dispatched for every counter.	Uncore event example: UNC_CHA_DIR_LOOKUP.NO_SNP. If enabled, information about the event type is added to type_instance, e.g.: "UNC_CHA_DIR_LOOKUP.NO_SNP:type=30". This makes it possible to distinguish between multiple counters for one event.
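
The three allowed Cores formats can be illustrated with a small parser. This is only a sketch of the grouping rules described in the table, not the plugin's actual parser; the function name parse_core_groups and the choice to return an empty list for an empty value (left to the caller to treat as "all cores") are assumptions made for illustration.

```python
from typing import List


def parse_core_groups(spec: str) -> List[List[int]]:
    """Parse one Cores value such as "0,1,2,3", "0-3" or "[0-3]".

    Returns a list of groups; each group is a list of core numbers whose
    statistics would be aggregated together.
    """
    spec = spec.strip()
    # Square brackets mean: every core becomes its own group (no aggregation).
    separate = spec.startswith("[") and spec.endswith("]")
    if separate:
        spec = spec[1:-1]

    cores: List[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = (int(x) for x in part.split("-", 1))
            if last < first:
                raise ValueError("invalid core range: " + part)
            cores.extend(range(first, last + 1))
        else:
            cores.append(int(part))

    if not cores:
        # Assumption: an empty value is handled by the caller as "all cores".
        return []
    return [[c] for c in cores] if separate else [cores]
```

With this sketch, "0-3" and "0,1,2,3" both yield one aggregated group of four cores, while "[0-3]" yields four single-core groups.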

Here is an example of the plugin configuration section of collectd.conf file:

  <Plugin intel_pmu>
    ReportHardwareCacheEvents true
    ReportKernelPMUEvents true
    ReportSoftwareEvents true
    EventList "/var/cache/pmu/GenuineIntel-6-55-core.json"
    HardwareEvents "L2_RQSTS.CODE_RD_HIT,L2_RQSTS.CODE_RD_MISS" "L2_RQSTS.ALL_CODE_RD"
    Cores ""
    HardwareEvents "L2_RQSTS.PF_MISS"
    Cores "1"
    DispatchMultiPmu false
  </Plugin>

In the above example, the events L2_RQSTS.CODE_RD_HIT, L2_RQSTS.CODE_RD_MISS and L2_RQSTS.ALL_CODE_RD are monitored on all available cores, and the event L2_RQSTS.PF_MISS is monitored on core 1.

Another example with only uncore events set:

  <Plugin intel_pmu>
    EventList "/var/cache/pmu/GenuineIntel-6-55-uncore.json"
    HardwareEvents "UNC_CHA_TOR_INSERTS.IA_MISS:config1=0x4043200000000" "UNC_IIO_TXN_REQ_BY_CPU.MEM_WRITE.PART0"
    Cores "0" "18"
    DispatchMultiPmu false
  </Plugin>

Implementation details

The intel_pmu plugin does not introduce its own layer of functionality. It only reads the configuration provided by the user and prepares the parameters and data structures needed by the jevents API. The table below shows the correspondence between the plugin API and the jevents API that is used to configure Linux perf monitoring.

...

plugin API	jevents API	Description
pmu_config		Parse events groups to monitor provided by user in collectd.conf
pmu_init	alloc_eventlist	Allocate memory for new eventlist
	resolve_event_extra	Resolve hardware specific events names to perf events (perf_event_attr)
	jevent_pmu_uncore	Check if event is uncore event
	jevent_next_pmu	Expand event into multiple PMUs if necessary (used for uncore events)
	setup_event	Setup perf events for monitoring
pmu_read	read_all_events	Read values of all monitored events
pmu_shutdown	free_eventlist	Free memory allocated for eventlist (recursively including all events)

For more details on plugin API see collectd plugin implementation guide https://collectd.org/wiki/index.php/Plugin_architecture.

Hardware Specific Events

In addition to the standard groups of events supported by Linux perf (hardware cache, kernel PMU, software), the intel_pmu plugin can monitor hardware specific events. To support this, the plugin uses a feature provided by the jevents library: resolving symbolic event names using downloaded event files. To use hardware specific event names in the configuration file, the user has to download the events list file for the current CPU before using the intel_pmu plugin. This can be done with the event_download.py script, which is part of the pmu-tools package.

Note: For uncore events, values can be collected only for the first core of every socket, e.g. '0', '18', etc.
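
To find which cores to put in the Cores option for uncore events, the first core of each socket can be derived from the CPU topology. The following is a minimal sketch, assuming that "first core" means the lowest-numbered logical CPU in each physical package and that the topology is read from the Linux sysfs file physical_package_id. The helper names first_core_per_socket and read_topology are hypothetical and are not part of the plugin.

```python
from pathlib import Path
from typing import Dict, List


def first_core_per_socket(core_to_socket: Dict[int, int]) -> List[int]:
    """Return the lowest-numbered core of every socket, sorted ascending."""
    first: Dict[int, int] = {}
    for core, socket in core_to_socket.items():
        if socket not in first or core < first[socket]:
            first[socket] = core
    return sorted(first.values())


def read_topology(sysfs_root: str = "/sys/devices/system/cpu") -> Dict[int, int]:
    """Build a core -> socket map from sysfs (Linux only)."""
    mapping: Dict[int, int] = {}
    pattern = "cpu[0-9]*/topology/physical_package_id"
    for pkg_file in Path(sysfs_root).glob(pattern):
        # pkg_file is .../cpuN/topology/physical_package_id
        core = int(pkg_file.parent.parent.name[3:])
        mapping[core] = int(pkg_file.read_text().strip())
    return mapping


# Example with a made-up two-socket machine, 18 cores per socket.
example_topology = {core: (0 if core < 18 else 1) for core in range(36)}
uncore_cores = first_core_per_socket(example_topology)  # [0, 18]
```

On a real system, first_core_per_socket(read_topology()) would give the candidate values for the Cores option.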

Time based multiplexing

If there are more events than counters, the kernel uses time multiplexing to give each event a chance to access the monitoring hardware. With multiplexing, an event is not measured all the time. At the end of the run, the tool scales the count based on total time enabled vs time running. The actual formula is:
scaled_count = raw_count * time_enabled / time_running

This provides an estimate of what the count would have been, had the event been measured during the entire run. Note that this is an estimate not an actual count. Depending on the workload, there will be blind spots which can introduce errors during scaling.
The plugin dispatches all four values (scaled count, raw count, time enabled and time running) to the user. The value type is COUNTER.
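
The scaling formula can be written out as a short function. This is a sketch of the arithmetic only, not the jevents implementation; the use of integer division and the handling of time_running == 0 (an event that was never scheduled) are assumptions made here.

```python
def scaled_count(raw_count: int, time_enabled: int, time_running: int) -> int:
    """Estimate the full-run count of a multiplexed event.

    Implements scaled_count = raw_count * time_enabled / time_running.
    """
    if time_running == 0:
        # Assumption: the event never ran, so no estimate can be made.
        return 0
    # Multiply first so that integer division does not lose precision early.
    return raw_count * time_enabled // time_running


# An event that was on the hardware for half of the enabled time
# has its raw count doubled.
estimate = scaled_count(raw_count=1000, time_enabled=200, time_running=100)
```

When time_enabled equals time_running the event was not multiplexed and the scaled count equals the raw count.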

SNMP Support

All metrics collected by the intel_pmu plugin should be available through SNMP. This will be achieved by creating a proper configuration for the snmp_agent collectd plugin. No additional functionality is needed in the intel_pmu plugin to support SNMP. See the description of the SNMP feature for more details on the snmp_agent plugin.

...

Configuration Considerations

When using the intel_pmu plugin, the number of reading threads in collectd should be increased. The value should be more than half the number of configured cores, so for 60 monitored cores the recommendation is to set ReadThreads > 30 (e.g. 35).
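
The rule of thumb above can be expressed as a one-line calculation. This is only an illustration of "more than half of the configured cores"; the function name min_read_threads is hypothetical, and the result is a lower bound rather than a tuned value.

```python
def min_read_threads(monitored_cores: int) -> int:
    """Smallest integer strictly greater than half of the monitored cores."""
    if monitored_cores < 0:
        raise ValueError("monitored_cores must not be negative")
    return monitored_cores // 2 + 1


# For 60 monitored cores the lower bound is 31; the text suggests some
# headroom on top of that (e.g. 35).
lower_bound = min_read_threads(60)
```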

Deployment Considerations

...

Tests should be carried out on a system under load as well as on a relatively idle system.

...

Ref	Dependency	Status
1	libjevents
2	Net-SNMP
3
4

