Anuket Project

Collectd Redfish Plugin HLD

Requirement


Name

Description

1.0 

Support Redfish v1.0

Make use of the REST API and schema defined in the Redfish standard v1.0.

2.0 

Configurable list of endpoints

The plugin configuration shall contain a list of queries and a list of services.

Each query entry shall contain an endpoint and a list of resources to be collected.

Each service entry shall contain credentials to gain access and a list of queries to perform.

3.0 

Configurable mapping of Redfish sensors to collectd metrics

There shall be a mapping from Redfish sensors to collectd metrics that covers units and types.

4.0 

OOB monitoring

Collecting telemetry shall be performed over the network.

5.0 

Supported metrics of Redfish v1.0

The following metrics shall be supported:

  • Temperature
  • Power
  • Fan

 

Overview

DMTF’s Redfish is a standard API designed to deliver simple and secure management for converged, hybrid IT and the Software Defined Data Center (SDDC). Both human readable and machine capable, Redfish leverages common Internet and web services standards to expose information directly to the modern tool chain. Delivering both in‐band and out‐of‐band manageability, Redfish continues to expand to address customer use cases and technology for a holistic data center management experience.

Why Redfish?

Until Redfish, interoperable management standards were lacking for modern data center environments. As organizations shift to scale-out solutions, legacy standards are insufficient to successfully manage numerous simple and multi-node servers or hybrid infrastructures. Legacy solutions have problems related to security and hardware dependencies, and they lack the agnostic nature that is needed for the modern data center.

As an open industry standard specification and schema, Redfish specifies a RESTful interface and uses defined JSON payloads that are usable by existing client applications and browser-based GUIs.

Why REST, HTTP and JSON?

Combining language support with the ubiquity of REST, HTTP and JSON, Redfish enables IT management tasks to be performed using the same skill set and tool chain as other IT and dev/ops tasks. RESTful protocols are rapidly replacing SOAP as the cloud ecosystem is adopting REST, and the web API community has followed suit. RESTful protocols are much quicker to learn than SOAP, and they have the simplicity of being a data pattern (as REST is not strictly a protocol) mapped to HTTP operations directly.

References:

https://www.dmtf.org/sites/default/files/2017_12_RedfishTechnicalOverview.pdf

https://www.dmtf.org/sites/default/files/2017_12_Redfish_Introduction_and_Overview.pdf

 

Design

Redfish plugin

The redfish plugin collects information about sensors provided by the BMC:

  

Name

Type

Type Instance

Description

Comment

-

Sensor type

Sensor name

Sensor types and sensor names are defined via the configuration file (they will be auto-generated from the payload in later code drops).

Depends on hardware.

 

Plugin configuration

The following configuration options should be supported by the redfish collectd plugin:

Name

Description

Query

Section defining a query performed on the Redfish interface

Endpoint

URI of the REST API endpoint for accessing the BMC

Resource

Selects a single resource or array from which to collect information

Property

Selects the property from which data is gathered

PluginInstance

Plugin instance of the collectd metric

Type

Type of the collectd metric

TypeInstance

Type instance of the collectd metric

Service

Section defining a service to which requests are sent

Host

Address of the BMC, optionally including a port

User

BMC username

Passwd

BMC password

Queries

Queries to run

 

Here is an example of the plugin configuration section of collectd.conf file:

<Plugin redfish>
  <Query "fans">
    Endpoint "/Chassis[0]/Thermal"
    <Resource "Fans">
      <Property "ReadingRPM">
        PluginInstance "chassis-1"
        Type "rpm"
      </Property>
    </Resource>
  </Query>
  <Query "temperatures">
    Endpoint "/Chassis[0]/Thermal"
    <Resource "Temperatures">
      <Property "ReadingCelsius">
        PluginInstance "chassis-1"
        Type "degrees"
      </Property>
    </Resource>
  </Query>
  <Query "voltages">
    Endpoint "/Chassis[0]/Power"
    <Resource "Voltages">
      <Property "ReadingVolts">
        PluginInstance "chassis-1"
        Type "volts"
      </Property>
    </Resource>
  </Query>
  <Service "local">
    Host "127.0.0.1:5000"
    User "user"
    Passwd "passwd"
    Queries "fans" "voltages" "temperatures"
  </Service>
</Plugin>
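The configuration above can be read as a small data model: a query names an endpoint plus the resources and properties to read, and a service names a host, credentials, and the queries to run. The sketch below mirrors the example in Python for brevity. The class and function names are illustrative and are not taken from the plugin source, and TypeInstance is assumed to be optional because the example omits it.

```python
from dataclasses import dataclass, field


@dataclass
class Property:
    name: str                  # e.g. "ReadingRPM"
    plugin_instance: str       # collectd plugin instance
    collectd_type: str         # collectd type
    type_instance: str = ""    # assumed optional (not set in the example)


@dataclass
class Resource:
    name: str                  # e.g. "Fans"
    properties: list = field(default_factory=list)


@dataclass
class Query:
    name: str
    endpoint: str
    resources: list = field(default_factory=list)


@dataclass
class Service:
    name: str
    host: str
    user: str
    passwd: str
    queries: list = field(default_factory=list)  # names of Query blocks


def resolve_queries(service, queries_by_name):
    """Return the Query objects a service refers to; fail on unknown names."""
    missing = [q for q in service.queries if q not in queries_by_name]
    if missing:
        raise ValueError(f"service {service.name!r} references unknown queries: {missing}")
    return [queries_by_name[q] for q in service.queries]


fans = Query("fans", "/Chassis[0]/Thermal",
             [Resource("Fans", [Property("ReadingRPM", "chassis-1", "rpm")])])
temperatures = Query("temperatures", "/Chassis[0]/Thermal",
                     [Resource("Temperatures", [Property("ReadingCelsius", "chassis-1", "degrees")])])
voltages = Query("voltages", "/Chassis[0]/Power",
                 [Resource("Voltages", [Property("ReadingVolts", "chassis-1", "volts")])])

queries = {q.name: q for q in (fans, temperatures, voltages)}
local = Service("local", "127.0.0.1:5000", "user", "passwd",
                ["fans", "voltages", "temperatures"])

for query in resolve_queries(local, queries):
    print(query.name, query.endpoint)
```

Validating the query names when the configuration is loaded means a misspelled entry in Queries is reported once at startup instead of on every read interval.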




Implementation details

To support Redfish-based sensor monitoring, the plugin uses the standard Redfish interfaces defined by the DMTF. The idea is to use the standard REST API URLs as much as possible to reach the sensor resources. To implement and use the Redfish interfaces, one must understand the concept of resource maps in the DMTF Redfish architecture.
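To make the resource-map idea concrete, the sketch below resolves a path such as "/Chassis[0]/Thermal" (the form used for Endpoint in the configuration example) against a tiny in-memory stand-in for a Redfish service. The mock resource tree and the resolve function are illustrative assumptions. A real client would issue an HTTP GET for each @odata.id link instead of a dictionary lookup, and the actual plugin may resolve paths differently.

```python
import re

# Hypothetical, heavily trimmed stand-in for a Redfish service, keyed by URI.
MOCK_SERVICE = {
    "/redfish/v1": {"Chassis": {"@odata.id": "/redfish/v1/Chassis"}},
    "/redfish/v1/Chassis": {"Members": [{"@odata.id": "/redfish/v1/Chassis/1"}]},
    "/redfish/v1/Chassis/1": {
        "Thermal": {"@odata.id": "/redfish/v1/Chassis/1/Thermal"},
        "Power": {"@odata.id": "/redfish/v1/Chassis/1/Power"},
    },
    "/redfish/v1/Chassis/1/Thermal": {"Fans": [], "Temperatures": []},
    "/redfish/v1/Chassis/1/Power": {"Voltages": []},
}

# One path step: a link name, optionally followed by a collection index.
STEP = re.compile(r"^(?P<name>[A-Za-z]+)(?:\[(?P<index>\d+)\])?$")


def fetch(uri):
    """Stand-in for an HTTP GET that returns the decoded JSON resource."""
    return MOCK_SERVICE[uri]


def resolve(path, root="/redfish/v1"):
    """Follow a path like '/Chassis[0]/Thermal' from the service root; return the final URI."""
    uri = root
    for step in path.strip("/").split("/"):
        match = STEP.match(step)
        if match is None:
            raise ValueError(f"unsupported path step: {step!r}")
        # Each step names a link in the current resource.
        uri = fetch(uri)[match["name"]]["@odata.id"]
        if match["index"] is not None:
            # An index selects one member of the linked collection.
            uri = fetch(uri)["Members"][int(match["index"])]["@odata.id"]
    return uri


print(resolve("/Chassis[0]/Thermal"))
print(resolve("/Chassis[0]/Power"))
```

Following links from the service root, rather than hard-coding a chassis identifier, keeps the same configuration usable on BMCs that name their chassis differently.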



The following Redfish REST query returns the current status of a fan sensor ("Fan_SYS4_2", member ID 9) and its metadata. The "ReadingRPM" property is the value of interest here; it holds the current fan speed.

[mansingh@euca- ~]$ curl -s https://192.168.0.10/redfish/v1/Chassis/1/Thermal/Fans/9 -k -u admin:admin | python -m json.tool 
{
    "@odata.context": "/redfish/v1/$metadata#Chassis/1/Thermal/Fans/Members/$entity",
    "@odata.id": "/redfish/v1/Chassis/1/Thermal/Fans/9",
    "@odata.type": "#Thermal.1.0.0.Thermal",
    "Fans": {
        "@odata.id": "/redfish/v1/Chassis/1/Thermal/Fans/9",
        "FanName": "Fan_SYS4_2",
        "LowerThresholdCritical": "500.000",
        "LowerThresholdFatal": "0.000",
        "LowerThresholdNonCritical": "1000.000",
        "MaxReadingRange": "0xFF",
        "MemberId": 9,
        "MinReadingRange": "0x00",
        "ReadingRPM": "12600.000",
        "RelatedItem": [
            {
                "@odata.id": "/redfish/v1/Systems/1"
            },
            {
                "@odata.id": "/redfish/v1/Chassis"
            }
        ],
        "RelatedItem@odata.count": 2,
        "RelatedItem@odata.navigationLink": "/redfish/v1/Chassis/1/Thermal",
        "SensorNumber": 201,
        "Status": {
            "State": "Enabled"
        },
        "UpperThresholdCritical": "0.000",
        "UpperThresholdFatal": "0.000",
        "UpperThresholdNonCritical": "0.000"
    },
    "Id": "Fans/$entity",
    "Name": "Fans/$entity"
}
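The payload above reports readings as strings (for example "12600.000"), so a consumer has to convert them before use. The following is a minimal sketch, assuming the response layout shown above, where the resource name ("Fans") maps to a single object. The function name is illustrative, skipping sensors that are not "Enabled" is an assumption rather than documented plugin behaviour, and other BMCs may return readings as JSON numbers or the resource as an array.

```python
import json

# Trimmed copy of the fan response shown above.
PAYLOAD = """
{
    "@odata.id": "/redfish/v1/Chassis/1/Thermal/Fans/9",
    "Fans": {
        "FanName": "Fan_SYS4_2",
        "MemberId": 9,
        "ReadingRPM": "12600.000",
        "Status": {"State": "Enabled"}
    }
}
"""


def extract_reading(document, resource, prop):
    """Return the named property as a float, or None if it is absent or the sensor is disabled."""
    entry = document.get(resource)
    if not isinstance(entry, dict):
        return None
    # Assumption: readings from sensors that are not enabled are ignored.
    if entry.get("Status", {}).get("State") != "Enabled":
        return None
    value = entry.get(prop)
    if value is None:
        return None
    return float(value)  # accepts both "12600.000" and 12600.0


doc = json.loads(PAYLOAD)
print(extract_reading(doc, "Fans", "ReadingRPM"))
```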


The following Redfish REST query returns the current status of a temperature sensor ("Temp_CPU0", member ID 1) and its metadata. The "ReadingCelsius" property is the value of interest here; it holds the current temperature.

[mansingh@euca- ~]$ curl -s https://192.168.0.10/redfish/v1/Chassis/1/Thermal/temperatures/1 -k -u admin:admin | python -m json.tool 
{
    "@odata.context": "/redfish/v1/$metadata#Chassis/1/Thermal/Temperatures/Members/$entity",
    "@odata.id": "/redfish/v1/Chassis/1/Thermal/Temperatures/1",
    "@odata.type": "#Thermal.1.0.0.Thermal",
    "Id": "Temperatures/$entity",
    "Name": "Temperatures/$entity",
    "Temperatures": {
        "@odata.id": "/redfish/v1/Chassis/1/Thermal/Temperatures/1",
        "LowerThresholdCritical": "0.000",
        "LowerThresholdFatal": "0.000",
        "LowerThresholdNonCritical": "0.000",
        "MaxReadingRangeTemp": "0xFF",
        "MemberId": 1,
        "MinReadingRangeTemp": "0x00",
        "Name": "Temp_CPU0",
        "PhysicalContext": "",
        "ReadingCelsius": "39.000",
        "RelatedItem": [
            {
                "@odata.id": "/redfish/v1/Systems/1"
            },
            {
                "@odata.id": "/redfish/v1/Chassis"
            }
        ],
        "RelatedItem@odata.count": 2,
        "RelatedItem@odata.navigationLink": "/redfish/v1/Chassis/1/Thermal",
        "SensorNumber": 170,
        "Status": {
            "State": "Enabled"
        },
        "UpperThresholdCritical": "100.000",
        "UpperThresholdFatal": "0.000",
        "UpperThresholdNonCritical": "99.000"
    }
}

The following Redfish REST query returns the current status of a power (voltage) sensor ("Volt_VR_DIMM_CD", member ID 11) and its metadata. The "ReadingVolts" property is the value of interest here; it holds the current voltage.

[mansingh@euca- ~]$ curl -s https://192.168.0.10/redfish/v1/chassis/1/Power/voltages/11 -k -u admin:admin | python -m json.tool 
{
    "@odata.context": "/redfish/v1/$metadata#Chassis/1/Power/Voltages/Members/$entity",
    "@odata.id": "/redfish/v1/Chassis/1/Power/Voltages/11",
    "@odata.type": "#Power.1.0.0.Power",
    "Id": "Voltage/$entity",
    "Name": "Voltage/$entity",
    "Voltages": {
        "@odata.id": "/redfish/v1/Chassis/1/Power/Voltages/11",
        "LowerThresholdCritical": "1.080",
        "LowerThresholdFatal": "0.000",
        "LowerThresholdNonCritical": "0.000",
        "MaxReadingRange": "0xFF",
        "MemberId": 11,
        "MinReadingRange": "0x00",
        "Name": "Volt_VR_DIMM_CD",
        "PhysicalContext": "VoltageRegulator",
        "ReadingVolts": "1.220",
        "RelatedItem": [
            {
                "@odata.id": "/redfish/v1/Systems/1"
            },
            {
                "@odata.id": "/redfish/v1/Chassis/1/Power"
            }
        ],
        "RelatedItem@odata.count": 2,
        "RelatedItem@odata.navigationLink": "/redfish/v1/Chassis/1/Power",
        "SensorNumber": 221,
        "Status": {
            "State": "Enabled"
        },
        "UpperThresholdCritical": "1.320",
        "UpperThresholdFatal": "0.000",
        "UpperThresholdNonCritical": "0.000"
    }
}
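The three curl commands above share one shape: an HTTPS GET with HTTP basic authentication and certificate verification disabled (-k). The sketch below builds the equivalent request with Python's standard library. The helper names are illustrative, the host and credentials are the ones from the examples, and the network call itself is left commented out so the snippet has no side effects. Disabling certificate checks is only appropriate on a trusted lab network.

```python
import base64
import json
import ssl
import urllib.request


def build_request(host, path, user, passwd):
    """Create a GET request for a Redfish resource using HTTP basic auth."""
    token = base64.b64encode(f"{user}:{passwd}".encode()).decode()
    request = urllib.request.Request(f"https://{host}{path}")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/json")
    return request


def insecure_context():
    """TLS context that skips certificate checks, like curl -k."""
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context


request = build_request("192.168.0.10",
                        "/redfish/v1/Chassis/1/Power/Voltages/11",
                        "admin", "admin")
print(request.get_method(), request.full_url)

# To actually query a BMC (not executed here):
# with urllib.request.urlopen(request, context=insecure_context(), timeout=10) as response:
#     document = json.load(response)
```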


collectd's standard framework calls are used to read the configuration details and then register a read callback for the specified sensors. Once the sensor values are fetched, they are pushed to the cache. The following state diagram gives an idea of how the sensor values are collected.


Figure: collectd redfish plugin state diagram
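As a rough illustration of the flow described above (configuration, periodic read callback, values pushed to the cache), the sketch below uses plain Python stand-ins. Nothing here is collectd API: the cache dictionary, the fetch function, and the callback names are hypothetical, the fetch step returns canned data instead of querying a BMC, and using MemberId as the type instance is an assumption made for the example.

```python
# Hypothetical stand-in for collectd's value cache, keyed by metric identifier.
cache = {}

# Parsed configuration: (endpoint, resource, property, plugin instance, type) per metric.
config = [
    ("/Chassis[0]/Thermal", "Fans", "ReadingRPM", "chassis-1", "rpm"),
    ("/Chassis[0]/Thermal", "Temperatures", "ReadingCelsius", "chassis-1", "degrees"),
]


def fetch(endpoint):
    """Canned response standing in for a Redfish GET on the endpoint."""
    return {
        "Fans": [{"MemberId": 9, "ReadingRPM": "12600.000"}],
        "Temperatures": [{"MemberId": 1, "ReadingCelsius": "39.000"}],
    }


def dispatch(plugin_instance, type_, type_instance, value):
    """Store a value in the cache under a collectd-style identifier."""
    cache[f"redfish-{plugin_instance}/{type_}-{type_instance}"] = value


def read_callback():
    """Called once per interval: fetch each endpoint and dispatch every reading."""
    for endpoint, resource, prop, plugin_instance, type_ in config:
        for member in fetch(endpoint)[resource]:
            # Assumption: the member ID distinguishes sensors of the same type.
            dispatch(plugin_instance, type_, str(member["MemberId"]), float(member[prop]))


read_callback()
for identifier, value in sorted(cache.items()):
    print(identifier, value)
```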

List of supported sensors

The collectd redfish plugin implementation supports the following types of Redfish sensors:

Redfish sensor type

collectd sensor value type

collectd sensor units

Power

voltage

Volts

Temperature

temperature

Celsius

Fan

fanspeed

RPM
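The table above can be expressed as a small lookup from Redfish sensor type to collectd value type and unit. This is a sketch of the mapping only: the dictionary and function names are illustrative, and the Redfish property names are the ones that appear in the query outputs earlier in this document.

```python
# Redfish sensor type -> (Redfish reading property, collectd value type, unit)
SENSOR_MAP = {
    "Power": ("ReadingVolts", "voltage", "Volts"),
    "Temperature": ("ReadingCelsius", "temperature", "Celsius"),
    "Fan": ("ReadingRPM", "fanspeed", "RPM"),
}


def to_collectd(sensor_type, reading):
    """Translate one Redfish reading into a (collectd type, value, unit) triple."""
    try:
        _, collectd_type, unit = SENSOR_MAP[sensor_type]
    except KeyError:
        raise ValueError(f"unsupported Redfish sensor type: {sensor_type!r}") from None
    return collectd_type, float(reading), unit


print(to_collectd("Fan", "12600.000"))
print(to_collectd("Temperature", "39.000"))
```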

 

Considerations

Configuration Considerations

Deployment Considerations

If your platform does not provide a BMC, this plugin will be unloaded at initialization time.

 

API/GUI/CLI Considerations

Equivalence Considerations

Security Considerations

Alarms, events, statistics considerations

Not all metrics will be reported, because the redfish plugin does not support all types of sensors.

The redfish plugin registers to listen for all types of sensor events received from the System Event Log (SEL).

Redundancy Considerations

Performance Considerations

Testing Considerations

The timing interval requirement needs to be taken into consideration when conducting tests.

The tests should be carried out on a system under load as well as on a relatively idle system.

 

Other Considerations

Impact

The following table outlines possible impact(s) the deployment of this deliverable may have on the current system.

 

Ref

System Impact Description

Recommendation / Comments

1

 

 

Key Assumptions

The following assumptions apply to the scope specified in this document.

 

Ref

Assumption

Status

1

 

 

Key Exclusions

The following exclusions apply to the scope discussed in this document.

 

Ref

Exclusion

Status

1

 

 

Key Dependencies

The following table outlines the key dependencies associated with this deliverable.

 

Ref

Dependency

Status

1


 

