Anuket Project
3.11 Sustainability and Energy Optimization
One important aspect of building new, modern telecom infrastructure is reducing energy costs, which today make up almost 40% of operators' CAPEX and are expected to increase with the expansion of new 5G and Edge solutions in enterprises and the RAN.
The biggest challenge is that most telecom infrastructure today is still designed for busy hours and is not flexible enough to be tuned for the best compromise between performance and energy use. Energy efficiency is therefore becoming critical in all domains, including Cloud infrastructure.
Energy Efficiency (EE) is defined as "the relation between the useful output and energy consumption" by ITU-T L.1330 :cite:p:`itutl1330` for telecommunication networks and ETSI EN 303 471 :cite:p:`etsien303sp471` for NFV, the useful output being a metric which represents the capacity provided by the service whose energy efficiency is assessed.
As an example, the useful output of a traffic forwarding function can be the data volume forwarded (e.g., measured in Byte), and the assessment of its energy efficiency is then based on the ratio between this volume and the energy consumed for processing it (e.g., measured in Watt.hour):
Energy Efficiency (B/Wh) = Traffic Volume / Consumed Energy
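As a minimal illustration of this ratio, the sketch below computes an energy efficiency figure in B/Wh. The function name and the sample figures are hypothetical and chosen only for illustration; they do not come from ITU-T L.1330 or ETSI EN 303 471.

```python
def energy_efficiency_b_per_wh(traffic_volume_bytes: float,
                               consumed_energy_wh: float) -> float:
    """Return energy efficiency in bytes per watt-hour (B/Wh)."""
    if consumed_energy_wh <= 0:
        raise ValueError("consumed energy must be positive")
    return traffic_volume_bytes / consumed_energy_wh


# Hypothetical figures: 3.6 TB forwarded while consuming 1.2 kWh (1200 Wh).
traffic_bytes = 3.6e12
energy_wh = 1200.0
ee = energy_efficiency_b_per_wh(traffic_bytes, energy_wh)
print(f"{ee:.3e} B/Wh")
```

Both quantities must cover the same observation period, otherwise the ratio is not meaningful.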
The method for assessing energy efficiency depends on the service targeted and the objectives. For NFV, ETSI proposes one method for production environments in ETSI EN 303 471 :cite:p:`etsien303sp471` and another for laboratory environments in ETSI ES 203 539 :cite:p:`etsies203sp539` (joint work with ITU-T, which published it as ITU-T L.1361 :cite:p:`itutl1361`).
Whatever the method and the service, the assessment requires the cloud infrastructure to provide energy consumption metrics for different parts of the infrastructure hardware (server, CPU, etc.), as included in :ref:`chapters/chapter04:internal performance measurement capabilities`. These metrics can be an amount of consumed energy (measured in Joule or Watt.hour) or a real-time power utilisation (measured in Watt or Joule/second), as proposed by DMTF Redfish DSP0268 2022.2 :cite:p:`dmtfredfish`, which specifies the metrics EnergykWh and PowerWatts for this purpose.
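The two kinds of metric are related: consumed energy is the integral of power over time. The sketch below shows one way to derive an energy figure from periodic power readings using the trapezoidal rule. It is an illustration only: the sample values and function name are hypothetical, and it does not call any Redfish API.

```python
from typing import Sequence, Tuple

JOULES_PER_WH = 3600.0  # 1 Wh = 3600 J


def energy_joules(samples: Sequence[Tuple[float, float]]) -> float:
    """Integrate (timestamp_s, power_w) samples with the trapezoidal rule."""
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        if t1 < t0:
            raise ValueError("samples must be sorted by timestamp")
        # Average power over the interval multiplied by its duration.
        total += 0.5 * (p0 + p1) * (t1 - t0)
    return total


# Hypothetical power readings taken every 60 seconds.
samples = [(0.0, 200.0), (60.0, 220.0), (120.0, 180.0)]
joules = energy_joules(samples)
watt_hours = joules / JOULES_PER_WH
kwh = watt_hours / 1000.0
print(f"{joules:.0f} J = {watt_hours:.3f} Wh = {kwh:.6f} kWh")
```

The accuracy of such an estimate depends on the sampling interval; a hardware energy counter, where available, avoids this approximation.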
Some relevant information regarding NFV energy efficiency can also be found in Open RAN Technical Priority - Focus on Energy Efficiency (March 2022) :cite:p:`oranenergyeff` and QuEST Forum - NFV Workload Efficiency Whitepaper (October 2016) :cite:p:`questnfvwlenergyeff`.
Sustainability, green energy and optimizing the efficiency of telecom networks are key requirements for all current and ongoing network transformations. They should be enabled at each layer of the network, starting from the hardware infrastructure, Cloud and CaaS systems, and applications, and above all through the use of data and ML/AI to optimize energy use in real time.
The key to optimizing energy efficiency is smart observability: KPIs are available at each layer and can be correlated across a closed-loop system to intelligently monitor and optimize energy use.
Energy Efficiency Domains | Example | KPI Metrics |
---|---|---|
Hardware | New Hardware Layer innovations | https://github.com/anuket-project/anuket-specifications/blob/master/doc/ref_model/chapters/chapter03.rst#id79 |
Cloud domain | Support for new energy-saving features in the Cloud | |
ML/AI and Data analytics | Intelligence and Orchestration | Use of SMO, RIC and ML/AI-based decisions |
Applications | Telecom application refactoring | |
3.11.1 Energy Savings in Hardware:
Hardware and physical infrastructure have the biggest role in improving energy efficiency (EE) in telecom systems. All power-related KPIs should be available in the defined hardware metrics.
It is also important to design the Cloud infrastructure around the hardware combinations that give the best output from an energy perspective. As an example, a workload can be designed to make maximum use of GPUs and L1 accelerators. Relevant hardware measures include:
- Use of GPUs can enhance energy efficiency through ML/AI
- Use of DPUs can optimize energy efficiency by offloading infrastructure services, mainly storage, networking and security
- Use of optimal silicon and architectures (e.g., x86 vs. ARM) can deliver substantial cost savings
- Use of high-voltage power distribution (e.g., 380 V DC) to improve overall power efficiency
- Liquid cooling and other passive infrastructure innovations
Table 3-8.1 Energy Monitoring shows possible performance measurement capabilities for the Hardware Infrastructure. These measurements or events should be collected and monitored by monitoring tools.
Ref | Hardware Infrastructure Capability | Unit | Definition/Notes |
---|---|---|---|
hardwaremon.001 | Host CPU usage | Watt (Joule/s) | Real-time electrical power used by the processor(s) of a node (1) |
hardwaremon.002 | NIC, accelerator cards, fans and power supplies energy use | Watt (Joule/s) | Real-time electrical power used by the NICs, accelerator cards, fans and power supplies of a node |
hardwaremon.003 | GPU energy use | Watt (Joule/s) | Real-time electrical power used by the GPU |
hardwaremon.004 | DPU energy use | Watt (Joule/s) | Real-time electrical power used by the DPU |
hardwaremon.005 | eASIC energy use | Watt (Joule/s) | Real-time electrical power used by the eASIC |
Table 3-8.1 Energy Optimization shows possible performance measurement capabilities for the Hardware Infrastructure.
Ref | Hardware Infrastructure Capability | Unit | Definition/Notes |
---|---|---|---|
hardwareopt.001 | Availability of C-states, i.e. C-0, C-1, C-2, TC-Custom | | Real-time electrical power used by the processor(s) of a node (1) |
hardwareopt.002 | Capability to free up and control sleep mode/C-state/P-state of specific hardware resources (e.g. server, CPU) | | Dynamic reallocation of active workload to specific accelerators and/or CPUs |
3.11.2 Energy Savings in Cloud layer:
As a Telco's Cloud infrastructure will comprise multiple clouds and different permutations of VIM and CaaS layers, it becomes important that the Cloud management solutions make the best use of the energy efficiency measures provided by the hardware and ensure an optimally balanced infrastructure.
The main characteristics of the Cloud management layer that improve energy efficiency are:
- Cloud architecture supports optimum load balancing
- Cloud layer supports resource optimization by adequately reporting under-utilized infrastructure
- Cloud supports the elastic and efficient shutdown of spare and unused infrastructure; one key characteristic is abstracting the physical resources from the application
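The reporting of under-utilized infrastructure mentioned above could be sketched as follows. This is a simplified illustration: the `HostSample` type, the 20% utilisation threshold and the sample values are all assumptions made for the example, not values defined by this specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HostSample:
    name: str
    cpu_utilisation: float  # fraction in [0.0, 1.0], averaged over a window
    power_w: float          # average power draw over the same window


def underutilised_hosts(hosts: List[HostSample],
                        threshold: float = 0.2) -> List[HostSample]:
    """Return hosts below the utilisation threshold, least utilised first."""
    candidates = [h for h in hosts if h.cpu_utilisation < threshold]
    return sorted(candidates, key=lambda h: h.cpu_utilisation)


# Hypothetical measurements for three hosts.
hosts = [
    HostSample("node-1", 0.65, 310.0),
    HostSample("node-2", 0.08, 190.0),
    HostSample("node-3", 0.15, 205.0),
]
candidates = underutilised_hosts(hosts)
# Upper bound on the power freed if every candidate could be shut down.
max_saving_w = sum(h.power_w for h in candidates)
print([h.name for h in candidates], max_saving_w)
```

The real saving is lower than this upper bound, since workloads moved off a candidate host increase the power draw of the hosts that receive them.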
Internal Performance Measurement Capabilities
Table 3-8.2 shows possible performance measurement capabilities for the Cloud Infrastructure. These measurements or events should be collected and monitored by monitoring tools.
Ref | Cloud Infrastructure Capability | Unit | Definition/Notes |
---|---|---|---|
cloud.001 | VM and Container energy use | Watt (Joule/s) | Real-time electrical power aggregated at VM or Container level |
cloud.002 | NFVI energy use | Watt (Joule/s) | Real-time electrical power aggregated at NFVI level |
cloud.003 | Load balancing across all cloud resources | Equal % | Load balance across cloud infrastructure |
cloud.004 | Support for all energy KPIs to NBI and OSS systems | - | Integrate to NBI systems |
cloud.005 | Indirect power reduction by means of energy-aware traffic steering | | |
cloud.006 | Software sustainability check | | |
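Power is normally measured per physical node, so a figure aggregated at VM or container level (cloud.001) has to be derived. The sketch below apportions node power in proportion to the CPU time each workload consumed. This proportional model is an assumption chosen for simplicity, not a method prescribed here, and the workload names and values are hypothetical.

```python
from typing import Dict


def apportion_power(node_power_w: float,
                    cpu_seconds: Dict[str, float]) -> Dict[str, float]:
    """Split node power across workloads in proportion to CPU time used."""
    total = sum(cpu_seconds.values())
    if total <= 0:
        # No CPU activity recorded: attribute nothing to any workload.
        return {name: 0.0 for name in cpu_seconds}
    return {name: node_power_w * used / total
            for name, used in cpu_seconds.items()}


# Hypothetical node drawing 300 W, with CPU time per workload over one window.
usage = {"vm-a": 30.0, "vm-b": 10.0, "pod-c": 20.0}
per_workload_w = apportion_power(300.0, usage)
print(per_workload_w)
```

This model also distributes the node's idle power among active workloads; other attribution models treat idle power separately.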
3.11.3 Energy Savings through Orchestration and Intelligence:
Internal Performance Measurement Capabilities
Table 3-8.2 shows possible performance measurement capabilities for the Orchestration and Intelligence Layer
Ref | Orchestration and Intelligence Capability | Unit | Definition/Notes |
---|---|---|---|
orchestration.001 | Data collection and visibility for all energy KPIs in real time | | Infrastructure should enable new data architectures and orchestration capabilities |
orchestration.002 | Real-time tuning of hardware | | |
cloud.003 | |||
cloud.004 | |||
3.11.4 Energy Savings in Applications:
Most telecom applications today are seldom designed with energy KPIs and measures in mind. With 5G Release 17 there has been a lot of work and emphasis on energy optimization, as can be seen in the 3GPP specifications listed below.
Table 3-8.2 shows possible 3GPP work on Energy Efficiency
Specification | Name | comments |
---|---|---|
3GPP TS 28.310 | Energy Efficiency of 5G systems | |
3GPP TS 28.552 | 5G Performance Measurements | |
3GPP TR 28.813 | New aspects of Energy Efficiency for 5G | |
Table 3-8.2 shows possible performance measurement capabilities for the applications
Ref | Application Capability | Unit | Definition/Notes |
---|---|---|---|
app.001 | Host CPU usage | Watt (Joule/s) | Real-time electrical power used by the processor(s) of a node (1) |
app.002 | VNFC level Energy utilization | Watt (Joule/s) | |
app.003 | VNF level Energy utilization | Watt (Joule/s) | |
app.004 | Network slice EE KPI | Performance of slice/Watt | As defined in 3GPP TS 28.310 |
app.005 | eMBB slice EE KPI | Gbps/Watt | As defined in 3GPP TS 28.310 |
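The application-level rows above build on each other: VNFC power can be summed per VNF, and a slice-level figure in Gbps/Watt relates slice throughput to the power of the VNFs serving it. The sketch below illustrates that aggregation under simplifying assumptions. The VNF names, values and function names are hypothetical, and the normative KPI definitions remain those of 3GPP TS 28.310.

```python
from typing import Dict, List


def vnf_power_w(vnfc_power_w: List[float]) -> float:
    """Aggregate VNFC-level power readings into a VNF-level figure."""
    return sum(vnfc_power_w)


def slice_ee_gbps_per_watt(throughput_gbps: float,
                           vnf_power: Dict[str, float]) -> float:
    """Return slice throughput divided by the total power of its VNFs."""
    total_power_w = sum(vnf_power.values())
    if total_power_w <= 0:
        raise ValueError("total power must be positive")
    return throughput_gbps / total_power_w


# Hypothetical eMBB slice served by two VNFs, with per-VNFC power in Watt.
vnfs = {
    "upf": vnf_power_w([120.0, 80.0]),
    "smf": vnf_power_w([50.0]),
}
slice_ee = slice_ee_gbps_per_watt(25.0, vnfs)
print(f"{slice_ee:.3f} Gbps/Watt")
```

This assumes each VNF serves a single slice; VNFs shared between slices would need their power split between them first.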