Skip to content

Latest commit

 

History

History
111 lines (96 loc) · 6.25 KB

qos_intro.adoc

File metadata and controls

111 lines (96 loc) · 6.25 KB

Introduction

Quality of Service (QoS) is defined as the minimal end-to-end performance that is guaranteed in advance by a service level agreement (SLA) to a workload. A workload may be a single application, a group of application, a virtual machine, a group of virtual machines, or a combination of those. The performance may be measured in the form of metrics such as instructions per cycle (IPC), latency of servicing work, etc.

Various factors such as the available cache capacity, memory bandwidth, interconnect bandwidth, CPU cycles, system memory, etc. affect the performance in a computing system that runs multiple workloads concurrently. Furthermore, when there is arbitration for shared resources, the prioritization of the workloads requests against other competing requests may also affect the performance of the workload. Such interference due to resource sharing may lead to unpredictable workload performance cite:[PTCAMP].

When multiple workloads are running concurrently on modern processors with large core counts, multiple cache hierarchies, and multiple memory controllers, the performance of a workload becomes less deterministic or even non-deterministic. This is because the performance depends on the behavior of all the other workloads in the machine that contend for the shared resources leading to interference. In many deployment scenarios such as public cloud servers the workload owner may not be in control of the type and placement of other workloads in the platform.

System software can control some of these resources available to the workload, such as the number of hardware threads made available for execution, the amount of system memory allocated to the workload, the number of CPU cycles provided for execution, etc.

System software needs additional tools to control interference to a workload and thereby reduce the variability in performance experienced by one workload due to other workload’s cache capacity usage, memory bandwidth usage, interconnect bandwidth usage, etc. through a resource allocation capability. The resource allocation capability enables system software to reserve capacity and/or bandwidth to meet the performance goals of the workload. Such controls enable improving the utilization of the system by collocating workloads while minimizing the interference caused by one workload to another cite:[HERACLES].

Effective use of the resource allocation capability requires hardware to provide a resource monitoring capability by which the resource requirements of a workload needed to meet a certain performance goal can be characterized. A typical use model involves profiling the resource usage of the workload using the resource monitoring capability and to establish resource allocations for the workload using the resource allocation capability.

The allocation may be in the form of capacity or bandwidth depending on the type of resource. For caches, TLBs, and directories, the resource allocation is in the form of storage capacity. For interconnects and memory controllers the resource allocation is in the form of bandwidth.

Workloads generate different types of accesses to shared resources. For example, some of the requests may be for accessing instructions and others may be to access data operated on by the workload. Certain data accesses may not have temporal locality whereas others may have a high probability of reuse. In some cases it is desirable to provide differentiated treatment to each type of access by providing unique resource allocation to each access type.

RISC-V Capacity and Bandwidth Controller QoS Register Interface (CBQRI) specification specifies:

  1. QoS identifiers to identify workloads that originate requests to the shared resources. These QoS identifiers include an identifier for resource allocation configurations and an identifier for the monitoring counters used to monitor resource usage. These identifiers accompany each request made by the workload to the shared resource. [QOS_ID] specifies the mechanism to associate the identifiers with workloads.

  2. Access-type identifiers to accompany request to access a shared resource to allow differentiated treatment of each access-type (e.g., code vs. data, etc.). The access-types are defined in [QOS_ID].

  3. Register interface for capacity allocation in controllers such as shared caches, directories, etc. The capacity allocation register interface is specified in [CC_QOS].

  4. Register interface for capacity usage monitoring. The capacity usage monitoring register interface is specified in [CC_QOS].

  5. Register interface for bandwidth allocation in controllers such as interconnect and memory controllers. The bandwidth allocation register interface is specified in [BC_QOS].

  6. Register interface for bandwidth usage monitoring. The bandwidth usage monitoring register interface is specified in [BC_QOS].

The capacity and bandwidth controller register interfaces for resource allocation and usage monitoring are defined as memory-mapped registers. Each controller that supports CBQRI provides a set of registers that are located at a range of physical address space that is a multiple of 4-KiB and the lowest address of the range is aligned to 4-KiB. The memory-mapped registers may be accessed using naturally aligned 4-byte or 8-byte memory accesses. The controller behavior for register accesses where the address is not aligned to the size of the access, or if the access spans multiple registers, of if the size of the access is not 4-bytes or 8-bytes, is UNSPECIFIED. The atomicity of access to an 8-byte register is UNSPECIFIED. A 4-byte access to a register must be single-copy atomic.

The controller registers have little-endian byte order (even for systems where all harts are big-endian-only).

Note

Big-endian-configured harts that make use of the register interface may implement the REV8 byte-reversal instruction defined by the Zbb extension. If REV8 is not implemented, then endianness conversion may be implemented using a sequence of instructions.

A controller may support a subset of capabilities defined by CBQRI. When a capability is not supported the registers and/or fields used to configure and/or control such capabilities are hardwired to 0. Each controller supports a capabilities register to enumerate the supported capabilities.