\chapter{The CHERI-RISC-V Instruction-Set Architecture}
\label{chap:cheri-riscv}
\newcommand{\riscvloadcappagefault}{0x1A}
\newcommand{\riscvstorecappagefault}{0x1B}
\newcommand{\riscvcheriexception}{0x1C}
Having considered the software-facing semantics and architecture-neutral
aspects of the CHERI protection model in previous chapters, we now turn to
elaborating CHERI capabilities within a specific architecture: 32-bit
and 64-bit RISC-V.
Wherever possible, CHERI-RISC-V implements the architecture-neutral concepts
described in Chapter~\ref{chap:architecture}.
Detailed descriptions of specific capability-aware instructions can be found
in Chapter~\ref{chap:isaref-riscv}.
\section{The RISC-V Instruction-Set Architecture}
RISC-V is a contemporary open-source architecture developed at the University
of California at Berkeley.
RISC-V is intended to be used with a range of microprocessors spanning small
32-bit microcontrollers intended for embedded applications to larger 64-bit
superscalar processors intended for use in datacenter computing.
The RISC-V ISA is reminiscent of MIPS, with some important differences: a more
modular design allows the ISA to be more easily subsetted and extended; a
variable-length instruction encoding improves code density; the MMU has a
hardware page-table walker rather than relying on software TLB management;
the ISA avoids exposing pipelining behaviors to software (e.g., there is no
branch-delay slot); and it has a more contemporary approach to atomic memory
instructions.
Various drafts and standardized extensions add other more contemporary
features such as hypervisor support. There is also ongoing work to define
broader platform behaviors beyond the architecture, including platform
self-description and peripheral-device enumeration.
\section{CHERI-RISC-V Approach}
Our application of CHERI to the RISC-V architecture is motivated by several
opportunities:
\begin{itemize}
\item To gain access to a maturing open-source ISA, hardware, and software
ecosystem, for the purposes of a stronger experimental baseline and
methodology (such as more mature core variants).
\item To demonstrate the portability of the CHERI approach across multiple
architectures, and in particular to illustrate how portable CHERI software
stacks can be designed and maintained despite underlying architectural
differences.
\item To apply lessons learned from CHERI-MIPS in an entirely fresh
application of the protection model to a new architecture.
Many of our MIPS design choices were pragmatic decisions made prior
to the development of full compiler and operating-system stacks, and were
difficult to change within those stacks.
\item To revisit and scientifically explore a design space around CHERI
integration into a target architecture -- for example, around the use of
register files and exceptions.
\item To support new CHERI experimentation in the space of microcontrollers,
heterogeneous cores and accelerators, and DMA, as well as in relation to
microarchitectural side channels.
\item To lay groundwork for possible open-source transition of the CHERI
protection model into the RISC-V architecture.
\end{itemize}
In the following subsections, we describe our high-level approach before
providing a more detailed specification of CHERI-RISC-V.
\subsection{Target RISC-V ISA Variants}
The RISC-V ISA defines both 32-bit (\texttt{XLEN}=32) and 64-bit
(\texttt{XLEN}=64) base integer instruction
sets (RV32I, RV64I).
Our current proposal supports either mode with few differences beyond
capability width, although safe dynamic support for both modes in a single
processor is not specified at this time.
\pgnnote{This may be understated. It may also be misinterpreted,
as to whether what exists is not safe, or not done at all.
That seems ambiguous. The RISC-V privileged spec itself
is mostly parameterized for XLEN (Prashanth), and I would presume
perhaps the CHERI-RISC-V might also, as it seems is described below.}
\pdrnote{I think adding the word dynamically clears up this confusion?}
Our definition of CHERI-RISC-V should work with either 32-register or
16-register (RV32E) variants of RISC-V.
We specify CHERI as applied to RVG, which consists of the general-purpose
elements of the RISC-V ISA: integer, multiplication and division,
atomic, floating-point, and double floating-point instructions.
We also describe extensions to the supervisor-level and machine-level ISAs
defined in the
privileged portion of the ISA.
We view 64-bit CHERI-RISC-V as a mature specification suitable as a
starting point for an official RISC-V extension. However, we feel
that 32-bit CHERI-RISC-V is less mature. In particular, the current
encoding for its 64-bit capabilities provides insufficient precision.
Further research is needed to determine if an alternate encoding,
perhaps using an alternate scheme for permissions, can provide better
precision.
\subsection{CHERI-RISC-V Strategy}
Wherever possible, we attempt to conform to the specific aesthetic of RISC-V,
such as with respect to opcode layout choices and aligning the semantics of
new Special Capability Register access instructions with existing RISC-V CSRs.
\pgnnote{This begs the question of whether we will remain fully compliant
with the RISC-V privileged spec, or must necessarily deviate.}
\subsection{Common Architectural Features}
CHERI-RISC-V shares the following features with other CHERI architectures:
\begin{itemize}
\item Tagged memory with capability-width tag granularity and alignment.
\item Registers able to hold capabilities are tagged.
\item \PCC{} constrains program-counter-relative fetches.
\item \DDC{} constrains legacy RISC-V load-store instructions.
\item Floating point is fully supported, including capability-relative
floating-point load and store instructions.
\item General-purpose registers are extended to hold capabilities.
\item Capability-related violations (such as loads/stores/fetches via untagged
capabilities, out-of-bound accesses, and so on) trigger immediate precise
exceptions.
\item Requests for non-monotonic capability transformations result in
the tag of the written back value being stripped.
\item It is never left ambiguous as to whether a register index operand to a
load or store instruction, or the register target of a jump instruction,
is a capability and therefore must have a tag set.
This reinforces intentionality.
\item \cappermASR limits privileged ISA
operations when within privileged rings.
While RISC-V's specific privileged operations differ, the intent remains the
same: to allow code compartmentalization within the privileged ring.
\end{itemize}
\subsection{Unique Architectural Features}
The following changes are specific to CHERI-RISC-V:
\begin{itemize}
\item RISC-V exception handling -- including register banking, scratch
registers, and cause mechanism -- is used.
\item A new exception code, \riscvloadcappagefault{}, is
reported in the RISC-V \xcause{} CSRs when a load attempts to fetch a
capability through a valid page table entry granting read permission but
forbidding loads of capabilities. This fault otherwise behaves like a RISC-V
load page fault.
\item A new exception code, \riscvstorecappagefault{}, is
reported in the RISC-V \xcause{} CSRs when a store attempts to write a
capability through a valid page table entry granting write permission but
forbidding stores of capabilities. This fault otherwise behaves like a
RISC-V store/AMO page fault.
\item A new exception code, \riscvcheriexception{}, is
reported in the RISC-V \xcause{} CSRs when other
capability-related exceptions (such as tag violations) occur.
%
Additional capability-specific information, such as a more
specific cause and the identity of the faulting register, is reported
in the existing \xtval{} CSRs (see
Section~\ref{subsection:riscv:cheri-exception-reporting}).
\item A new bit is defined in \menvcfg{} and \senvcfg{} to enable
CHERI support.
\item New per-mode capability CSRs are added as \xccsr{} (see
Section~\ref{subsubsec-ccsrs}).
\item CHERI-related page permissions are added to RISC-V architectural
page-table formats.
\item The interpretation of addresses in memory capabilities
depends on whether virtual addressing is enabled via the RISC-V
\texttt{satp} CSR\footnote{This design choice does not differ
substantially from that made in other architectures: memory
capabilities are interpreted relative to the active address space, and
control of that address space is delegated to suitably privileged code,
whether configuring a simple direct map between virtual and physical memory,
or managing multiple more complex address spaces.
In all cases, care is required as physical-memory access authorized by a
capability is determined by the addressing mode and current translation
table contents.}.
When \texttt{satp} is set to \texttt{Bare}, capabilities have a
physical-address interpretation.
When \texttt{satp} enables page-table translation, capabilities have a
virtual-address interpretation.
\item Both XLEN=32 and XLEN=64 are supported (albeit not dynamically).
In the future, it may be desirable to also support XLEN=128.
\item A rich set of atomic instructions is extended with capability
support.
\item The \cflags{} field contains a single bit indicating the ``capability
encoding mode'' to use when the capability is installed as \PCC{}.
\item In the non-compressed RISC-V encoding, the capability encoding mode
allows existing opcodes, e.g.\ for loads, stores, \insnnoref{AUIPC},
and jumps
to be interpreted as expecting capability rather than integer operands
(reducing opcode footprint while maintaining intentionality).
\item In the compressed RISC-V encoding, the capability encoding mode allows
existing load, store, jump, and stack addressing opcodes to be interpreted as expecting
capability rather than integer operands.
\end{itemize}
\section{CHERI-RISC-V Specification}
In this section, we describe in greater detail the integration of CHERI into
the RISC-V instruction set.
Instruction opcode encodings can be found in
Appendix~\ref{app:isaquick-riscv}.
\subsection{CHERI as a non-standard RISC-V extension}
CHERI is integrated into the RISC-V ISA as a non-standard extension
named Xcheri, and follows the idioms for RISC-V extensions to the
extent possible. In the extension terminology of the RISC-V
specification, CHERI is mostly a \emph{greenfield} extension since it adds
most new instructions by populating a new instruction encoding space. The
prefix used for the encoding is currently ``1011011'', placing it in
the \emph{custom-2/rv128} opcode space that the specification makes
available for custom instruction-set extensions on RV64; this makes it a
standard-compatible global encoding. However, we also propose a few
new instructions in existing encoding ranges. The new instructions to
load and store capabilities are \emph{brownfield} extensions to the
LOAD and STORE opcodes in the base integer ISAs. In addition, CHERI
adds new atomic operation instructions which are \emph{brownfield}
extensions to the AMO opcode.
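As a small worked example of this encoding (an illustrative restatement
on our part rather than additional specification), the 7-bit prefix
corresponds to the following value in the low seven bits of a 32-bit
instruction word~$w$:
\[
\texttt{1011011}_2 = 64 + 16 + 8 + 2 + 1 = 91 = \texttt{0x5B},
\qquad
w \bmod 2^{7} = \texttt{0x5B}.
\]
A decoder could therefore, under this assumption, recognize the
greenfield portion of Xcheri by testing the low seven bits of~$w$; the
brownfield LOAD, STORE, and AMO additions use their existing major
opcodes and are not covered by this test.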
A CHERI-RISC-V processor has the X bit of the \texttt{misa} register
hardwired to 1 on boot to indicate the presence of a non-standard
extension. Information tying this set X bit to the Xcheri extension
would be communicated to system software in a platform-specific manner.
CHERI-RISC-V is currently defined as a non-standard extension to
version 2.2 of the RISC-V user-level ISA~\cite{RISCV:User:2.2} and
version 1.11 of the RISC-V privileged ISA~\cite{RISCV:Privileged:1.11}.
\subsection{Tagged Capabilities and Memory}
CHERI-RISC-V allows both registers and memory to hold tagged capabilities,
allowing capabilities and data to be intermingled.
This allows capabilities to be embedded within in-memory data structures,
supports the implementation of capability-oblivious memory copy operations,
and maintains strong C-language pointer compatibility for capabilities.
This implies the use of tagged memory consisting of 1-bit
tags protecting capability-aligned, capability-sized words of memory
implemented with suitable protection and atomicity properties.
While we currently do not define CHERI-RISC-V support for RV128, we anticipate
that we will wish to support RV128 in the future.
It seems plausible that 256-bit capabilities might incorporate 128-bit
addresses along with compressed bounds in a similar manner to our 128-bit
capabilities for 64-bit addresses.
\subsection{Capability Register File}
In CHERI-RISC-V,
general-purpose integer registers are extended to optionally hold
full
capabilities, along with a tag.
Extending general-purpose integer registers raises the
question of whether and how capability-unaware instructions should
interact with capability values in registers -- a concern not dissimilar to
the behavior of instructions on 64-bit architectures offering legacy 32-bit
support.
We specify that individual instructions reading from, or writing to, a
register in the register file have fixed integer or capability interpretations
based on the opcode encoding -- i.e., that new instructions be introduced that
explicitly specify whether capability semantics are required for an input or
output register, or that the current architectural mode unambiguously specify
integer or capability operand interpretation.
The bottom \texttt{XLEN} bits of the register contain the integer
interpretation (which, for a capability, will be its address\pdrnote{Does this also
cover capabilities which authorise type space rather than address space?}), and the
top \texttt{XLEN} bits (plus additional tag bit) contain any capability
metadata.
When a register is read as an integer (i.e., using an opcode that dictates an
integer interpretation),
the register's bottom \texttt{XLEN} bits will be utilized, and any other bits ignored.
When a register is written as an integer, the new integer value is
stored in its bottom \texttt{XLEN} bits,
and the top \texttt{XLEN} bits and tag bit are cleared to match
those of the NULL capability. This both prevents in-register corruption of tagged
capabilities by implicitly clearing the tag, and also provides reasonable semantics
for integer access to capability values.
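As an informal sketch of these rules (the notation is ours, introduced
for illustration only, and is not part of the specification), write a
register as a triple $(t, m, a)$ consisting of the tag bit, the top
\texttt{XLEN} bits of metadata, and the bottom \texttt{XLEN} bits of
address, and let $m_{\mathrm{NULL}}$ denote the metadata bits of the NULL
capability.
Integer accesses then behave roughly as follows:
\[
\begin{array}{lcl}
\mathit{read}_{\mathrm{int}}\bigl((t, m, a)\bigr) & = & a \\
\mathit{write}_{\mathrm{int}}\bigl((t, m, a),\, v\bigr) & = & (0,\; m_{\mathrm{NULL}},\; v)
\end{array}
\]
In particular, writing back a value that was just read as an integer
does not restore a previously valid capability: the result of
$\mathit{write}_{\mathrm{int}}(r, \mathit{read}_{\mathrm{int}}(r))$
always has its tag clear.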
\subsubsection{Capability Length Architectural Constant (CLEN)}
One challenge in introducing CHERI support is that the architectural constant,
\texttt{XLEN}, the number of bits in a register, is used to define numerous
behaviors throughout the ISA, such as the size of CSRs, the operation of
integer operations, the size of addresses, and so on.
We choose to leave \texttt{XLEN} unchanged, as the majority of these operations
are intended to be of the natural integer size (e.g., for addition).
However, this does mean that in some cases we need to introduce new
instructions intended to operate on full capability-wide values.
We introduce a new architectural constant, \texttt{CLEN}, defined as
$2\times$\texttt{XLEN}; this width excludes the tag bit.
Operations such as capability-width CSR access, capability load, and capability
store will operate on \texttt{CLEN}$+1$ bits including the tag bit.
Specifically, for 32-bit CHERI-RISC-V, \texttt{CLEN} is 64 bits, and for
64-bit CHERI-RISC-V, \texttt{CLEN} is 128 bits, affecting a variety of
functions including the stride of tag bits in physical memory.
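These widths can be tabulated as follows.
This is a worked restatement of the definitions above; the final two
columns are our own arithmetic and assume exactly one tag bit per
capability-sized, capability-aligned word of memory:
\[
\begin{array}{ccccc}
\texttt{XLEN} & \texttt{CLEN} = 2 \times \texttt{XLEN} & \texttt{CLEN} + 1 &
\mbox{tag stride (bytes)} & \mbox{tag storage overhead} \\
32 & 64 & 65 & 8 & 1/64 \approx 1.6\% \\
64 & 128 & 129 & 16 & 1/128 \approx 0.8\%
\end{array}
\]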
Opcode space is reserved in the RISC-V ISA for 64-bit load and store
instructions even when \texttt{XLEN} is 32, and we reuse these opcode
reservations and encodings to load 64-bit \texttt{CLEN} words as well as
their tag bit.
Similarly, when \texttt{XLEN} is 64, we use the anticipated 128-bit
\texttt{CLEN} load and store opcodes.
We do not currently define support for 32-bit compatibility (with or without
capability support) when operating in a 64-bit RISC-V processor, but
anticipate that adding non-capability-aware 32-bit support would be
straightforward.
We also do not yet define an architecture supporting multiple capability
widths concurrently, but recognize that there are certain use cases -- such as
when interoperating between a 64-bit application core and a 32-bit
microcontroller within a single System-on-Chip (SoC) -- where this would be
valuable.
\subsection{Capability-Aware Instructions}
In CHERI-RISC-V, two general categories of instructions are added: those that
query or manipulate capability fields within registers, and those that
utilize capability registers for the purposes of load, store, or jump operations.
Register-to-register instructions querying and manipulating fields allow integer values to be moved in and
out of portions of an in-register capability, subject to guarded manipulation.
They are simply new instructions defined in CHERI-RISC-V and added to
the opcode space.
It is possible to imagine having memory-access and
control-flow instructions condition their behavior based on the presence of a
tag, selecting a compatible integer behavior if the tag is not set, and a
capability behavior if it is set.
However, this would violate the principle of intentional use: not only should
privilege be minimized, but it should not be unintentionally, implicitly, or
ambiguously exercised.
Allowing a corrupted capability (i.e., one with its tag stripped due to an
overlapping data write) to dereference \DDC{} implicitly would violate this
design goal.
We therefore specify strong \textit{type safety} for all capability-aware
instructions: all instructions explicitly encode whether an integer or
capability operand is being used, and attempts to use untagged values where
tagged ones are expected will lead to an exception.
\subsection{Control and Status Registers (CSRs)}
\label{subsection:cheri-riscv-csrs}
CHERI-RISC-V extends the behavior of the baseline RISC-V integer CSR set,
allowing capability control over access to some CSRs for compartmentalization
purposes, as well as adding several new CSRs to control capability-related
functionality.
These are accessed via existing RISC-V CSR instructions, and their encodings
are given in Table~\ref{tab:risc-v-control-and-status-registers}.
New Special Capability Registers (SCRs), accessed via new CSR-like
instructions, are described in Section~\ref{subsection:cheri-riscv-scrs}.
\begin{table}[h]
\centering
\begin{tabular}{c>{\raggedright\arraybackslash}p{2.7in}>{\raggedright\arraybackslash}p{2.5in}}
\toprule
\textbf{Encoding} & \textbf{Register} & \textbf{Privilege notes} \\
\midrule
\textbf{0x8C0} & User capability control and status register (\uccsr{}) & \PCC{}.\cperms{}.\emph{Access\_System\_Registers} \\
\textbf{0x9C0} & Supervisor capability control and status register (\sccsr{}) & \{S,M\}-mode \& \PCC{}.\cperms{}.\emph{Access\_System\_Registers} \\
\textbf{0xBC0} & Machine capability control and status register (\mccsr{}) & M-mode \& \PCC{}.\cperms{}.\emph{Access\_System\_Registers} \\
\bottomrule
\end{tabular}
\caption{Control and Status Registers (CSRs)}
\label{tab:risc-v-control-and-status-registers}
\end{table}
\subsubsection{Controlling Access to CSRs}
Accessing some RISC-V CSRs also requires the \PCC{}.\cperms{}.\emph{Access\_System\_Registers}
permission to be set for the currently executing code.
This allows privileged-level code to be constrained from interfering with key
system management functionality (such as exception handling).
We adopt a whitelist approach: reading or writing any CSR requires the permission, with the exceptions listed in Table~\ref{tab:risc-v-access-system-registers-whitelist}.
\pdrnote{Text describing current makeshift whitelist, pending updates. TODO we want three separate permissions: UASR, SASR, MASR, permitting access only to the
corresponding privilege mode's CSRs. Debate still open over whether these are encoded as 3 bits, or as a 2-bit counter that monotonicity can only decrease
(make less privileged). This could either be a separate instruction or implicitly enforced on CAndPerms. We likely still want some kind of whitelist after this change}
\ajnote{I am not sure I am fully happy with that, but probably just need to be convinced by someone that this is the way to go... It feels weird that a user task which used to just access, say, the instret or time csr suddenly needs Access\_System\_Registers on its PCC.}
\pmnote{User access to the instret and time csrs are already constrained by scounteren.{IR,TM}; i.e. the user task will suffer an exception if these are not enabled by the supervisor. Access\_System\_Registers on its PCC would be similar.}
\begin{table}[h!]
\centering
\begin{tabular}{cc}
\toprule
\textbf{CSR} & \textbf{Read/Write} \\
\texttt{cycle(h)} & Read-Only \\
\texttt{time(h)} & Read-Only \\
\texttt{instret(h)} & Read-Only \\
\texttt{hpmcounter(h)} & Read-Only \\
[1.5em]
\texttt{fflags} & Read-Write \\
\texttt{frm} & Read-Write \\
\texttt{fcsr} & Read-Write \\
\bottomrule
\end{tabular}
\caption{CSR Whitelist. The accesses shown are the only CSR accesses that are permitted when the installed PCC does not have \cappermASR{}.}
\label{tab:risc-v-access-system-registers-whitelist}
\end{table}
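The whitelist rule can be summarized informally as a predicate; the names
$W_{\mathrm{RO}}$, $W_{\mathrm{RW}}$, $p_{\mathrm{ASR}}$, and
$\mathit{permitted}$ are ours and purely illustrative.
Let $W_{\mathrm{RO}}$ and $W_{\mathrm{RW}}$ be the sets of CSRs marked
Read-Only and Read-Write in
Table~\ref{tab:risc-v-access-system-registers-whitelist}, let
$p_{\mathrm{ASR}}$ hold when the installed \PCC{} grants \cappermASR{},
and let $c$ be the CSR accessed by an operation
$\mathit{op} \in \{\mathrm{read}, \mathrm{write}\}$:
\[
\mathit{permitted}(c, \mathit{op}) \iff
p_{\mathrm{ASR}} \;\lor\; c \in W_{\mathrm{RW}} \;\lor\;
\bigl(c \in W_{\mathrm{RO}} \land \mathit{op} = \mathrm{read}\bigr).
\]
This sketch covers only the capability-permission check; the usual
RISC-V privilege-mode and counter-enable checks are assumed to apply in
addition.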
\subsubsection{CHERI Extension Control}
A new bit in the \menvcfg{} and \senvcfg{} CSRs enables
CHERI for lower privilege levels. When CHERI is disabled, attempting
to execute a CHERI-specific instruction raises an illegal-instruction
exception. This includes loads and stores that use a capability
register as the memory operand, but not legacy loads and stores,
which use \DDC{} only as an implicit operand.
Other CHERI extensions are always enabled regardless of the state of
this bit. Specifically, bounds and permissions on \PCC{} and \DDC{}
are always honored. Exceptions always copy \PCC{} to \xEPCC{} on
exception entry and restore the full \PCC{} on exception return.
Capability mode is always honored if enabled in \PCC{}. Software
which disables CHERI in lower modes must take care to ensure that
\PCC{} and \DDC{} are set to suitable values while lower modes
execute.
Bit 28 (\texttt{0x1C}) in the \menvcfg{} and \senvcfg{} CSRs is
defined as the CHERI enable bit. Its allocation within these CSRs may
change until CHERI is ratified as a RISC-V extension.
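For concreteness, and subject to the caveat above that the allocation
may change, the corresponding mask and test for a CSR value $x$ would be
as follows (the names $\mathit{mask}$ and $\mathit{enabled}$ are
illustrative only):
\[
\mathit{mask} = 2^{28} = \texttt{0x10000000},
\qquad
\mathit{enabled}(x) \iff \lfloor x / 2^{28} \rfloor \bmod 2 = 1.
\]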
\subsubsection{Capability Control and Status Registers (CCSRs)}
\label{subsubsec-ccsrs}
New per-hart \xccsr{} CSRs, each \texttt{XLEN} bits wide, are defined with
the format given in Figure~\ref{fig-ccsr} (shown for XLEN=32):
\begin{figure}[!h]
\begin{center}
\begin{bytefield}[bitwidth=\textwidth/34]{32}
\bitheader[endianness=big]{0,30,31} \\
\bitbox{1}{\texttt{tc}}
\bitbox{1}{\texttt{nr}}
\bitbox{30}{\textbf{WPRI}}
\end{bytefield}
\caption{\xccsr{} register format; WPRI bits are Write Preserve Read Ignore.}
\label{fig-ccsr}
\end{center}
\end{figure}
\begin{description}
\item [nr] The \texttt{nr} ``no relocation'' read-only bit indicates
if integer memory addresses are relocated by the base address of
\DDC{} and \PCC{}. If this bit is 1, then integer addresses are not
relocated.
\item [tc] The \texttt{tc} ``tag-clearing'' read-only bit indicates if
attempts to update a capability non-monotonically clear the
resulting capability's tag (1) rather than raising a
\riscvcheriexception{} exception (0).
\end{description}
An implementation compliant with the current version of this
specification sets the \texttt{nr} and \texttt{tc} bits to 1 and all
other bits to 0 at reset.
\subsection{Special Capability Registers (SCRs)}
\label{subsection:cheri-riscv-scrs}
Special Capability Registers (SCRs) are similar to CSRs in that they affect
special functions such as exception delivery, rather than being
general-purpose registers, but have capability rather than integer types.
SCRs are therefore accessed via new capability-aware instructions.
The new \asm{CSpecialRW} instruction allows reading and writing special
capability registers. When the destination is the zero register, the instruction shall
not read the special capability register and shall not cause any of the
side-effects that might occur on a special capability register read, similar to
the standard \asm{csrrw} RISC-V instruction. When the source is the zero register, the
instruction will not write to the special capability register at all, and so
shall not cause any of the side effects that might otherwise occur on a special
capability register write, similar to the standard \asm{csrrs/c} RISC-V
instructions.
Table~\ref{tab:risc-v-special-capability-registers} lists the SCRs
available via that instruction, as well as their values at CPU reset, which
will be set in a manner consistent with the description in
Section~\ref{sec:capability-state-on-cpu-reset}.
Whether a register is initialized to NULL or the omnipotent capability, its
flags field is initialized to zero (specifying integer encoding mode).
\pdrnote{I have added MTDC, UTDC, STDC to match exception handling section.
Might be worth further discussion to decide if we need/want them.}
\begin{table}[h!]
\centering
\begin{tabular}{cllcccc@{}}
\toprule
& \textbf{Register} & \textbf{Modes} & \textbf{Access} & \textbf{Reset} & \textbf{Extends} \\ \midrule
\textbf{0} & Program counter capability (\PCC{}) & U, S, M & RO & $\infty$ & \PC{} \\
\textbf{1} & Default data capability (\DDC{}) & U, S, M & - & $\infty$ & - \\
[1.5em]
\textbf{4} & User trap code capability (\UTCC{}) & U, S, M & ASR & $\infty$ & \utvec{} \\
\textbf{5} & User trap data capability (\UTDC{}) & U, S, M & ASR & $\emptyset$ & - \\
\textbf{6} & User scratch capability (\UScratchC{}) & U, S, M & ASR & $\emptyset$ & \uscratch{} \\
\textbf{7} & User exception PC capability (\UEPCC{}) & U, S, M & ASR & $\infty$ & \uepc{} \\
[1.5em]
\textbf{12} & Supervisor trap code capability (\STCC{}) & S, M & ASR & $\infty$ & \stvec{} \\
\textbf{13} & Supervisor trap data capability (\STDC{}) & S, M & ASR & $\emptyset$ & - \\
\textbf{14} & Supervisor scratch capability (\SScratchC{}) & S, M & ASR & $\emptyset$ & \sscratch{} \\
\textbf{15} & Supervisor exception PC capability (\SEPCC{}) & S, M & ASR & $\infty$ & \sepc{} \\
[1.5em]
\textbf{28} & Machine trap code capability (\MTCC{}) & M & ASR & $\infty$ & \mtvec{} \\
\textbf{29} & Machine trap data capability (\MTDC{}) & M & ASR & $\emptyset$ & - \\
\textbf{30} & Machine scratch capability (\MScratchC{}) & M & ASR & $\emptyset$ & \mscratch{} \\
\textbf{31} & Machine exception PC capability (\MEPCC{}) & M & ASR & $\infty$ & \mepc{} \\
\bottomrule
\end{tabular}
\caption{Special Capability Registers (SCRs).
SCRs 4--7 are available only with the N extension, and 12--15 only with
supervisor mode.
\textbf{Modes} shows which RISC-V privilege modes are allowed to access the
registers.
\textbf{Access} indicates additional restrictions on accessing the registers:
\PCC{} is read-only via \insnriscvref{CSpecialRW}, but is set by
\insnriscvref{CJALR} and during exceptions; \textit{ASR} indicates
\PCC{}.\cperms{} must grant \cappermASR{} to permit access (in addition to
being in a permitted mode).
\textbf{Reset} indicates whether the register is initialised to the default
root capability ($\infty$) or NULL capability ($\emptyset$) on reset.
Some special capability registers are extensions of existing RISC-V
registers, with the capability address being equal to the original register.
\note{We should describe this in more detail including behavior if they are
sealed or become unrepresentable and what to do about PC alignment. Note this
table shares quite a lot with \cref{subsection:riscv:exceptionhandling}}{rmn30}
}
\label{tab:risc-v-special-capability-registers}
\end{table}
Where an SCR extends a RISC-V CSR, e.g.\ \MTCC{} extending \mtvec{},
any read to the CSR shall return the address of the corresponding SCR.
Similarly, any write to the CSR shall set the address of the SCR to the value
written.
This shall be equivalent to a \insnriscvref{CSetAddr} instruction.
This allows sealed capabilities to be held in SCRs without allowing them to
be modified in a tag-preserving way.
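Informally, writing $\mathit{addr}(\cdot)$ for the address field of a
capability (notation introduced here only for illustration, and ignoring
any legalization of the written value), the aliasing between an SCR $S$
and the CSR it extends behaves as:
\[
\begin{array}{lcl}
\mbox{CSR read} & : & \mbox{result} = \mathit{addr}(S) \\
\mbox{CSR write of } v & : & S \leftarrow \mbox{CSetAddr}(S,\, v)
\end{array}
\]
Because the update follows \insnriscvref{CSetAddr} semantics, a CSR
write to an SCR holding a sealed capability is expected to leave $S$
untagged rather than to produce a modified sealed capability.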
Some RISC-V CSRs have write ignore bits, or otherwise implicitly modify
the written value to restrict the CSR to legal values.
These modifications must be applied to the SCR's new address when writing a CSR
extended by an SCR, or to the address of the newly written capability when
using \insnriscvref{CSpecialRW}.
\insnriscvref{CSpecialRW} of a sealed capability to an SCR which extends a CSR
with any non-preserved bits clears the tag on the capability, even if the
address would not be changed.
As elsewhere in the CHERI-RISC-V specification, should the SCR become
unrepresentable as a result of the address being set, the tag is cleared
but the resulting address and the encoded capability metadata are preserved.
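
As an informal summary of the rules above (a sketch in our own notation, not
normative text), let $c$ be the capability currently held in the SCR, $v$ the
integer written via the CSR, and $\ell(\cdot)$ the CSR's legalization of a
written value (write-ignored bits and similar adjustments).
The names $\ell$, $\mathit{tag}$, $\mathit{sealed}$, $\mathit{addr}$, and
$\mathit{rep}$ are illustrative assumptions rather than definitions used
elsewhere in this document.
A CSR write then leaves the SCR with address $a'$ and tag $t'$:
\[
\begin{array}{rcl}
a' & = & \ell(v) \\
t' & = & \mathit{tag}(c) \land \lnot\mathit{sealed}(c) \land \mathit{rep}(c, a')
\end{array}
\]
where $\mathit{rep}(c, a')$ holds when $c$ remains representable with address
$a'$.
For \insnriscvref{CSpecialRW} of a capability $w$ to an SCR that extends a
CSR, the corresponding sketch is:
\[
\begin{array}{rcl}
a' & = & \ell(\mathit{addr}(w)) \\
t' & = & \mathit{tag}(w) \land \mathit{rep}(w, a') \land
         \lnot(\mathit{sealed}(w) \land n)
\end{array}
\]
where $n$ is true when the extended CSR has any non-preserved bits.
In both cases the remaining capability metadata is carried over unchanged.
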
\subsection{Capability Encoding Mode}
\label{sec:cheri-riscv-capmode}
RISC-V instructions that interpret arguments or results as addresses
(e.g.\ loads, stores, jumps, \insnnoref{AUIPC}) can either act on integer pointers
or on explicit capabilities.
For example, capability-relative load and store instructions accept (and expect) capability
operands that constrain data accesses, performing tag, bounds,
permission, and other checks as required.
However, load and store instructions occupy large amounts of instruction
encoding space due to having multiple register operands and large immediate
values.
To avoid occupying large chunks of remaining encoding space by
supplementing each address-manipulating instruction with a
corresponding capability-relative version,
we introduce a new \textit{capability encoding mode} in which
some existing RISC-V opcodes are reused for capability-relative
accesses.
The encoding mode is selected using the CHERI-RISC-V-specific
encoding-mode flag in the capability \cflags{} field of \PCC{}:
\begin{description}
\item[Integer encoding mode (0)] Conventional RISC-V execution mode, in which
address operands to existing RISC-V load, store, jump, and \insnnoref{AUIPC} opcodes contain
\textit{integer addresses}.
The upper \texttt{XLEN} bits and tag bit of
the operand register are ignored.
For loads and stores, the
tag bit on \DDC{} must indicate that a valid capability is present, and
all capability-related checks (such as bounds checks) must be performed in
order for a successful load or store to take place.
\item[Capability encoding mode (1)] CHERI capability encoding mode, in which address operands to
existing RISC-V load, store, jump, and \insnref{AUIPCC} opcodes contain \textit{capabilities}.
For loads and stores, the tag bit must indicate a valid capability is present, and all
capability-related checks (such as bounds checks) must be performed in order
for a successful load or store to take place.
\end{description}
To maintain intentionality, this approach is never ambiguous in either mode
as to whether memory accesses are relative to an
integer or capability operand: address operands of existing RISC-V
opcodes are always integer relative
in integer encoding mode, and always capability relative in capability
encoding mode.
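
Informally, for an existing load or store opcode whose address operand is
register index $r$, the capability that authorizes the access can be
summarised as follows (a sketch in our own notation; $\mathit{auth}$ and
$\mathit{flag}$ are illustrative names, not fields defined by this
specification):
\[
\mathit{auth} = \left\{
\begin{array}{ll}
\mbox{\DDC{}} & \mbox{if } \mathit{flag} = 0 \mbox{ (integer encoding mode)} \\
\mbox{capability register } r & \mbox{if } \mathit{flag} = 1 \mbox{ (capability encoding mode)}
\end{array}
\right.
\]
where $\mathit{flag}$ is the encoding-mode flag held in the \cflags{} field of
\PCC{}.
In either case, $\mathit{auth}$ must be tagged and must pass all
capability-related checks for the access to succeed.
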
The operating system will automatically save and restore \PCC{} on context
switches, preserving an execution context's encoding mode.
It is essential that changes in encoding mode be properly observed when an
exception is processed, as the exception handler must execute with expected
semantics or risk insecure behavior.
When \xTCC{} is set by the operating system, it should contain an appropriate
encoding-mode flag to ensure that exception handlers utilize the correct
instruction encoding.
In addition, a small set of both capability-relative and
integer-relative loads, stores, and jumps are added, tuned to limit opcode
space utilization -- e.g., by having small or no immediates -- at the cost
of increased code footprint. These instructions are available in both
encoding modes to permit alternate memory accesses.
Pure-capability and hybrid code can be generated against either encoding,
but will be most efficient (in terms of instruction footprint) when
generated against the corresponding mode.
\subsubsection{Non-Compressed Instructions Affected by Capability Encoding
Mode}
The following non-compressed RISC-V instructions are
affected by the capability encoding-mode bit (see the following section for
further details on compressed instructions):
\medskip
\begin{savenotes}
\begin{tabular}{llllll}
\textit{Integer load} & LB & LH & LW & LD & LQ \\
\textit{Integer load (unsigned)} & LBU & LHU & LWU & LDU & \\
\textit{Integer store} & SB & SH & SW & SD & SQ \\
\textit{Floating-point load} & FLW & FLD & FLQ & & \\
\textit{Floating-point store} & FSW & FSD & FSQ & & \\
\textit{Atomic} & LR & SC & AMOSWAP & AMOADD & AMOAND \\
\textit{Atomic (cont)} & AMOOR & AMOXOR & AMOMAX & AMOMIN & \\
\textit{Control flow} & JAL & JALR & & & \\
\textit{Address calculation} & AUIPC\footnote{See Section~\ref{section:cheri-risc-v-auipc}.} & & & & \\
\end{tabular}
\end{savenotes}
\subsection{Compressed Instructions}
\label{subsection:compressed-instructions}
While the compressed instruction extension is not mandatory for RV32G
and RV64G, it is widely used in existing RISC-V software to improve
code density.
Given the tight encoding space for compressed instructions, it is not
practical to support both integer and capability variations of common
instructions. Instead, the encoding mode must be used to alter the
interpretation of existing instructions to support optimal code
density.
Two problems arise in adding compressed instruction support for capabilities:
determining which existing instructions should use capability semantics
and the need to add new instructions to load and store capabilities.
For some instructions, the choice to use capability semantics in
capability encoding mode is straightforward. Compressed loads and
stores should use capability-relative addresses just as for
non-compressed instructions. Similarly, compressed jump instructions
should use capability registers for jump target registers and link
registers. Finally, instructions which use the stack pointer as an
implicit operand should use \CSP{} as the stack pointer. This
includes loads and stores as well as the stack addressing instructions
\insnnoref{C.ADDI16SP} and \insnnoref{C.ADDI4SPN}.
For other instructions, the decision is less clear. Pure capability
code will use both \insnnoref{MV} and \insnref{CMove} instructions as
well as both \insnnoref{ADDI} and \insnref{CIncOffsetImm}. Our current
approach for these types of instructions has been to not alter their
semantics in capability encoding mode until more research can be done
to determine the most common semantics in pure capability code.
Adding compressed instructions to load and store capabilities requires
repurposing some existing opcodes. For this case we follow the
pattern used by RV64C and RV128C of repurposing compressed
floating-point loads and stores to load and store capabilities. Just as RV64C
reuses the floating-point single-precision loads and stores
(e.g.\ \insnnoref{C.FLW} and \insnnoref{C.FSW}) for 64-bit integer
loads and stores (\insnnoref{C.LD} and \insnnoref{C.SD}), compressed
CHERI-RV32 reuses these opcodes in capability-encoding mode for
capability loads and stores (\insnnoref{C.CLC} and \insnnoref{C.CSC}).
Similarly, CHERI-RV64 reuses the opcodes for floating-point double
loads and stores (\insnnoref{C.FLD} and \insnnoref{C.FSD}) in
capability-encoding mode for capability loads and stores. The same
rules apply to stack-relative memory access instructions.
When a compressed instruction in capability encoding mode encodes a
capability register operand using a 3-bit field rather than a 5-bit
field, the selected capability register is one of
\texttt{c8}--\texttt{c15}, using the same mapping used for
\texttt{x8}--\texttt{x15} in integer encoding mode.
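
For example, writing $f$ for the value of the 3-bit register field (our
notation), the selected register index is
\[
r = 8 + f, \qquad 0 \le f \le 7,
\]
so a field value of $f = 2$ (binary \texttt{010}) names \texttt{c10} in
capability encoding mode, just as it names \texttt{x10} in integer encoding
mode.
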
\subsubsection{Compressed Instructions Affected by Capability Encoding Mode}
The following compressed instructions are affected by capability encoding
mode:
\medskip
\begin{tabular}{llllll}
\textit{Stack addressing} & C.ADDI4SPN & C.ADDI16SP & \\
\textit{Control flow} & C.JAL & C.JALR & C.JR & \\
\textit{Compressed integer load} & C.LW & C.LD & C.LWSP & C.LDSP & \\
\textit{Compressed integer store} & C.SW & C.SD & C.SWSP & C.SDSP & \\
\textit{Compressed floating-point load} & C.FLW & C.FLD & C.FLWSP & C.FLDSP & \\
\textit{Compressed floating-point store} & C.FSW & C.FSD & C.FSWSP & C.FSDSP & \\
\end{tabular}
\subsection{Floating Point}
The vast majority of floating-point instructions are not impacted by the
presence of CHERI-RISC-V.
Existing RISC-V floating-point load and store instructions use
capability-relative addresses in capability encoding mode, and
integer-relative addresses constrained by \DDC{} in integer encoding mode.
The floating-point control
registers (\texttt{fcsr}, \texttt{frm}, and \texttt{fflags}) are whitelisted in Table~\ref{tab:risc-v-access-system-registers-whitelist}
so they can be accessed without needing \cappermASR{}.
\subsection{Exception Handling}
\label{subsection:riscv:exceptionhandling}
RISC-V defines several privilege modes, including machine mode, user mode, and
supervisor mode, with exceptions allowing controlled transition between those modes.
CHERI-RISC-V introduces several new exception-related Special Capability Registers
to supplement existing RISC-V exception CSRs with new capability-related functionality.
In addition, when a capability exception is raised, \xtval{} provides
details about the exception as described in Section~\ref{subsection:riscv:cheri-exception-reporting}.
\subsubsection{Exceptions to Machine Mode}
We define the following new special capability registers that can be read and
written only from machine mode:
\begin{itemize}
\item \MEPCC{} - Machine Mode Exception Program Counter Capability (extends
\mepc{})
\item \MTDC{} - Machine Mode Data Capability
\item \MTCC{} - Machine Mode Trap Code Capability (extends \mtvec{})
\item \MScratchC{} - Machine Mode Scratch Capability (extends
\mscratch{})
\end{itemize}
\subsubsection{Exceptions to Supervisor Mode}
We define the following new special capability registers that can be read and
written only from supervisor mode and above:
\begin{itemize}
\item \SEPCC{} - Supervisor Mode Exception Program Counter Capability (extends
\sepc{})
\item \STDC{} - Supervisor Mode Data Capability
\item \STCC{} - Supervisor Mode Trap Code Capability (extends
\stvec{})
\item \SScratchC{} - Supervisor Mode Scratch Capability (extends
\sscratch{})
\end{itemize}
\subsubsection{Exceptions to User Mode}
When present, we extend the ``N'' extension (for ``User-Level Interrupts'')
with the following
new special capability registers that can be read and written from any mode:
\begin{itemize}
\item \UEPCC{} - User Mode Exception Program Counter Capability (extends
\uepc{})
\item \UTDC{} - User Mode Data Capability
\item \UTCC{} - User Mode Trap Code Capability (extends \utvec{})
\item \UScratchC{} - User Mode Scratch Capability (extends
\uscratch{})
\end{itemize}
This extension could be leveraged for user-space-only implementations
of \insnriscvref{CInvoke}, as well as routing specific interrupts from
suitable devices to user-level compartments for handling by sandboxed
device drivers.
Explicit vector and data capabilities give each ring its
own code and data capabilities to utilize during exception handling.
We extend the existing RISC-V \xscratch{} registers to hold capabilities so
that an exception handler can stash one capability register there, freeing it
as a working register into which the corresponding data capability can be
loaded in order to begin a full context save.
When exception behavior, e.g.\ a trapping instruction, \insnnoref{ecall},
or \xRET{}, causes \PCC{} to take a value stored in an SCR, it is possible that
the SCR contains a capability that would not be a valid \PCC{} (untagged,
sealed, not executable, or improperly aligned).
In these cases, the value is still installed in \PCC{}, and a check on the next
instruction fetch triggers a further exception.
\subsection{Capability Exception Reporting}
\label{subsection:riscv:cheri-exception-reporting}
CHERI-RISC-V extends the definition of the Trap Value CSRs, \xtval{}, to
report capability exception details as described in
Figure~\ref{fig-cheri-tval} (shown for XLEN=32):
\begin{figure}[!h]
\begin{center}
\begin{bytefield}[bitwidth=\textwidth/34]{32}
\bitheader[endianness=big]{0,4,5,10,31} \\
\bitbox{21}{\textbf{WPRI}}
\bitbox{6}{\texttt{cap idx}}
\bitbox{5}{\texttt{cause}}
\end{bytefield}
\caption{\xtval{} register format for Capability Exception}
\label{fig-cheri-tval}
\end{center}
\end{figure}
\begin{description}
\item [cause] The \texttt{cause} field reports the capability
exception code from Table~\ref{tab:risc-v-capability-cause}.
\item [cap idx] The \texttt{cap idx} field reports the index of the capability register that caused the last exception. When
the most significant bit is set, the 5 least significant bits index
the special-purpose capability register file described in
Table~\ref{tab:risc-v-special-capability-registers}; otherwise, they index the
general-purpose capability register file.
\end{description}
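
As a worked example of this layout (a sketch; $v$ denotes the value read from
\xtval{}, which is our notation), the two fields can be recovered as
\[
\texttt{cause} = v \bmod 2^{5}, \qquad
\texttt{cap idx} = \lfloor v / 2^{5} \rfloor \bmod 2^{6}.
\]
For instance, assuming the WPRI bits read as zero, $v = \texttt{0x142}$ gives
$\texttt{cause} = \texttt{0x02}$ (Tag Violation) and $\texttt{cap idx} = 10$;
the most significant bit of \texttt{cap idx} is clear, so the register is
general-purpose capability register \texttt{c10}.
Likewise, $v = \texttt{0x7b8}$ gives $\texttt{cause} = \texttt{0x18}$
(\cappermASR{} Violation) and $\texttt{cap idx} = 61 = 32 + 29$, whose most
significant bit is set, selecting SCR 29 (\MTDC{}).
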
\begin{table}
\begin{center}
\begin{tabular}{ll}
\toprule
Value & Description \\
\midrule
0x00 & None \\
0x01 & Length Violation \\
0x02 & Tag Violation \\
0x03 & Seal Violation \\
0x04 & Type Violation \\
0x05-0x07 & \emph{reserved} \\
0x08 & Software-defined Permission Violation \\
0x09-0x0f & \emph{reserved} \\
0x10 & \cappermG Violation \\
0x11 & \cappermX Violation \\
0x12 & \cappermL Violation \\
0x13 & \cappermS Violation \\
0x14 & \cappermLC Violation \\
0x15 & \cappermSC Violation \\
0x16 & \cappermSLC Violation \\
0x17 & \emph{reserved} \\
0x18 & \cappermASR Violation \\
0x19 & \cappermInvoke Violation \\
0x1a-0x1b & \emph{reserved} \\
0x1c & \cappermCid Violation \\
0x1d-0x1f & \emph{reserved} \\
\bottomrule
\end{tabular}
\end{center}
\caption{CHERI-RISC-V Capability Exception Codes}
\label{tab:risc-v-capability-cause}
\end{table}
\jhbnote{The current exception code values are inherited from
CHERI-MIPS. They should probably be renumbered at some point.}
If an instruction could potentially throw more than one capability exception,
the capability exception code is set to the highest priority exception (numerically lowest
priority value) as shown in Table~\ref{table:risc-v-exception-priority}.
\begin{table}
\begin{center}
\begin{tabular}{ll}
\toprule
Priority & Description \\
\midrule
1 & \cappermASR Violation \\
2 & Tag Violation \\
3 & Seal Violation \\
4 & Type Violation \\
5 & \cappermInvoke Violation \\
& \cappermCid Violation \\
6 & \cappermX Violation \\
7 & \cappermL Violation \\
& \cappermS Violation \\
8 & \cappermLC Violation \\
& \cappermSC Violation \\
9 & \cappermSLC Violation \\
10 & \cappermG Violation \\
11 & Length Violation \\
12 & Software-defined Permission Violation \\
\bottomrule
\end{tabular}
\end{center}
\caption{CHERI-RISC-V Capability Exception Priority}
\label{table:risc-v-exception-priority}
\end{table}
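
To illustrate the priority rule (an informal example of ours, derived only
from Table~\ref{tab:risc-v-capability-cause} and
Table~\ref{table:risc-v-exception-priority}), consider instructions that would
fail more than one check at once:
\begin{center}
\begin{tabular}{lll}
\toprule
Failed checks & Priorities & Reported code \\
\midrule
Tag, Length & 2, 11 & 0x02 (Tag Violation) \\
Seal, \cappermL{} & 3, 7 & 0x03 (Seal Violation) \\
\cappermS{}, Length & 7, 11 & 0x13 (\cappermS{} Violation) \\
\bottomrule
\end{tabular}
\end{center}
In each row the reported code is that of the check with the numerically lowest
priority value.
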
\subsection{Virtual Memory and Page Tables}
\label{subsection:riscv:pagetables}
In CHERI-RISC-V, capability addresses are interpreted with respect to the
privilege level of the processor in line with RISC-V's handling of integer
addresses.
%
In Machine Mode, capability addresses are generally interpreted as physical
addresses; if the \texttt{mstatus} \texttt{MPRV} flag is asserted, then data
accesses (but not instruction accesses) will be interpreted as if performed by
lower-privileged modes.
%
In Supervisor and User Modes, capability addresses are interpreted as dictated
by the current \texttt{satp} configuration: addresses are virtual if paging is
enabled and physical if not.
%
% \hmnote{It is more accurate to say that addresses are interpreted as virtual
% addresses IIF SATP.mode != Bare. There could exist M/U processors with no
% virtual addresses, or even SW that, theoretically but not practically runs on
% M/S/U processors that still work with SATP.mode = Bare across all rings.}
In CHERI-RISC-V, we require \cappermASR{} to change
the page-table root (\texttt{satp}) and other virtual-memory parameters.
(In the future, it may be desirable to extend the page-table walking mechanism
to itself utilize capabilities, allowing the walker to be constrained;
see \cref{app:exp:physcap:ptw}.)
It is desirable to extend the Memory Management Unit
to constrain the loading and storing of valid capabilities via specific page
mappings by adding new permission bits to the current Page Table Entry
(PTE) format.
%
Unfortunately, there are no remaining spare bits in the RISC-V Sv32 (32-bit)
PTE format for additional hardware permissions.
(For the purposes of prototyping, we could utilize the two
available software-defined PTE permission bits -- but these are likely to be
used in current operating systems, requiring a longer-term solution.)
%
The Sv39 (39-bit) and Sv48 (48-bit) PTE formats include several reserved bits,
some of which we allocate for use by CHERI-RISC-V; see \cref{fig:riscv:sv39}.
\subsubsection{Capability Stores}
Capability stores are mediated with two bits per PTE, called CW and CD. Their
effect on capability flow parallels the existing W and D bits and is described
by the following table:
\begin{center}
%
\begin{tabular}{ccl}
\textbf{CW} & \textbf{CD} & \textbf{Behavior} \\
0 & X & Trap on capability stores (exception code \riscvstorecappagefault{}) \\
1 & 0 & Capability stores atomically raise CD or fault (as above) \\
1 & 1 & Capability stores permitted
\end{tabular}
%
\end{center}
\noindent Currently, implementations must apply these behaviors to all
instructions which would store an asserted capability tag; that is, they are
dependent on the tag bit. A future version of this
specification may instead apply them to all instructions which \emph{could} store an asserted
capability tag, removing the dependence on the tag bit. Instructions which are
able to move only data (and so necessarily clear tags) will not interact with
these PTE flags. CHERI-aware Sv32 implementations, lacking room in their PTEs,
act as though CW and CD are \emph{set}.
As with the existing D bit, there are two permitted approaches for hardware to
take in response to an attempted store of an asserted CHERI tag through a
PTE with CD clear:
%
\begin{inenum}
%
\item raise a store capability page fault (exception code
\riscvstorecappagefault{}), or
%
\item atomically update the PTE to set CD. In this case, the existing rules
regarding atomicity continue to apply: the PTW must check, atomically, that
the PTE is valid and has W and CW both set, and the PTE update must become
visible no later than the causal store.
%
\end{inenum}
%
Capability-store instructions are still stores and so are expected to check the
W permission, in addition to CW, and to set the existing D bit (or fault if it
is clear, using the existing RISC-V \xcause{} code) in addition to the CD bit
(or fault, using the new capability store/AMO page fault \xcause{} code).
%
The ordering of checks of, and updates to, the PTE follows the scheme of RISC-V
but interdigitates capability mediation: V, U, and W must be checked first,
followed by CW, before any of D and/or CD (and/or A) are atomically asserted or
are used as grounds for faulting. In the latter case, D and/or A take
precedence over CD.
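
As an illustration of this ordering (a sketch of ours, not an additional
requirement), consider a store of a tagged capability through a PTE whose V,
U, and A checks all pass, on an implementation that faults rather than
updating the PTE in hardware.
X denotes ``don't care'':
\begin{center}
\begin{tabular}{ccccl}
\textbf{W} & \textbf{CW} & \textbf{D} & \textbf{CD} & \textbf{Outcome} \\
0 & X & X & X & Store page fault (existing \xcause{} code) \\
1 & 0 & X & X & Store capability page fault (\riscvstorecappagefault{}) \\
1 & 1 & 0 & X & Store page fault (existing \xcause{} code; D precedes CD) \\
1 & 1 & 1 & 0 & Store capability page fault (\riscvstorecappagefault{}) \\
1 & 1 & 1 & 1 & Store permitted
\end{tabular}
\end{center}
An implementation that instead updates PTEs in hardware would, in the third
and fourth rows, atomically set the clear bits and let the store proceed.
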
The PTE bits CW and CD have no necessary relationship to any of the CHERI tag
bits on the corresponding physical page. In particular, CD does not reflect
the presence of capabilities on the page, much as D does not reflect anything
about the particular values of data on a page. Software-enforced temporal
safety mechanisms, for example, are anticipated to regularly clear CD (and
even, occasionally, CW) on PTEs referencing pages that nevertheless contain