-
Notifications
You must be signed in to change notification settings - Fork 7
/
Copy pathchap-intro.tex
1142 lines (975 loc) · 58.9 KB
/
chap-intro.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
\chapter{Introduction}
\label{chap:introduction}
CHERI (Capability Hardware Enhanced RISC Instructions) extends
Instruction-Set Architectures (ISAs) with new capability-based primitives that
improve software robustness to security vulnerabilities.
The CHERI model is motivated by the \textit{principle of least privilege},
which argues that greater security can be obtained by minimizing the
privileges accessible to running software.
A second guiding principle is the \textit{principle of intentional use}, which
argues that, where many privileges are available to a piece of software, the
privilege to use should be explicitly named rather than implicitly selected.
While CHERI does not prevent the expression of vulnerable software designs, it
provides strong \textit{vulnerability mitigation}: attackers have a more
limited vocabulary for attacks, and should a vulnerability be successfully
exploited, they gain fewer rights, and have reduced access to further attack
surfaces.
CHERI allows software privilege to be minimized at two granularities:
\begin{description}
\item[Fine-grained code protection]
CHERI enables \textit{fine-grain protection} and \textit{intentional
use} by introducing in-address-space \textit{memory capabilities}, which
replace integer virtual-address representations of code and data pointers.
The aim here is to minimize the rights available to be exercised on an
instruction-by-instruction basis, limiting the scope of damage from inevitable
software bugs.
CHERI capabilities protect the integrity and valid provenance of pointers
themselves, as well as allowing fine-grained protection of the in-memory data
and code that pointers refer to.
These protection
%
policies
% properties
can, to a large extent, be based on information
already present in program descriptions -- e.g., from C- and C++-language types, memory
allocators, and run-time linking.
This application of least privilege and intentional use provides strong,
non-probabilistic
protection against a broad range of memory- and pointer-based vulnerabilities
and exploit techniques -- buffer overflows, format-string attacks, pointer
injection, data-pointer-corruption attacks, control-flow attacks, and so on.
Many of these goals can be achieved through code recompilation on CHERI.
\item[Secure encapsulation]
At a coarser granularity, CHERI also supports
\textit{secure encapsulation} and \textit{intentional use}
through the robust and efficient implementation of highly scalable
in-address-space \textit{software compartmentalization} -- for example,
implementing \textit{object capabilities}.
The aim here is to minimize the set of rights available to
larger isolated software components, building on efficient architectural
support for strong software encapsulation.
These protections are grounded in explicit descriptions of isolation and
communication provided by software authors, such as through explicit software
sandboxing.
There is also the potential to direct compartmentalization through
language- or linker-level structures and annotations, such as class or module
definitions.
This application of least privilege and intentional use mitigates
application-level vulnerabilities, such as logical errors,
downloaded malicious code, or software Trojans inserted in the software supply
chain.
Effective software compartmentalization depends on explicit software
structure, and can require significant code change.
Where compartmentalization already exists in software, CHERI can be used to
significantly improve compartmentalization performance and granularity.
Where that structure is not yet present, CHERI can improve the adoption path
for compartmantalization due to supporting in-address-space
compartmentalization models.
\end{description}
CHERI is designed to support incremental adoption within current
security-critical, C- and C++-language \textit{Trusted Computing Bases (TCBs)}:
operating-system (OS) kernels, key system libraries and services, language
runtimes supporting higher-level type-safe languages, and applications such as
web browsers and office suites.
While CHERI builds on many historic ideas about capability systems (see
Chapter~\ref{chap:historical}), one of the key contributions of this work is
CHERI's \textit{hybrid capability-system architecture}.
In this context, \textit{hybrid} refers to combining aspects from conventional
architectures, system software, and language/compiler choices with
capability-oriented design.
Key forms of hybridization in the CHERI design include:
\begin{description}
\item[A RISC capability system] A capability-system model is blended with a
conventional RISC user-mode architecture without disrupting the majority of
key RISC design choices.
\item[An MMU-enabled capability system] A capability-system model is cleanly
and usefully composed with conventional ring-based privilege and
virtual memory implemented by processor MMUs (Memory Management Units).
\item[A C/C++-language capability system] CHERI can be targeted by a
C/C++-language compiler with strong compatibility, performance, and
protection properties.
\item[Hybrid system software] CHERI supports a range of OS models including
conventional MMU-based virtual-memory designs, hybridized designs that
host capability-based software within multiple virtual address spaces, and
pure single-address-space capability systems.
\item[Incremental adoptability] Within pieces of software, capability-aware
design can be disregarded, partially adopted, or fully adopted with useful
and predictable semantics.
This allows incremental adoption within large software bases, from OS
kernels to application programs.
\end{description}
We hope that these hybrid aspects of the design will support gradual
deployment of CHERI features in existing software, rather than obliging a
clean-slate software design, thereby offering a more gentle hardware-software
adoption path.
In the remainder of this chapter, we describe our high-level design goals
for CHERI, the notion that CHERI is an architecture-neutral protection model
with architecture-specific mappings (such as CHERI-RISC-V and Arm's Morello)%
\psnote{(the CHERI-RISC-V mapping detailed here, Arm's experimental Morello architecture~\cite{arm-morello}, and the sketch mapping for x86 described in Chapter~\ref{chap:cheri-x86-64})}, an
introduction to the CHERI-RISC-V concrete instantiation, a brief version
history, an outline of the remainder of this report, and our publications to
date on CHERI.
A more detailed discussion of our research methodology, including motivations,
threat model, and evolving approach from ISA-centered prototyping to a broader
architecture-neutral protection model may be found in
Chapter~\ref{chap:research}.
Historical context and related work for CHERI may be found in
Chapter~\ref{chap:historical}.
The \hyperref[glossary]{Glossary} at the end of the report contains
stand-alone definitions of many key ideas and terms, and may be useful
reference material when reading the report.
\section{CHERI Design Goals}
CHERI has three central design goals aimed at dramatically improving the
security of contemporary C-language TCBs, through processor support for
fine-grained memory protection and scalable software compartmentalization,
whose (at times) conflicting requirements have required careful negotiation in
our design:
\begin{description}
\item[Fine-grained memory protection]
improves software resilience to escalation paths
that allow low-level software bugs involving individual data
structures and data-structure manipulations to be coerced into more powerful
software vulnerabilities;
e.g., through remote code injection via buffer overflows, control-flow and
data-pointer corruption, and other memory-based techniques.
Unlike MMU-based memory protection, CHERI memory protection is intended to
be driven by the compiler in protecting programmer-described data structures
and references, rather than via coarse page-granularity protections.
When C and C++ pointers are implemented using capabilities, capability
protections constrain the the ranges of memory (via bounds) and operations
that can be performed (via permissions).
They also protect the integrity, provenance, and monotonicity of pointers in
order to prevent inadvertent or inappropriate manipulation
that might otherwise lead to privilege escalation.
Capabilities can be used to implement a variety of language-level pointer
types including heap, stack, global, thread-local, and function pointers.
These protect against application-level mismanipulation and misuse of
source-visible pointers, including out-of-range accesses to explicit or
implied memory allocations, and corruption of pointers within those
allocations.
Capabilities can also be used to implement sub-language pointers created and
maintained by the compiler, runtime, and operating system, such as the stack
pointer, control-flow pointers such as return addresses, and the pointers to
internal linkage structures such as Procedure Linkage Tables (PLTs) and the
Global Offset Table (GOT).
Collectively, strong pointer integrity, pointer provenance validity, bounds,
permissions, monotonicity, and encapsulation prevent corrupted pointers from
allowing a range of vulnerable behaviors (e.g., buffer overflows) and also
directly impede common exploit techniques (e.g., pointer injection).
Fine-grained protection also provides the foundation for expressing
compartmentalization within application instances.
We draw on, and extend, ideas from recent work in C-language {\em software
bounds checking} by combining {\em fat pointers} with capabilities, allowing
capabilities to be substituted for C pointers with only limited changes to
program semantics.
CHERI permits efficient implementation of dialects of C and C++ in
which various invalid accesses, deemed to be undefined behavior in
those languages, and potentially giving arbitrary behavior in
their implementations, are instead guaranteed to throw an exception.
While CHERI has not been specifically designed with other languages in mind,
there appear to be many potential applications of CHERI within the language
runtimes for higher-level and managed languages.
\item[Software compartmentalization]
involves the decomposition of software (at present, primarily
application software)
into isolated components to
mitigate the effects of security vulnerabilities by applying
sound principles of security, such as abstraction, encapsulation,
type safety, and especially least privilege and the minimization of
what must be trustworthy (and therefore sensibly trusted!).
Previously, it seems that the
adoption of compartmentalization has been limited by a conflation of hardware
primitives for virtual addressing and separation, leading to inherent performance and
programmability problems when implementing fine-grained separation.
With CHERI, we seek to decouple the virtualization from separation to avoid
scalability problems imposed by MMUs based on translation look-aside buffers
(TLBs), which impose a very high performance penalty as the number of
protection domains increases, as well as complicating the writing of compartmentalized software.
As with an MMU, CHERI enables a variety of \textit{software operational
models} for compartmentalization.
Taking advantage of CHERI's strong in-address-space protection, we have
explored both intra-process sandboxing models, providing robust encapsulation
for dynamic libraries, and also acceleration of the traditional process model
(\textit{co-processes}) by colocating multiple separated processes within the
same virtual address space.
\item[Formal modeling and verification]
We draw on {\it formal methodologies} wherever feasible, to improve our confidence in the design and implementation of CHERI.
This use is necessarily subject to real-world constraints of timeline, budget, design process,
and prototyping, but it has helped increase our confidence that CHERI meets our
functional and security requirements.
Formal methods can also help to avoid many of
the characteristic design flaws that are common in both hardware and software.
This desire requires us not only to perform research into CPU and software design, but also to
develop new formal methodologies, and adaptations and extensions of existing ones.
To this end, we have produced formal models of our instruction-set extensions
for multiple architectures, and used both pragmatic SMT-based validation and
formal proof to validate that the models satisfy essential security
properties such as provenance validity and monotonicity.
\item[A viable transition path] must be applicable to current software and
hardware designs.
CHERI hardware must be able to run most current software without significant
modification, and allow incremental deployment of security
improvements starting with the most critical software components: the TCB
foundations on which the remainder of the system rests, and software with the
greatest exposure to risk.
CHERI's features must significantly improve security, to create demand for
upstream processor manufacturers from their downstream mobile and embedded
device vendors.
These CHERI features must at the same time conform to vendor expectations
for performance, energy use, and compatibility in order to compete with less
secure alternatives.
CHERI is therefore necessarily disruptive, but its architectural,
microarchitectural, and software interventions are bounded in scope, and
constrained in their implications.
Introducing architectural capabilities and tagged memory has significant
implications for instruction-set design and maintainability, as well as
microarchitectural implications for the processor design and memory subsystem.
On the other hand, the general computational, operating-system, and software
compilation models are largely retained, and most existing architectural,
microarchitectural, and software implementation choices are preserved.
For example, although tagged capabilities are used instead of unadorned
integers in the instruction-level descriptions of loads and stores, once
checks are performed capabilities are largely reduced to integer addresses
for processing by the MMU and memory subsystem.
Compilers must use capability-constrained loads and stores for capability-aware
code, but the structure (and often format) of those instructions remain the
same, and although code is generated to use capabilities rather than integers
for pointers, the vast majority of source code remains the same.
Running native capability-unaware binary code is specifically designed for and
supported, allowing conventional OS compatibility ABI techniques (such as used
in the 32-bit to 64-bit transition) to be used.
\end{description}
We are concerned with satisfying the need for
trustworthy systems and networks, where {\it
trustworthiness} is a multidimensional measure of how well a system or
other entity satisfies its various requirements -- such as those for
security, system integrity, and reliability, as well as human safety,
and total-system survivability,
robustness, and resilience, notably in the presence of a wide range of
adversities such as hardware failures, software flaws, malware,
accidental and intentional misuse, and so on. Our approach to
trustworthiness encompasses hardware and software architecture,
dynamic and static evaluation, formal and non-formal analyses, good
software-engineering practices, and much more.
\section{Architecture Neutrality and Architectural Instantiations}
\label{sec:archneut}
CHERI is an \textit{architecture-neutral protection model} in that, like
virtual memory, it can be deployed within multiple ISAs.
Our initial mapping into the 64-bit MIPS ISA allowed us to develop the
CHERI approach; using it, we explored the implications downwards into
the microarchitecture, and upwards into the software stack.
Having developed a mature hardware-software protection model, we used this as
the baseline to derive an architecture-neutral CHERI protection model.
This architecture-neutral model is discussed in detail in
Chapter~\ref{chap:architecture}.
We have since added CHERI protection to the RISC-V ISA
(Chapter~\ref{chap:cheri-riscv}), developed a
lightweight
architectural sketch for the x86-64 ISA
(Chapter~\ref{chap:cheri-x86-64}),
and collaborated with Arm as they have developed the experimental Morello
architecture, an application of CHERI to the ARMv8-A
architecture~\cite{arm-morello}.
Over the course of this evolution, we have attempted to maximize the degree to
which specification is architecture neutral, and minimize the degree to which
it is architecture specific.
Even within a single ISA, there are multiple potential instantiations of the
CHERI protection model, which offer different design tradeoffs -- for example,
decisions about whether to have separate integer and capability register files
or to merge them into a single register file.
The successful mapping into multiple ISAs has led us to believe that the CHERI
protection model is a portable protection model, that supports portable
software stacks in much the same way that portable virtual-memory-based
operating systems can be implemented across a variety of architectural MMUs.
Unlike MMUs, whose software interactions are primarily with the operating
system, CHERI interacts directly with compiler-generated code, key system
libraries, compartmentalization libraries, and applications; across all of
these, we have found that an architecture-neutral approach can be highly
effective, offering portability to the vast majority of CHERI-aware C/C++
code.
In this report, we first consider the architecture-neutral model, and then
applications of our approach in specific ISAs.
\subsection{The Architecture-Neutral CHERI Protection Model}
\label{sec:cheri-protection-model}
The aim of the CHERI protection model, as embodied in both the software stack
(see Chapter~\ref{chap:model}) and architecture (see
Chapter~\ref{chap:architecture}), is to support two vulnerability mitigation
objectives: first,
fine-grained pointer and memory protection within address spaces, and second,
primitives to support both scalable and programmer-friendly
compartmentalization within address spaces.
The CHERI model is designed to support low-level TCBs, typically implemented
in C or a C-like language, in workstations, servers, mobile devices, and
embedded devices.
In contrast to MMU-based protection, this is done by protecting
\textit{references to code and data} (pointers), rather than the
\textit{location of code and data} (virtual addresses).
This is accomplished via an \textit{in-address-space capability-system model}:
the architecture provides a new primitive, the \textit{capability}, that
software components (such as the OS, compiler, run-time linker,
compartmentalization runtime, heap allocator, etc.) can use to implement
strongly protected pointers within virtual address spaces.
%Simultaneously, it will provide reasonable assurance of correctness and a
%realistic technology transition path from existing hardware and software
%platforms.
In the security literature, capabilities are tokens of authority that are
unforgeable and delegatable.
\textit{CHERI capabilities} are integer virtual addresses that have been
extended with metadata to protect their integrity, limit how they are
manipulated, and control their use.
This metadata includes a \textit{tag} implementing strong integrity
protection (differentiating valid and invalid capabilities), \textit{bounds}
limiting the range of addresses that may be
dereferenced, \textit{permissions} controlling the specific operations that
may be performed, and also
\textit{sealing}, used to support higher-level software encapsulation.
Protection properties for capabilities include the ISA ensuring that
capabilities are always derived via valid manipulations of other capabilities
(\textit{provenance}), that corrupted in-memory capabilities cannot be
dereferenced (\textit{integrity}), and that rights associated with
capabilities are non-increasing (\textit{monotonicity}).
CHERI capabilities may be held in registers or in memories, and are loaded,
stored, and dereferenced using CHERI-aware instructions that expect
capability operands rather than integer virtual addresses.
On hardware reset, initial capabilities are made available to software via
special and general-purpose capability registers.
All other capabilities will be derived from these initial valid capabilities
through valid capability transformations.
In order to continue to support non-CHERI-aware code, dereference of integer
virtual addresses via legacy instruction is transparently checked via a
\textit{default data capability} (\DDC{}) for loads and stores, or a
\textit{program-counter capability} (\PCC{}) for instruction fetch.
A variety of programming-language and code-generation models can be used with
a CHERI-extended ISA.
As integer virtual addresses continue to be supported, C or C++ compilers
might choose to always implement pointers via integers, selectively implement
certain pointers as capabilities based on annotations or type information
(i.e., a \text{hybrid C} interpretation), or
alternatively
always implement pointers as
capabilities except where explicitly annotated (i.e., a
\textit{pure-capability} interpretation).
Programming languages may also employ capabilities internal to their
implementation: for example, to protect return addresses, vtable pointers, and
other virtual addresses for which capability protection can provide enhanced
vulnerability mitigation.
When capabilities are being used to implement pointers (e.g., to code or data)
or internal addresses (e.g., for return addresses), they must be
constructed with suitably restricted rights, to accomplish effective protection.
This is a run-time operation performed using explicit instructions (e.g., to
set bounds, mask permissions, or seal capabilities) by the operating system,
run-time linker, language runtime and libraries, and application code itself:
\begin{description}
\item[The operating-system kernel] may narrow bounds and permissions on
pointers provided as part of the start-up environment when executing a
program binary (e.g., to arguments or environmental variables), or when
returning pointers from system calls (e.g., to new memory mappings).
\item[The run-time linker] may narrow bounds and permissions when setting up
code pointers or pointers to global variables.
\item[The system library] may narrow bounds and permissions when returning a
pointer to newly allocated heap memory.
\item[The compartmentalization runtime] may narrow bounds and permissions,
as well as seal capabilities, enforcing compartment isolation (e.g., to act
as sandboxes).
\item[The compiler] may insert instructions to narrow bounds and permissions
when generating code to take a pointer to a stack allocation, or when taking
a pointer to a field of a larger structure allocated as a global, on the
stack, or on the heap.
\item[The language runtime] may narrow bounds and permissions when returning
pointers to newly allocated objects, or when setting up internal linkage, as
well as seal capabilities to non-dereferenceable types.
\item[The application] may request changes to permissions, bounds, and other
properties on pointers, in order to further subset memory allocations and
control their use.
\end{description}
The CHERI model can also be used to implement other higher-level protection
properties.
For example, tags on capabilities in memory can be used to support accurate
C/C++-language temporal safety via revocation or garbage collection, and
sealed capabilities can be used to enforce language-level encapsulation and
type-checking features.
The CHERI protection model and its implications for software security are
described in detail in Chapter~\ref{chap:model}.
\subsection{An Architecture-Specific Mapping into RISC-V}
\label{sec:cheri-isa-design}
\psnote{much of the following, eg tagged memory and the notion of capability instructions, seems arch-neutral to me. I'd move all that into the previous section, and then present the four architectures more symmetrically}
\jhbnote{The one risk of that change is that it becomes close to
duplicating the arch-neutral chaper entirely. I think it can still
make sense to keep this section as a representative case.}
The CHERI-RISC-V ISA (see Chapter~\ref{chap:cheri-riscv}) is an instantiation of
the CHERI protection model to the 32-bit and 64-bit variants of the open-source RISC-V architecture.
CHERI adds the following features to support granular memory protection
and compartmentalization within address spaces:
\begin{description}
\item[Capability registers] describe the rights ({\em protection domain}) of
the executing thread to
access memory, and to invoke object references
to transition between protection domains.
We extend the existing general-purpose integer registers as well as
various CSRs to hold capabilities.
Capability registers contain a tag, object type, permission mask, base, length, and offset (allowing the description of not just a bounded region, but also a pointer into that region, improving C-language compatibility).
Capability registers are suitable for describing both data and code, and can hence protect both data integrity/confidentiality and control flow.
\item[Capability instructions] allow executing code to create, constrain (e.g., by reducing bounds or permissions), manage, and inspect capability register values. Both unsealed (memory) and sealed (object) capabilities can be loaded and stored via memory capability registers (i.e., dereferencing).
Object capabilities can be invoked, via special instructions, allowing a transition between protection domains, but are {\em immutable} and {\em non-dereferenceable}, providing encapsulation of the code or data that they refer to.
Capability instructions implement {\em guarded manipulation}: invalid capability manipulations (e.g., to increase rights or length) produce a capability with a cleared tag that can no longer be dereferenced, and invalid capability dereferences (e.g., to access outside of a bounds-checked region) result in an exception that can be handled by the supervisor or language runtime.
A key aspect of the instruction-set design is \textit{intentional use of
capabilities}: explicit capability registers, rather than ambient
authority, are used to indicate exactly which rights should be exercised, to
limit the damage that can be caused by exploiting bugs.
Tradeoffs exist around intentional use, and in some cases compatibility or
opcode utilization may dictate implicit capability selection; for example,
legacy RISC-V load and store instructions implicitly dereference a Default
Data Capability as they are unable to explicitly name a capability register.
Most capability instructions are part of the user-mode ISA, rather than the privileged ISA, and will be generated by the compiler to describe application data structures and protection properties.
\item[Tagged memory] associates a 1-bit tag with each capability-aligned and capability-sized word in physical memory, which allows capabilities to be safely loaded and stored in memory without loss of integrity.
Writes to capability values in memory that do not originate from a valid capability in the capability register file will clear the tag bit associated with that memory, preventing accidental (or malicious) dereferencing
of invalid capabilities.
This functionality expands a thread's effective protection domain to include the transitive closure of capability values that can be loaded via capabilities via those present in its register file. For example, a capability register representing a C pointer to a data structure can be used to load further capabilities from that structure, referring to further data structures, which could not be accessed without suitable capabilities.
Non-bypassable tagging of unforgeable capabilities enables not only reliable
and secure enforcement of capability properties, but also reliable and secure
identification of capabilities in memory for the purposes of implementing
other higher-level protection properties such as temporal safety.
\end{description}
\psnote{The first sentence is a bit arch -- does anyone in the last couple of decades really think x86 instructions are intended for use directly by the programmer? Probably better to de-RISC the discussion.}
In keeping with the RISC philosophy, CHERI instructions are intended for use
primarily
by the operating system and compiler rather than directly by the programmer, and consist of relatively simple instructions that avoid (for example) combining memory access and register value manipulation in a single instruction.
In our current software prototypes, there are direct mappings from programmer-visible C-language pointers to capabilities in much the same way that conventional code generation translates pointers into general-purpose integer register values; this allows CHERI to continuously enforce bounds checking, pointer integrity, and so on.
There is likewise a strong synergy between the capability-system
model, which espouses a separation of policy and mechanism\psnote{does
it really? that's not very clear to me, for CHERI, esp. for the
fine-grain-protection case (it seems more true for the sealed
capability stuff)}, and RISC: CHERI's features make possible
the implementation of a wide variety of OS, compiler, and application-originated policies on a common protection substrate that optimizes fast paths through hardware support.
CHERI-RISC-V applies CHERI protections in different ways compared to
our initial CHERI-MIPS architecture. Some of these differences arise
from differences in the base ISA; we anticipate that
adaptations of CHERI to ISAs will adopt
conventions such as instruction-encoding
in keeping with their specific flavor and design.
Other design decisions reflect maturity of the CHERI model and lessons
learned from CHERI-MIPS.
In our initial work on CHERI, we utilized an uncompressed capability format
in which each capability was 256 bits in size.
This gave us significant flexibility to experiment with capability contents
and semantics for bounds checking and capability behaviors.
As our protection and software models matured, we turned our attention to the
performance implications of large capability sizes, including potentially
substantial data-cache overhead for pointer-intensive applications.
We adapted an approach explored in fat-pointer research -- bounds compression
exploiting redundancy between a pointer value and its bounds -- to implement
128-bit compressed capabilities.
As of CHERI ISAv8, capabilities are assumed to use compressed bounds whether
based on a 32-bit or 64-bit base address size.
CHERI-MIPS used a separate \textit{capability register file} to hold
capability registers rather than extending general-purpose registers.
CHERI-RISC-V instead extends general-purpose registers to hold
capabilities.
Wherever possible, CHERI systems make use of existing hardware designs: processor pipelines and register files, cache memory,
system buses, commodity DRAM, and commodity peripheral devices such as
NICs and display cards.
We are currently focusing on enforcement of CHERI security properties on applications
running on a general-purpose processor; in future work, we hope to consider the effects of
implementing CHERI in peripheral processors, such as those found in Network Interface Cards (NICs) or Graphical Processing Units (GPUs).
\subsection{CHERI-x86-64 and Arm Morello}
The abstract CHERI memory protection and security models described above have
been applied to two other architectures:
\begin{description}
\item[CHERI-x86-64] is a sketch application of CHERI to the Intel x86-64
architecture (Chapter~\ref{chap:cheri-x86-64}).
It is not fully elaborated, and has not been implemented, but serves to
demonstrate how CHERI could be applied to a CISC architecture.
\item[Arm Morello] is an experimental application of CHERI to the ARMv8-A
architecture~\cite{arm-morello}. Developed by Arm, Morello is a complete
integration into
a rich commercially used load-store architecture that includes features
such as vector instructions and virtualization support.
We have ported our complete CHERI software stack to Morello.
\end{description}
There is a high degree of source-level compatibility between software across
all CHERI architectures.
Compilers and low-level operating-system components necessarily have modest
amounts of architecture-specific code generation, assembly code, and status
or control registers.
However, high-level CHERI-aware systems C/C++ code is entirely portable across
the multiple architectures.
This is comparable to the portability of architecture-neutral virtual-memory
abstractions and APIs despite architecture-specific interfaces to MMUs and
TLB management.
\section{Deterministic Protection}
CHERI has been designed to provide strong, non-probabilistic protection rather
than depending on short random numbers or truncated cryptographic hashes that
can be leaked and reinjected, or that could be brute forced.
Essential to this approach is using out-of-band memory tags that prevent
confusion between data and capabilities.
Software stacks can use these features to construct higher-level protection
properties, such as preventing the transmission of pointers via Inter-Process
Communication (IPC) or network communications.
They are also an essential foundation to strong compartmentalization, which
assumes a local adversary.
\section{Formal Modeling and Provable Protection}
The design process for CHERI has used formal semantic models as an
important tool in
various ways.
%
Our
goal here has been to understand how we can support the CHERI design
and engineering process with judicious use of mathematically rigorous
methods, both in lightweight ways (providing engineering and assurance
benefits without the costs of full formal verification), and using
machine-checked proof to establish high confidence that the
architecture design provides specific security properties.
\jhbnote{Do we want to trim mention of L3 and/or CHERI-MIPS here?}
The basis for all this has been use of formal specifications of the
ISA instruction behavior as a fundamental design tool, initially for
CHERI-MIPS in L3~\cite{Fox2015}, and now for CHERI-MIPS and
CHERI-RISC-V in Sail~\cite{sail-popl2019}. L3 and Sail are
domain-specific languages specifically designed for expressing
instruction behavior, encoding data, etc.
Simply moving from the informal pseudocode commonly used to describe
instruction behavior to parsed and type-checked
artifacts already helps maintain clear specifications.
The CHERI-RISC-V instruction descriptions in Chapter~\ref{chap:isaref-riscv}
are automatically included from the Sail model, keeping documentation
and model in sync.
\psnote{Robert: do we want to include pointers to the model repos somewhere?}
Both L3 and Sail support automatic generation of executable models
(variously in SML, OCaml, or C) from these specifications. These
executable models have been invaluable, both as golden models for
testing our hardware prototypes, and as emulators for testing CHERI
software above. The fact that they are automatically generated from
the specifications again helps keep things in sync, enabling
regression testing on any change to the specification, and makes for
easy experimentation with design alternatives.
The generated emulators run fast enough to boot FreeBSD in a few minutes (booting
cheribsd
%128-bit cheri-mips model and it
%80e6 instructions
currently takes around 250s, roughly 320kips).
We have also used the models to automatically generate ISA test cases,
both via simple random instruction generation, and using
theorem-prover and SMT approaches~\cite{DBLP:journals/scp/CampbellS16}.
Finally, the models support formal verification, with mechanised
proof, of key architectural security properties.
L3 and Sail support automatic generation of versions of the models in
the definition languages of (variously) the
HOL4, Isabelle, and Coq theorem provers, which we have used as a basis
for proofs.
Key architectural verification goals including proving not just low-level
properties, such as the monotonicity of each individual instruction
and properties of the CHERI Concentrate compression scheme, but also
higher-level goals such as compartment monotonicity, in which arbitrary code
sequences isolated within a compartment are unable to construct additional
rights beyond those reachable either directly via the register file or
indirectly via loadable capabilities.
We have proven a number of such properties about the CHERI-MIPS ISA~\cite{cheri-formal-SP2020,UCAM-CL-TR-940}.
The CHERI design process has also benefitted from an interplay with
our work on rigorous semantics for C~\cite{Cerberus-PLDI16,cerberus-popl2019}.
\section{CHERI ISA Version History}
\psnote{the following describes the history in \emph{four} different ways. Surely at least one of those can go?}
\begin{table}[th!]
\begin{center}
\caption{CHERI ISA revisions and major development phases}
\input{cheri-version-table}
\end{center}
\label{table:intro-cheri-isa-versions}
\end{table}
This is the tenth version of the CHERI ISA specification document.
A high-level summary of CHERI ISA versions and their corresponding
contributions can be found in Table~\ref{table:intro-cheri-isa-versions}.
A much more detailed version summary and complete change log can be found in
Appendix~\ref{app:versions}.
A more narrative exploration of the research and development cycle leading to
our current specification can be found in Chapter~\ref{chap:research}.
\subsection{Changes in CHERI ISA 10.0}
\input{app-versions-10-0}
\section{Experimental Features}
\label{sec:intro:experimental}
\rwnote{Ensure this is updated for ISAv10}
Appendix~\ref{app:experimental} describes a number of experimental features
that extend CHERI with new functionality.
These include several architectural features:
\begin{itemize}
\item Efficient tag rederivation for use with swapping, memory compression,
memory encryption, and virtual-machine migration
\item A number of architectural features to accelerate temporal memory safety
and capability revocation: fast capability subset testing, non-temporal
tag loading, and non-temporal capability loading.
\item A recursive mutable load permission that limits the store rights
via future capability loads
\item More efficient capability permission representations
\item Memory versioning for use with capabilities
\item Linear capabilities
\item Indirect capabilities
\item Indirect sealed entry capabilities
\item Capability coloring for capability flow control
\item Sealing with large object type fields in memory
\item A system for mixing 64-bit and 128-bit capabilities
\item Capabilities referencing physical addresses
\item Use of capabilities across a system for peripherals and accelerators
\end{itemize}
We believe that these represent interesting, and in some cases promising,
portions of the design space beyond the baseline CHERI.
However, they appear in an appendix because: (1) we do not yet recommend their
use; (2) they have not been thoroughly evaluated across architecture,
hardware, and software with respect to utility, security, compatibility,
microarchitectural realism, nor performance; and/or (3) their preservation of
essential CHERI security properties has not been formally proven.
They are therefore included to provide insight into potential future
directions or interesting potential alternative points in the overall design
space.
\section{Document Structure}
This document is an introduction to, and a reference manual for, the CHERI
protection model and instruction-set architecture.
\medskip
\noindent
Chapter~\ref{chap:introduction} introduces the CHERI protection model, our
architecture-neutral approach, and the specific CHERI-RISC-V ISA.
\medskip
\noindent
Chapter~\ref{chap:model} describes the high-level model for the CHERI approach
in terms of architectural features, software protection objectives, and
software mechanism.
\medskip
\noindent
Chapter~\ref{chap:architecture} provides a detailed description of
architecture-neutral aspects of the CHERI protection model, including
capability and tagged-memory models, categories of new instructions, etc.
\medskip
\noindent
Chapter~\ref{chap:cheri-riscv} describes an architecture-specific mapping
of the CHERI protection model into the 32-bit and 64-bit RISC-V architecture.
This includes specification of the CHERI-RISC-V architecture extension,
register file, Memory Management Unit (MMU), privilege models, and other
ISA-specific semantics.
\medskip
\noindent
Chapter~\ref{chap:cheri-x86-64} provides an ``architectural sketch'' of how
the CHERI protection model might be mapped into the x86-64 ISA, a decidedly
non-RISC instruction set.
\medskip
\noindent
Chapter~\ref{chap:isaref-riscv} provides a detailed description of each
CHERI-RISC-V instruction.
\medskip
\noindent
Chapter~\ref{chap:isaref-x86-64} provides a detailed description of each
CHERI-x86-64 instruction.
\medskip
\noindent
Chapter~\ref{chap:rationale} discusses the design rationale for many aspects
of the CHERI ISA, as well as our thoughts on future refinements based on
lessons learned to date.
\psnote{The following doesn't belong with this chapter description, so
I'd just delete it. And that draft
seems never to have made it to an actual UCAM technical report, so
we shouldn't cite it as such in any case. We mention the early PVS
work in Chapter~\ref{chap:research}.
``A broader document on our use of formal methods
relating to the CHERI total-system architecture is in draft
form~\cite{CHERI-FM15}.''}
\medskip
\noindent
Chapter~\ref{chap:assurance} outlines a detailed (but not formally
proved) argument for
why a reference monitor above CHERI provides certain security
properties, and touches on some issues in the specification that
formal proof has to deal with.
\psnote{I've rewritten this, because that chapter is no longer ``how we have used formal
methodology to ensure correctness of aspects of the CHERI ISA.''}
\medskip
\noindent
Chapter~\ref{chap:microarchitecture} provides a high-level summary of how
CHERI can be integrated into RISC pipelines and the memory subsystem.
\medskip
\noindent
Chapter~\ref{chap:research} describes the motivations and hardware-software
co-design research approach taken in developing CHERI, including major phases of
the research.
\psnote{TODO: do we want to include the hardware-software co-design to hardware-software-formal
co-design narrative, in brief, in the intro chapter? And/or in
detail in Chapter~\ref{chap:research}?}
\medskip
\noindent
Chapter~\ref{chap:historical} describes the historical context for this work,
including past systems that have influenced our approach.
\medskip
\noindent
Chapter~\ref{chap:conclusion} discusses our short- and long-term plans for the
CHERI protection model and CHERI ISAs, considering both our specific plans
and open research questions that must be answered as we proceed.
\medskip
\noindent
Appendix~\ref{app:versions} provides a more detailed version history of the
CHERI protection model and CHERI ISAs.
\medskip
\noindent
Appendix~\ref{app:isaquick-riscv} is a quick reference for CHERI-RISC-V
instructions and encodings.
\medskip
\noindent
Appendix~\ref{app:experimental} specifies a number of CHERI instructions
that we still consider experimental, and hence are not included in the main
specification.
\medskip
\noindent
Appended~\ref{app:cheri-128} describes our prior (now deprecated) CHERI-128
compression scheme, which has been superseded by CHERI Concentrate.
\medskip
\noindent
The report also includes a \hyperref[glossary]{Glossary} defining many key
CHERI-related terms.
\medskip
Future versions of this document will continue to expand our
consideration of the CHERI model and CHERI instruction-set architectures,
their impact on software, and evaluation strategies and results.
Additional information on our prototype CHERI hardware and software
implementations, as well as formal methods work, can be found in accompanying
reports.
\section{Publications}
\label{sec:publications}
As our approach has evolved, and project developed, we have published a number
of papers and reports describing aspects of the work.
Our conference papers contain greater detail on the rationale for various
aspects of our hardware-software approach, along with evaluations of
micro-architectural impact, software performance, compatibility, and security:
\psnote{It'd be useful to add links here to the pdfs of these}
\begin{itemize}
\item In the International Symposium on Computer Architecture (ISCA 2014), we
published \citetitleit{woodruff:cheriisca2014}~\cite{woodruff:cheriisca2014}.
This paper describes our architectural and micro-architectural approaches
with respect to capability registers and tagged memory, hybridization with a
conventional Memory Management Unit (MMU), and our high-level software
compatibility strategy with respect to operating systems.
\item In the International Conference on Architectural Support for Programming
Languages and Operating Systems (ASPLOS 2015), we published
\citetitleit{ChisnallCPDP11}~\cite{ChisnallCPDP11}, which extends our architectural approach to
better support convergence of pointers and capabilities, as well as to
further explore the C-language compatibility and performance impacts of
CHERI in larger software corpora.
\item In the IEEE Symposium on Security and Privacy (IEEE S\&P, or
``Oakland'', 2015), we
published \citetitleit{watson15:cheri}~\cite{watson15:cheri}, which
describes a hardware-software architecture for mapping compartmentalized
software into the CHERI capability model, as well as extends our explanation
of hybrid operating-system support for CHERI.
\item In the ACM Conference on Computer and Communications Security (CCS
2015), we published \citetitleit{gudka15:soaap}~\cite{gudka15:soaap},
which describes our higher-level design
approach to software compartmentalization as a a form of vulnerability
mitigation, including static and dynamic analysis techniques to validate the
performance and effectiveness of compartmentalization.
\item In the ACM SIGPLAN Conference on Programming Language Design and
Implementation (PLDI 2016), we published
\citetitleit{Cerberus-PLDI16}~\cite{Cerberus-PLDI16}, which develops a
formal semantics for the C programming language.
As part of that investigation, we explore the effect of CHERI on C
semantics, which led us to refine a number of aspects of CHERI code
generation, as well as refine the CHERI ISA. In the other
direction, understanding the changes needed to port existing
software to CHERI has informed our views on what C semantics should be.
\item In the September-October 2017 issue of IEEE Micro, we published
\citetitleit{watson2016:microjournal}~\cite{watson2016:microjournal},
expanding on architectural
and microarchitectural aspects of the CHERI object-capability
compartmentalization model described in our Oakland 2015 paper.
\item In the International Conference on Architectural Support for Programming
Languages and Operating Systems (ASPLOS 2017), we published
\citetitleit{chisnall2017:cherijni}~\cite{chisnall2017:cherijni}.
This paper describes how to use CHERI memory safety and compartmentalization
to isolate Java Native Interface (JNI) code from the Java Virtual Machine,
imposing the Java memory and security model on native code.
\item In the MIT Press book, \textit{New Solutions for Cybersecurity}, we
published two chapters on CHERI. \citetitleit{watson2017:cheri-deployability}
discusses our research and development approach, and how
CHERI hybridizes conventional architecture, microarchitecture, operating
systems, programming languages, and general-purpose software designs with
a capability-system model~\cite{watson2017:cheri-deployability}.
\textit{Fundamental Trustworthiness Principles in CHERI} discusses how CHERI
fulfills a number of critical trustworthiness
principles~\cite{neumann2017:cheri-principles}.
\item In the International Conference on Computer Design (ICCD 2017), we
published \citetitleit{joannou2017:tagged-memory}~\cite{joannou2017:tagged-memory}.
This paper describes how awareness of the architectural semantics of tagged
pointers can be used to improve performance and reduce DRAM access overheads
for tagging implemented over DRAM without innate tag storage.
\item In the International Conference on Computer Design (ICCD 2018), we
published \citetitleit{xia:cherirtos}~\cite{xia:cherirtos}.
This paper describes an embedded variant on CHERI using 64-bit capabilities
for 32-bit addresses, and how embedded real-time operating systems might
utilize CHERI features.
\item In the ACM SIGPLAN Symposium on Principles of Programming Languages
(POPL 2019), we published \citetitleit{sail-popl2019}, which describes
a formal modeling approach and formal models
for several instruction sets including CHERI-MIPS~\cite{sail-popl2019}.
\item In the ACM SIGPLAN Symposium on Principles of Programming Languages
(POPL 2019), we published \citetitleit{cerberus-popl2019}, describing a
formal model for C pointer provenance and its
practical evaluation, including via pure-capability C code on the CHERI
architecture~\cite{cerberus-popl2019}.
\item In the International Conference on Architectural Support for
Programming Languages and Operating Systems (ASPLOS 2019), we published
\citetitleit{davis2019:cheriabi}~\cite{davis2019:cheriabi}.
This paper describes how to adapt a full MMU-based OS design to support
ubiquitous use of capabilities to implement C and C++ pointers in userspace.
\item In IEEE Transactions on Computers, we published \citetitleit{Woodruff2019}~\cite{Woodruff2019}.
This paper describes our compressed 128-bit and 64-bit capability formats,
evaluating the effects of precision loss in bounds, and the potential
performance impact of the approach.
\item In the IEEE/ACM International Symposium on Microarchitecture (IEEE
MICRO 2019), we published \citetitleit{Xia_CHERIvokeCharacterisingPointer_2019}~\cite{Xia_CHERIvokeCharacterisingPointer_2019}.
This paper performs a simulation study of a potential approach to temporal
memory safety using CHERI.
\item In the IEEE Symposium on Security and Privacy (Oakland 2020), we
published \citetitleit{cheri-formal-SP2020}~\cite{cheri-formal-SP2020}.
This paper describes formal modeling and proof of isolation properties for
the CHERI-MIPS ISA.
\item In the IEEE Symposium on Security and Privacy (Oakland 2020), we
published \citetitleit{cornucopia}~\cite{cornucopia}.
This paper describes a full hardware-software implementation of temporal
memory safety for CHERI, including architectural accelerations.
\item In the proceedings of Hardware and Architectural Support for Security
and Privacy (HASP 2020), we published \citetitleit{DBLP:conf/micro/MarkettosBBNMW20}~\cite{DBLP:conf/micro/MarkettosBBNMW20}.
This paper proposes new solutions that can efficiently address the problem
of malicious memory access from pluggable computer peripherals and
microcontrollers embedded within a system-on-chip.
\item In Workshop on Computer Architecture Research with RISC-V (CARRV 2021),
we published \citetitleit{fuchs2021framework}~\cite{fuchs2021framework}.
This paper presents a flexible and extensible bare-metal test suite containing replications of all major transient-execution attacks in RISC-V.
\item In the proceedings of the 31st European Symposium on Programming
(ESOP 2022) we published \citetitleit{DBLP:conf/esop/BauereissCSAESB22}~\cite{DBLP:conf/esop/BauereissCSAESB22}.
In this paper we define the fundamental security property that Morello aims
to provide, reachable capability monotonicity, and prove that the
architecture definition satisfies it. We also published an extended
version as a technical report~\cite{UCAM-CL-TR-959}.
\end{itemize}
We have additionally released several technical reports, including this
document, describing our approach and prototypes.
Each has had multiple versions reflecting evolution of our approach:
\begin{itemize}
\item This report, the \citetitleit{UCAM-CL-TR-987}~\cite{UCAM-CL-TR-850, UCAM-CL-TR-864,
UCAM-CL-TR-876, UCAM-CL-TR-891, UCAM-CL-TR-907, UCAM-CL-TR-927,
UCAM-CL-TR-951, UCAM-CL-TR-987},
describes the CHERI ISA, both as a high-level, software-facing
model and the specific mapping into multiple instruction sets.
Successive versions have introduced improved C-language support, support for
scalable compartmentalization, and compressed capabilities.
\item The \citetitleit{UCAM-CL-TR-877}~\cite{UCAM-CL-TR-877} describes in greater detail our
mapping of software into instruction-set primitives in both the compiler and
operating system; earlier versions of the document were released as the
\textit{Capability Hardware Enhanced RISC Instructions: CHERI User's