This provides allocators suitable for a number of use cases.
All of these allocators implement the traits std::alloc::GlobalAlloc
and std::alloc::Allocator
, as we as a common base trait, Allocator
.
The most useful is a global allocator which allows switching between thread, coroutine and global (and thuse lockable) memory allocators, using the macro global_thread_and_coroutine_switchable_allocator()
.
Allocators provided include:-
BumpAllocator
, a never-freeing bump allocator with slight optimization for reallocating the last allocation.BitSetAllocator
, an allocator that uses a bit set of free blocks; uses 64-bit chunks to optimize searches.MultipleBinarySearchTreeAllocator
, an efficient allocator which minimizes fragmentation by using multiple red-black trees of free blocks which are aggresively defragmented.ContextAllocator
, a choice of eitherBumpAllocator
,BitSetAllocator
orMultipleBinarySearchTreeAllocator
.MemoryMapAllocator
, a NUMA-aware mmap allocator with support for NUMA policies.GlobalThreadAndCoroutineSwitchableAllocator
, suitable for replacing the global allocator and provides switchable allocators for global, thread local and context (coroutine) local needs; must b created using the macroglobal_thread_and_coroutine_switchable_allocator
.
Allocators use a MemorySource
to obtain and release memory.
Memory sources provided include:-
MemoryMapSource
, useful for thread-local allocators as it can obtain memory from NUMA-local memory.ArenaMemorySource
, an arena of fixed blocks which is itself backed by a memory source; this is useful as a source for theBumpAllocator
andBitSetAllocator
when used for contexts.
Additionally a number of adaptors are provided:-
AllocatorAdaptor
, an adaptor ofAllocator
toGlobalAlloc
andAlloc
; use it by callingAllocator.adapt()
GlobalAllocToAllocatorAdaptor
, an adaptor ofGlobalAlloc
toAllocator
, useful for assigning a global allocator toGlobalThreadAndCoroutineSwitchableAllocator
.AllocToAllocatorAdaptor
, an adaptor ofAlloc
toAllocator
.
When using GlobalThreadAndCoroutineSwitchableAllocator
, it is possible to save and restore the allocator state for the currently running context (coroutine).
It is also possible to create a lockless, fast thread-local allocator which make use of NUMA memory, unlike a conventional malloc.
- Investigate wrapping Rampant Pixel's Memory Allocator.
- Investigate using DPDK's allocator.
- Investigate a B-tree backed allocator.
- Investigate a design that uses multiple doubly-linked 'free' lists of blocks; blocks can be variable in size but the free list is sorted
- Iteration over a particular free-list range may encountered blocks too small, or blocks so large they can be split up.
- This design is similar to that used by DPDK.
- To make the allocator multi-threaded, DPDK takes a spin lock on a particular 'heap', which is a set of free lists.
- Investigate a fall-back over-size allocator for a thread-local allocator, which could use the
NumaMemoryMapSource
underneath. - Investigate supporting over-size allocations in
MultipleBinarySearchTreeAllocator
by scanning the largest binary search tree for contiguous blocks. - Investigate a persistent-memory backed allocator.
- Properly support excess allocations and Alloc's grow_in_place functions, but only if these are used by downstream collections.
- Investigate the use of the
BMI1
intrinsics_blsi_u64
(extract lowest set bit),_blsmsk_u64
and_blsr_u64
.
The license for this project is MIT.