Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can allocations have gaps? Can they be partially deallocated? Can they grow "in-place"? #430

Open
RalfJung opened this issue Jul 11, 2023 · 15 comments

Comments

@RalfJung
Copy link
Member

RalfJung commented Jul 11, 2023

Rust itself can only create contiguous allocations that can only be deallocated all at once. But does the compiler get to assume that all allocations, including those generated by 3rd party allocators, behave like that?

This does have impact on optimizations:

  • for a ptr: *const u8, if we see ptr.read() and ptr.wrapping_offset(x).read(), can we conclude that all memory at offset 0..x is dereferenceable?
  • if we have a x: &Cell<T>, then call unknown_fn() (which might deallocate the memory x points to), and then at least one byte of x is read again -- can we conclude that all of x (0..size_of::<T>()) is still dereferenceable?

One major example of an API that might violate this is mmap. There is a fairly clean way to model mmap: we say there is one (exposed) provenance (one allocation) for all mmap'ed memory, and with mmap/munmap ranges of memory become dereferenceable with that provenance. But clearly the "allocation" that represents all mmaped memory can have gaps, and it can be partially deallocated! If we say that the compiler can assume that allocations never have gaps, then the following is UB:

  • assume that no memory is mapped at X .. (X+4*PAGE_SIZE)
  • map a page at address X
  • map another page at address X+2*PAGE_SIZE
  • call the following function with X as *const u8:
fn break(ptr: *const u8) {
  ptr.read();
  ptr.wrapping_add(2*PAGE_SIZE).read();
}

This code must be UB because under the "no gap" assumption, the compiler is allowed to insert another read at ptr.add(PAGE_SIZE), but that address is not mapped so we'd get a pagefault.

So this leads to some questions:

  • Is the sequence of actions described above permitted? IMO the answer must be yes; making this UB is just too much of a footgun. Where does the UB even come from? wrapping_add is safe, so somehow the 2nd read is UB? But that memory was mmap'ed!
  • If that sequence of actions is permitted -- I think that implies we do not have a "no gap" assumption. At least we certainly don't get the optimizations we'd usually get from that assumption, as the example above shows.
  • What's LLVM's stance on this? Does LLVM do anything that assumes "no gaps"? Should we propose to add something to the LangRef that explicitly says gaps can happen?

Note that "mmap works in addresses, there is no provenance" does not help: in break, ptr has some provenance, and no matter where it comes from -- if we say "no gaps", no one provenance can be valid for both of these reads with an unmapped page in between them.

For the "no partial deallocation" assumption, the situation is similar -- we can also write mmap-based code that violates it:

  • map a page at address X, and another one at X+PAGE_SIZE
  • call the following function with X as *const u8:
fn break2(ptr: *const u8) {
  let _val = ptr.cast::<[u8; 2*PAGE_SIZE]>().read();
  deallocate_second_page(); // unmap the page at X+PAGE_SIZE
  let _val = ptr.read();
}

Is the compiler allowed to insert a speculative ptr.cast::<[u8; 2*PAGE_SIZE]>().read() at the end? If yes, this would segfault, so somehow the original code must have UB. I don't think it can reasonably have UB, and therefore we have to accept that partial deallocation is a thing.

For allocations growing in-place, if the compiler sees the malloc it might exploit that the geometry of the allocation cannot change, so even after calling unknown code an access outside the of the allocation per its initial size could be optimized away as UB -- but if the unknown code was actually growing the allocation to make the access in-bounds, that would be a wrong optimization.

@CAD97
Copy link

CAD97 commented Jul 11, 2023

If [the OP] sequence of actions is permitted -- I think that implies we do not have a "no gap" assumption. At least we certainly don't get the optimizations we'd usually get from that assumption, as the example above shows.

There is a really poor alternative way of permitting this while maintaining "no gap" allocations is to say that an attempt to access an out-of-bounds pointer (sometimes?) attempts a from_exposed_addr at the new address. However, this still has the same result of preventing any facts that would be derived from a "no gap" requirement.

It's at least somewhat interesting to observe that the TB retagging rules allow "gapped" tag-provenance-validity even if allocations cannot be gapped. Not particularly relevant, but perhaps still an interesting analog.

we have to accept that partial deallocation is a thing.

For pointers yes, it seems true, but I would like to express that both gapped allocation and partial deallocation imo shouldn't apply to references to !Freeze, despite allowing deallocating the object while references are still "protected" (because such references don't get protected). This should mean only raw pointers lose out on optimizations and it's possible (though annoying) to get "gapless" semantics by creating references.

I don't know how to encode that into the AM, though, since TB currently just treats reference-to-!Freeze like a raw pointer, IIUC.

@RalfJung
Copy link
Member Author

There is a really poor alternative way of permitting this while maintaining "no gap" allocations is to say that an attempt to access an out-of-bounds pointer (sometimes?) attempts a from_exposed_addr at the new address. However, this still has the same result of preventing any facts that would be derived from a "no gap" requirement.

That's actually kind of how Miri implements from_exposed_addr semantics.

But as you say in terms of optimizations this at least implies the effects of having gaps and permitting partial deallocation.

It's at least somewhat interesting to observe that the TB retagging rules allow "gapped" tag-provenance-validity even if allocations cannot be gapped. Not particularly relevant, but perhaps still an interesting analog.

This is true for both SB and TB.

For pointers yes, it seems true, but I would like to express that both gapped allocation and partial deallocation imo shouldn't apply to references to !Freeze, despite allowing deallocating the object while references are still "protected" (because such references don't get protected). This should mean only raw pointers lose out on optimizations and it's possible (though annoying) to get "gapless" semantics by creating references.

I don't understand what you mean by this. References have to be dereferenceable (on fn entry at least), so the part covered by T cannot have any gaps. But if we say partial deallocation is a thing then I'd rather not carve out some very magic special case where an &UnsafeCell<T> can be wholly but not partially deallocated... I don't even know a clean way to formalize such a special case.

For &(i32, Cell<i32>) if we have precise UnsafeCell tracking we would get that the first 4 bytes cannot be deallocated while the function runs, but the remaining 4 can.

I don't know how to encode that into the AM, though, since TB currently just treats reference-to-!Freeze like a raw pointer, IIUC.

It could easily retag them with a new "aliased" state, so this is not the problem. (Unless you mean disallowing partial deallocation of &UnsafeCell<T>, that I don't know how to do and I also don't think we want to do.)

@CAD97
Copy link

CAD97 commented Jul 16, 2023

(Unless you mean disallowing partial deallocation of &UnsafeCell<T>, that I don't know how to do and I also don't think we want to do.)

That is indeed what I was referring to. It'd essentially need to be an "aliased" protector which makes partial deallocation UB.

But having thought about it a bit more, I can't think of any transforms which it would actually matter for (well, at least if we assume "retagging" asserts it's still allocated; is that the case? even though LLVM can't use "dereferencable_on_entry"), so the extra validity semantics aren't beneficial for the cost.

@RalfJung
Copy link
Member Author

if we assume "retagging" asserts it's still allocated; is that the case?

That is definitely the intent.

@RalfJung
Copy link
Member Author

@RalfJung
Copy link
Member Author

That pattern of merging multiple mmap into one logical allocation would probably also be a problem on CHERI.

@CAD97
Copy link

CAD97 commented Jul 26, 2023

#433 made me think about this again.

While I was originally thinking just about &Cell<T> (I'm partial to using Freeze instead of precise cell tracking), there is a direct benefit of not allowing partial deallocation of &(T, Cell<T>) — enabling LLVM dereferencable again, since total deallocation isn't permitted.

I don't want to conflate the separate issues too much, but I am admittedly quite confused at trying to rationalize why my gut instinct is that noalias should ask "any byte is UnsafeCell" but dereferencable should ask "any byte is neither UnsafeCell nor padding;" I guess my gut sees some sort of polarity difference between "permitted to mutate" and "permitted to deallocate"?

@RalfJung
Copy link
Member Author

RalfJung commented Aug 28, 2023

One pattern that I think we definitely have to support (on targets where it works, i.e., not CHERI) is mapping some pages with mmap, and then mapping more pages at the end of that range with another mmap call, and then treating the entire thing like a single large allocation. That seems like a completely legit way to have a growing in-place buffer. So I would say that allocations can definitely be growing after they were initially created.

Given that, it would seem strangely asymmetric to me if they couldn't also shrink.

It would also be asymmetric to only allow growing/shrinking at the end, so this should probably work in both directions.

The one case that does seem meaningfully different is actually creating a gap in an allocation. I am less sure that people would be expecting that to work, and it is easier to come up with realistic optimizations that would break when there are gaps.

That is indeed what I was referring to. It'd essentially need to be an "aliased" protector which makes partial deallocation UB.

If partial deallocation is a thing, I am strongly opposed to treating full deallocation any different from "partial deallocation where there just happens to be nothing left". I cannot see a justification for introducing such a surprising special case in the semantics.

@RalfJung RalfJung changed the title Can allocations have gaps? Can they be partially deallocated? Can allocations have gaps? Can they be partially deallocated? Can they grow "in-place"? Aug 28, 2023
@digama0
Copy link

digama0 commented Aug 28, 2023

One pattern that I think we definitely have to support (on targets where it works, i.e., not CHERI) is mapping some pages with mmap, and then mapping more pages at the end of that range with another mmap call, and then treating the entire thing like a single large allocation. That seems like a completely legit way to have a growing in-place buffer. So I would say that allocations can definitely be growing after they were initially created.

This actually sounds quite a bit harder than shrinking allocations, because I'm pretty sure there are optimizations that assume this can't happen. Certainly C++ doesn't allow this, and I suspect there are some LLVM passes that bake in the assumption that the bounds of an allocation can't change over its lifetime.

One version that has come up before in regular rust is that a merge(&[T], &[T]) -> &[T] which concatenates adjacent slices has been discouraged as being UB, although maybe this is fine under TB? Subobject provenance is what causes problems for this one.

@chorman0773
Copy link
Contributor

realloc does exist, though it invalidates previous pointers.

@thomcc
Copy link
Member

thomcc commented Aug 28, 2023

It's likely we'll want a try_resize that adjusts an allocation in-place without invalidating or copying. This would solve a number of issues with the allocator API, since in many cases it will be a no-op (consider using try_resize to adjust alignment from 4 to 2 -- in almost all allocators it will be a no-op, and this would allow reusing an allocation for small layout mismatches). It also makes excess unnecessary, at least in allocators that can implement it (which admittedly is not all of them).

So I would prefer if such an API is semantically possible in the language.

@RalfJung
Copy link
Member Author

RalfJung commented Aug 29, 2023

@digama0

This actually sounds quite a bit harder than shrinking allocations, because I'm pretty sure there are optimizations that assume this can't happen. Certainly C++ doesn't allow this, and I suspect there are some LLVM passes that bake in the assumption that the bounds of an allocation can't change over its lifetime.

If they cannot change, then they cannot shrink either. So why would growing be harder than shrinking?

Do you have an example of such as pass?

One version that has come up before in regular rust is that a merge(&[T], &[T]) -> &[T] which concatenates adjacent slices has been discouraged as being UB, although maybe this is fine under TB? Subobject provenance is what causes problems for this one.

This is UB even without SB/TB because you end up doing cross-allocation arithmetic when indexing into the new slice. I don't see how allocation bounds changing or staying the same is even related to this point.

@chorman0773

realloc does exist, though it invalidates previous pointers.

Realloc is completely different, it logically

  • makes a full copy of the allocation content
  • removes the old allocation
  • creates a new allocation with the old contents, possibly at the same address as the old one, but definitely with a new allocation ID

So no bounds of existing allocations change with realloc.

@thomcc

So I would prefer if such an API is semantically possible in the language.

Yeah that's like a nicely abstracted version of adding or removing pages at the end of a mapping while treating it all as a single allocation. I'd also like to support such an API.

Changing only the alignment is very unlikely to cause any problems for anything I think. It's changing the size where things start to become a little dicey.

@RalfJung
Copy link
Member Author

So I would prefer if such an API is semantically possible in the language.

There is however the one concern that such an API will likely not be implementable on CHERI.
Though I guess as long as it is a "try"-style API that requires users to handle the case where growing/shrinking in-place was not possible, the CHERI implementation can just always fail, and so at least code will still remain portable to CHERI, just less efficient. Which seems reasonable.

@thomcc
Copy link
Member

thomcc commented Aug 29, 2023

On CHERI it would still be possible to implement in cases where the underlying operation is a noop -- For example, in many allocators this would be true for realigning where the start and end alignment are both less than alignof(maxalign_t), or resizing while the start and end size are still within the same size class (or equivalent).

On CHERI it wouldn't be possible to implement this using mmap-style operations, but yeah, it would be legal for an allocator to implement try_resize as always failing, so users should handle the failure regardless.

Note that I'm not sure implementing it with mmap operations would be that useful anyway, although you're correct that it is possible.

@RalfJung
Copy link
Member Author

RalfJung commented Sep 7, 2023

According to this, LLVM has a fixed assumption that allocations never change their size. So if Rust wants to allow something like that, a bunch of work on LLVM would have to be done first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants