Implement FileDescriptor.memoryMap, FileDescriptor.memoryUnmap, and FileDescriptor.memorySync #68

Open
wants to merge 3 commits into
base: main
177 changes: 177 additions & 0 deletions Sources/System/FileOperations.swift
@@ -393,4 +393,181 @@ extension FileDescriptor {
}.map { _ in (.init(rawValue: fds.0), .init(rawValue: fds.1)) }
}
#endif

#if !os(Windows)
// MARK: Memory Mapping

/// Describes the desired memory protection of the
/// mapping (and must not conflict with the open mode of the file).
/// Flags can be the bitwise OR of one or more of each case.
Member

This will need some editing -- this is documenting a top-level struct, so it isn't obvious what "the mapping" or "the file" is. Additionally, bitwise OR is not how these options are combined in Swift.

@frozen
public struct MemoryProtection: RawRepresentable, Hashable, Codable {
Member

This needs to be an OptionSet!
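
A minimal sketch of what the `OptionSet` shape could look like, assuming the `execute` spelling suggested elsewhere in this review and using the C `PROT_*` constants directly rather than the PR's `_PROT_*` wrappers. It is illustrative only, not the PR's final API:

```swift
#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#endif

// Illustrative only: an OptionSet-based MemoryProtection.
struct MemoryProtection: OptionSet, Hashable {
  let rawValue: CInt
  init(rawValue: CInt) { self.rawValue = rawValue }

  /// Pages may not be accessed. The corresponding C constant is `PROT_NONE`.
  static var none: MemoryProtection { MemoryProtection(rawValue: PROT_NONE) }
  /// Pages may be read. The corresponding C constant is `PROT_READ`.
  static var read: MemoryProtection { MemoryProtection(rawValue: PROT_READ) }
  /// Pages may be written. The corresponding C constant is `PROT_WRITE`.
  static var write: MemoryProtection { MemoryProtection(rawValue: PROT_WRITE) }
  /// Pages may be executed. The corresponding C constant is `PROT_EXEC`.
  static var execute: MemoryProtection { MemoryProtection(rawValue: PROT_EXEC) }
}

// Options are combined with set syntax instead of an array or a bitwise OR.
let readWrite: MemoryProtection = [.read, .write]
print(readWrite.contains(.write), readWrite.contains(.execute))
```

With this shape, a `protection:` parameter can take a single `MemoryProtection` value, and the raw value handed to `mmap` is simply `readWrite.rawValue`.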

/// The raw C protection number.
@_alwaysEmitIntoClient
public let rawValue: CInt

/// Creates a strongly typed memory protection value from a raw C protection number.
@_alwaysEmitIntoClient
public init(rawValue: CInt) { self.rawValue = rawValue }

@_alwaysEmitIntoClient
private init(_ raw: CInt) { self.init(rawValue: raw) }

/// Pages may not be accessed.
Member

The docs should specify the original C names of these constants (and types, where appropriate), to help orientate people who are already familiar with them, and to aid searching.

Suggested change
- /// Pages may not be accessed.
+ /// Pages may not be accessed.
+ ///
+ /// The corresponding C constant is `PROT_NONE`.

(This applies to all of these.)

@_alwaysEmitIntoClient
public static var none: MemoryProtection { MemoryProtection(rawValue: _PROT_NONE) }
/// Pages may be read.
@_alwaysEmitIntoClient
public static var read: MemoryProtection { MemoryProtection(rawValue: _PROT_READ) }
/// Pages may be written.
@_alwaysEmitIntoClient
public static var write: MemoryProtection { MemoryProtection(rawValue: _PROT_WRITE) }
/// Pages may be executed.
@_alwaysEmitIntoClient
public static var executed: MemoryProtection { MemoryProtection(rawValue: _PROT_EXEC) }
Member

Suggested change
- public static var executed: MemoryProtection { MemoryProtection(rawValue: _PROT_EXEC) }
+ public static var execute: MemoryProtection { MemoryProtection(rawValue: _PROT_EXEC) }

}

/// Determines whether updates to the mapping are
/// visible to other processes mapping the same region, and whether
/// updates are carried through to the underlying file. This
/// behavior is determined by exactly one flag.
public struct MemoryMapKind: RawRepresentable, Hashable, Codable {
Member

POSIX calls these "flags"; we should follow the same terminology.

Suggested change
- public struct MemoryMapKind: RawRepresentable, Hashable, Codable {
+ public struct MemoryMapFlags: RawRepresentable, Hashable, Codable {

Like MemoryProtection, this needs to also be an OptionSet.

/// The raw C flag number.
@_alwaysEmitIntoClient
public let rawValue: CInt

/// Creates a strongly typed error number from a raw C error number.
Member

This needs a revision.

@_alwaysEmitIntoClient
public init(rawValue: CInt) { self.rawValue = rawValue }

@_alwaysEmitIntoClient
private init(_ raw: CInt) { self.init(rawValue: raw) }

/// Share this mapping. Updates to the mapping are visible to
/// other processes mapping the same region, and (in the case
/// of file-backed mappings) are carried through to the
/// underlying file.
@_alwaysEmitIntoClient
public static var shared: MemoryMapKind { MemoryMapKind(rawValue: _MAP_SHARED) }
/// Create a private copy-on-write mapping. Updates to the
/// mapping are not visible to other processes mapping the
/// same file, and are not carried through to the underlying
/// file. It is unspecified whether changes made to the file
/// after the `memoryMap` call are visible in the mapped region.
@_alwaysEmitIntoClient
public static var `private`: MemoryMapKind { MemoryMapKind(rawValue: _MAP_PRIVATE) }

// TODO: There are several other MemoryMapKinds.
Member

I think we should default to exposing every flag that is present in the underlying system headers. (This will be different on every supported platform.)

E.g., on Darwin, we have:

#define MAP_FIXED        0x0010 /* [MF|SHM] interpret addr exactly */
#define MAP_RENAME       0x0020 /* Sun: rename private pages to file */
#define MAP_NORESERVE    0x0040 /* Sun: don't reserve needed swap area */
#define MAP_RESERVED0080 0x0080 /* previously unimplemented MAP_INHERIT */
#define MAP_NOEXTEND     0x0100 /* for MAP_FILE, don't change file size */
#define MAP_HASSEMAPHORE 0x0200 /* region may contain semaphores */
#define MAP_NOCACHE      0x0400 /* don't cache pages for this mapping */
#define MAP_JIT          0x0800 /* Allocate a region that will be used for JIT purposes */
#define MAP_FILE        0x0000  /* map from file (default) */
#define MAP_ANON        0x1000  /* allocated from memory, swap space */
#define MAP_ANONYMOUS   MAP_ANON
#define MAP_RESILIENT_CODESIGN  0x2000 /* no code-signing failures */
#define MAP_RESILIENT_MEDIA     0x4000 /* no backing-store failures */
#define MAP_32BIT       0x8000          /* Return virtual addresses <4G only */
#define MAP_TRANSLATED_ALLOW_EXECUTE 0x20000 /* allow execute in translated processes */
#define MAP_UNIX03       0x40000 /* UNIX03 compliance */
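
As a rough illustration of "expose what the platform headers provide", platform-specific flags could sit behind conditional compilation inside a single option set. The sketch below assumes the `MemoryMapFlags` name suggested in this review; the Swift-side member names (`noCache`, `jit`, and so on) are hypothetical, and only a small subset of the Darwin list above is shown:

```swift
#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#endif

// Illustrative only: flags shared by POSIX-like systems plus a Darwin-only subset.
struct MemoryMapFlags: OptionSet {
  let rawValue: CInt
  init(rawValue: CInt) { self.rawValue = rawValue }

  /// The corresponding C constant is `MAP_SHARED`.
  static var shared: MemoryMapFlags { .init(rawValue: MAP_SHARED) }
  /// The corresponding C constant is `MAP_PRIVATE`.
  static var `private`: MemoryMapFlags { .init(rawValue: MAP_PRIVATE) }
  /// The corresponding C constant is `MAP_FIXED`.
  static var fixed: MemoryMapFlags { .init(rawValue: MAP_FIXED) }
  /// The corresponding C constant is `MAP_ANON`.
  static var anonymous: MemoryMapFlags { .init(rawValue: MAP_ANON) }

  #if canImport(Darwin)
  /// Darwin only. The corresponding C constant is `MAP_NOCACHE`.
  static var noCache: MemoryMapFlags { .init(rawValue: MAP_NOCACHE) }
  /// Darwin only. The corresponding C constant is `MAP_JIT`.
  static var jit: MemoryMapFlags { .init(rawValue: MAP_JIT) }
  #endif
}

// Exactly one of `.shared` or `.private` is expected alongside any other flags.
let flags: MemoryMapFlags = [.private, .anonymous]
print(flags.rawValue == MAP_PRIVATE | MAP_ANON)
```

Callers that need a flag the wrapper does not name can still reach it through `MemoryMapFlags(rawValue:)`.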

}

/// Determines whether a memory sync should be
/// synchronous or asynchronous.
@frozen
public struct MemorySyncKind: RawRepresentable, Hashable, Codable {
Member

Like MemoryMapKind, I think this should be modeling the original (open-ended) flags. It ought to be called MemorySyncFlags, and it needs to be an OptionSet, not an enum-like struct.

/// The raw C flag number.
@_alwaysEmitIntoClient
public let rawValue: CInt

/// Creates a strongly typed error number from a raw C error number.
Member

Revise

@_alwaysEmitIntoClient
public init(rawValue: CInt) { self.rawValue = rawValue }

@_alwaysEmitIntoClient
private init(_ raw: CInt) { self.init(rawValue: raw) }

/// Requests an update and waits for it to complete.
@_alwaysEmitIntoClient
public static var synchronous: MemorySyncKind { MemorySyncKind(rawValue: _MS_SYNC) }
/// Specifies that an update be scheduled, but the call
/// returns immediately.
@_alwaysEmitIntoClient
public static var asynchronous: MemorySyncKind { MemorySyncKind(rawValue: _MS_ASYNC) }
Member

I think we also want MS_INVALIDATE here.

On Darwin, we probably also want to expose MS_KILLPAGES and MS_DEACTIVATE.

}

/// Create a new mapping in the virtual address space of the
/// calling process.
/// After the `memoryMap` call has returned, the file descriptor can
/// be closed immediately without invalidating the mapping.
/// - Parameters:
/// - length: Specifies the length of the mapping (which must be greater than 0).
/// - pageOffset: The page offset to map. Page size is determined by `sysconf(_SC_PAGE_SIZE)`
/// - kind: Determines the kind of mapping returned. Currently limited to `MAP_SHARED` and `MAP_PRIVATE`.
/// - protection: Describes the desired memory protection of the mapping (and must not conflict with the open mode of the file).
/// - Returns: The new memory mapping.
@_alwaysEmitIntoClient
public func memoryMap(
length: Int, pageOffset: Int, kind: MemoryMapKind, protection: [MemoryProtection]
) throws -> UnsafeMutableRawPointer {
Comment on lines +498 to +500
Member

@lorentey Nov 14, 2021

This needs significant changes.

  • I think mapMemory would be a better name for this function.
  • MemoryProtection needs to be an OptionSet. The protection argument must not be an array.
  • The term pageOffset (and its documentation above) does not match what mmap's offset argument means -- it's an offset into the file object referenced by the file descriptor.
  • The type of the offset must be Int64, to match FileDescriptor.seek.
  • If we want to reorder arguments this way, then the offset ought to precede the length, not vice versa.
  • We must expose mmap's addr parameter (as an optional UnsafeMutableRawPointer, with a nil default).
  • The original mmap's fd parameter is optional. (MAP_ANONYMOUS mappings are very important, and they do not take a file descriptor.) This means that we must either move this method out of FileDescriptor, or we must introduce a second, top-level mapMemory that's dedicated to the anonymous case.
    Note that for anonymous maps, Darwin uses the fd argument as an extra information channel (see VM_FLAGS_PURGABLE, VM_MAKE_TAG()) -- this may be an argument in favor of spinning off a separate function for that use case.
  • mmap can technically be used to map memory at address zero, i.e., the null pointer, which isn't representable by an UnsafeMutableRawPointer. This is an extremely niche edge case, but I think it would make sense to change the return type to an optional to reflect this.

Here is one way to embrace these suggestions:

public struct MemoryProtection { ... }
public struct MemoryMapFlags { ... }

extension FileDescriptor {
  public func mapMemory(
    at address: UnsafeMutableRawPointer? = nil,
    from offset: Int64,
    length: Int,
    protection: MemoryProtection,
    flags: MemoryMapFlags
  ) throws -> UnsafeMutableRawPointer?
}

public struct MachVMFlags: RawRepresentable {
  public var rawValue: CInt
  public init?(rawValue: CInt)
  public static var purgeable: Self { /* VM_FLAGS_PURGABLE */ }
  public static func tag(_ value: UInt8) -> Self
}

public func mapAnonymousMemory(
  at address: UnsafeMutableRawPointer? = nil,
  length: Int,
  protection: MemoryProtection,
  flags: MemoryMapFlags,
  machVMFlags: MachVMFlags? = nil
) throws -> UnsafeMutableRawPointer?

(If we go in this direction, then both public variants must forward to the same @usableFromInline implementation.)

Given that either MAP_SHARED or MAP_PRIVATE must always be present in the flags, it may make sense to model these as something other than mere flags -- perhaps going as far as splitting mapMemory into two variants, mapSharedMemory and mapPrivateMemory. Then again, this would mean that we'd also need to split the anonymous variant, and this treads into slightly dangerous territory -- we could end up painting ourselves into a combinatorial explosion in case a platform decides to introduce new flavors in addition to these two.
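
One way to read the forwarding note above: both public variants funnel into a single helper that calls `mmap` and turns the `MAP_FAILED` sentinel into a thrown error. The sketch below covers only the anonymous variant plus `unmapMemory`. To stay short it passes protection and flags as raw `CInt` values, leaves out the `MachVMFlags` channel, and uses a hypothetical `MapError` type in place of the library's `Errno`; the helper name `_mapMemory` is also an assumption:

```swift
#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#endif

// Hypothetical stand-in for swift-system's `Errno`.
struct MapError: Error { let code: CInt }

// Same bit pattern as C's MAP_FAILED, i.e. ((void *)-1).
let mapFailed = UnsafeMutableRawPointer(bitPattern: -1)

// Shared implementation; `fd` is -1 for anonymous mappings.
func _mapMemory(
  at address: UnsafeMutableRawPointer?, fd: CInt, offset: Int64,
  length: Int, protection: CInt, flags: CInt
) throws -> UnsafeMutableRawPointer? {
  let result = mmap(address, length, protection, flags, fd, off_t(offset))
  // mmap reports failure with MAP_FAILED rather than a null pointer,
  // so a nil result is treated as a valid (if exotic) mapping at address zero.
  if result == mapFailed { throw MapError(code: errno) }
  return result
}

func mapAnonymousMemory(
  at address: UnsafeMutableRawPointer? = nil,
  length: Int, protection: CInt, flags: CInt
) throws -> UnsafeMutableRawPointer? {
  try _mapMemory(at: address, fd: -1, offset: 0, length: length,
                 protection: protection, flags: flags | MAP_ANON)
}

func unmapMemory(at address: UnsafeMutableRawPointer?, length: Int) throws {
  if munmap(address, length) == -1 { throw MapError(code: errno) }
}

// Usage: map one anonymous private region, write a byte, read it back, unmap.
let length = 4096
var roundTripped: UInt8 = 0
if let base = try mapAnonymousMemory(
  length: length, protection: PROT_READ | PROT_WRITE, flags: MAP_PRIVATE
) {
  base.storeBytes(of: 42, as: UInt8.self)
  roundTripped = base.load(as: UInt8.self)
  try unmapMemory(at: base, length: length)
}
```

A file-backed `FileDescriptor.mapMemory` would call the same helper with `self.rawValue` as `fd` and the caller's `offset`, which keeps the error handling in one place.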

try _memoryMap(length: length,
pageOffset: pageOffset,
kind: kind,
protection: protection).get()
}

@usableFromInline
internal func _memoryMap(
length: Int, pageOffset: Int, kind: MemoryMapKind, protection: [MemoryProtection]
) throws -> Result<UnsafeMutableRawPointer, Errno> {
valueOrErrno(valueOnFail: _MAP_FAILED, retryOnInterrupt: false) {
system_mmap(self.rawValue, length, protection.reduce(into: Int32(), { partialResult, prot in
partialResult |= prot.rawValue
}), kind.rawValue, _COffT(pageOffset))
}
}

/// Deletes the mappings for the specified
/// address range, and causes further references to addresses within
/// the range to generate invalid memory references. The region is
/// also automatically unmapped when the process is terminated. On
/// the other hand, closing the file descriptor does not unmap the
/// region.
/// - Parameters:
/// - memoryMap: The memory map to unmap
/// - length: Amount in bytes to unmap.
@_alwaysEmitIntoClient
public func memoryUnmap(memoryMap: UnsafeMutableRawPointer, length: Int) throws {
Member

I don't think this belongs under FileDescriptor. If we don't introduce a namespace for these, then this needs to be a top-level function.

I think unmapMemory would be a better name for this. The pointer argument needs to be optional and I think it would make sense to label it at.

Suggested change
- public func memoryUnmap(memoryMap: UnsafeMutableRawPointer, length: Int) throws {
+ public func unmapMemory(
+ at address: UnsafeMutableRawPointer?,
+ length: Int
+ ) throws

_ = try _memoryUnmap(memoryMap: memoryMap, length: length).get()
}

@usableFromInline
internal func _memoryUnmap(memoryMap: UnsafeMutableRawPointer, length: Int) throws -> Result<CInt, Errno> {
valueOrErrno(retryOnInterrupt: false) {
system_munmap(memoryMap, length)
}
}

/// Flushes changes made to the in-core copy of a file that
/// was mapped into memory using `memoryMap` back to the filesystem.
/// Without use of this call, there is no guarantee that changes are
/// written back before `memoryUnmap` is called. To be more precise, the
/// part of the file that corresponds to the memory area at the start
/// of the map and having length of `length` is updated.
/// - Parameters:
/// - memoryMap: The memory map to sync.
/// - length: Length to update.
/// - kind: Should specify one of `MemorySyncKind.synchronous` or `MemorySyncKind.asynchronous`.
/// - invalidateOtherMappings: Asks to invalidate other mappings of the same file (so
/// that they can be updated with the fresh values just
/// written).
@_alwaysEmitIntoClient
public func memorySync(
memoryMap: UnsafeMutableRawPointer,
length: Int,
kind: MemorySyncKind,
invalidateOtherMappings: Bool = false
) throws {
Comment on lines +553 to +558
Member

I don't think this belongs under FileDescriptor. If we don't introduce a namespace for these, then this needs to be a top-level function.

I don't think we should split the MS_INVALIDATE flag out of the rest -- there are additional (system-specific) flags that we also need to model, and it makes sense to keep things simple/consistent.

Suggested change
- public func memorySync(
- memoryMap: UnsafeMutableRawPointer,
- length: Int,
- kind: MemorySyncKind,
- invalidateOtherMappings: Bool = false
- ) throws {
+ public func syncMemory(
+ at address: UnsafeMutableRawPointer,
+ length: Int,
+ flags: MemorySyncFlags
+ ) throws {

We can also have this function take an UnsafeMutableRawBufferPointer instead of the first two arguments. (I'm not sure if that would be more or less convenient, though. If we take this route, we probably need to adjust the return value of the mmap wrappers accordingly, as well as the parameters of munmap, etc.)
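
For comparison, here is a rough sketch of the buffer-based variant with `MS_INVALIDATE` folded into a single `MemorySyncFlags` option set. The `SyncError` type and the unlabeled first parameter are assumptions made for illustration, not something the PR or the review settles on:

```swift
#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#endif

// Illustrative only: sync flags modeled as an open-ended option set.
struct MemorySyncFlags: OptionSet {
  let rawValue: CInt
  init(rawValue: CInt) { self.rawValue = rawValue }

  /// The corresponding C constant is `MS_SYNC`.
  static var synchronous: MemorySyncFlags { .init(rawValue: MS_SYNC) }
  /// The corresponding C constant is `MS_ASYNC`.
  static var asynchronous: MemorySyncFlags { .init(rawValue: MS_ASYNC) }
  /// The corresponding C constant is `MS_INVALIDATE`.
  static var invalidate: MemorySyncFlags { .init(rawValue: MS_INVALIDATE) }
}

// Hypothetical stand-in for swift-system's `Errno`.
struct SyncError: Error { let code: CInt }

// Takes the mapped region as one buffer instead of a pointer/length pair.
func syncMemory(_ region: UnsafeMutableRawBufferPointer, flags: MemorySyncFlags) throws {
  if msync(region.baseAddress, region.count, flags.rawValue) == -1 {
    throw SyncError(code: errno)
  }
}

// The option set forms the union of the bits, so both flags reach msync.
let syncFlags: MemorySyncFlags = [.synchronous, .invalidate]
print(syncFlags.rawValue == MS_SYNC | MS_INVALIDATE)
```

A call site would then read `try syncMemory(region, flags: [.synchronous, .invalidate])`, with no separate Boolean for invalidation.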

_ = try _memorySync(memoryMap: memoryMap,
length: length,
flags: invalidateOtherMappings ? kind.rawValue | _MS_INVALIDATE : kind.rawValue)
.get()
}

@usableFromInline
internal func _memorySync(memoryMap: UnsafeMutableRawPointer, length: Int, flags: CInt) throws -> Result<CInt, Errno> {
valueOrErrno(retryOnInterrupt: false) {
system_msync(memoryMap, length, flags)
}
}
#endif
}

35 changes: 35 additions & 0 deletions Sources/System/Internals/Constants.swift
@@ -441,6 +441,26 @@ internal var _ELAST: CInt { ELAST }

// MARK: File Operations

#if !os(Windows)
@_alwaysEmitIntoClient
internal var _MAP_SHARED: CInt { MAP_SHARED }

@_alwaysEmitIntoClient
internal var _MAP_PRIVATE: CInt { MAP_PRIVATE }

@_alwaysEmitIntoClient
internal var _MAP_FAILED: UnsafeMutableRawPointer { MAP_FAILED }

@_alwaysEmitIntoClient
internal var _MS_ASYNC: CInt { MS_ASYNC }

@_alwaysEmitIntoClient
internal var _MS_SYNC: CInt { MS_SYNC }

@_alwaysEmitIntoClient
internal var _MS_INVALIDATE: CInt { MS_INVALIDATE }
#endif

@_alwaysEmitIntoClient
internal var _O_RDONLY: CInt { O_RDONLY }

@@ -512,6 +532,21 @@ internal var _O_SYMLINK: CInt { O_SYMLINK }
internal var _O_CLOEXEC: CInt { O_CLOEXEC }
#endif

// MARK: Mmap Protection
#if !os(Windows)
@_alwaysEmitIntoClient
internal var _PROT_EXEC: CInt { PROT_EXEC }

@_alwaysEmitIntoClient
internal var _PROT_READ: CInt { PROT_READ }

@_alwaysEmitIntoClient
internal var _PROT_WRITE: CInt { PROT_WRITE }

@_alwaysEmitIntoClient
internal var _PROT_NONE: CInt { PROT_NONE }
#endif

@_alwaysEmitIntoClient
internal var _SEEK_SET: CInt { SEEK_SET }

23 changes: 15 additions & 8 deletions Sources/System/Internals/Mocking.swift
@@ -144,11 +144,13 @@ private func originalSyscallName(_ function: String) -> String {
return String(function.dropFirst("system_".count).prefix { $0 != "(" })
}

- private func mockImpl(
+ private func mockImpl<T>(
+ valueOnFail: T,
+ valueOnSuccess: T,
name: String,
path: UnsafePointer<CInterop.PlatformChar>?,
_ args: [AnyHashable]
- ) -> CInt {
+ ) -> T {
precondition(mockingEnabled)
let origName = originalSyscallName(name)
guard let driver = currentMockingDriver else {
@@ -165,32 +167,37 @@ private func mockImpl(
case .none: break
case .always(let e):
system_errno = e
- return -1
+ return valueOnFail
case .counted(let e, let count):
assert(count >= 1)
system_errno = e
driver.forceErrno = count > 1 ? .counted(errno: e, count: count-1) : .none
- return -1
+ return valueOnFail
}

- return 0
+ return valueOnSuccess
}

internal func _mock(
name: String = #function, path: UnsafePointer<CInterop.PlatformChar>? = nil, _ args: AnyHashable...
) -> CInt {
- return mockImpl(name: name, path: path, args)
+ return mockImpl(valueOnFail: -1, valueOnSuccess: 0, name: name, path: path, args)
}
internal func _mockInt(
name: String = #function, path: UnsafePointer<CInterop.PlatformChar>? = nil, _ args: AnyHashable...
) -> Int {
- Int(mockImpl(name: name, path: path, args))
+ Int(mockImpl(valueOnFail: -1, valueOnSuccess: 0, name: name, path: path, args))
}

internal func _mockOffT(
name: String = #function, path: UnsafePointer<CInterop.PlatformChar>? = nil, _ args: AnyHashable...
) -> _COffT {
- _COffT(mockImpl(name: name, path: path, args))
+ _COffT(mockImpl(valueOnFail: -1, valueOnSuccess: 0, name: name, path: path, args))
}
+ internal func _mock<T>(
+ valueOnFail: T, valueOnSuccess: T, name: String = #function, path: UnsafePointer<CInterop.PlatformChar>? = nil, _ args: AnyHashable...
+ ) -> T {
+ mockImpl(valueOnFail: valueOnFail, valueOnSuccess: valueOnSuccess, name: name, path: path, args)
+ }
#endif // ENABLE_MOCKING

33 changes: 33 additions & 0 deletions Sources/System/Internals/Syscalls.swift
@@ -123,3 +123,36 @@ internal func system_pipe(_ fds: UnsafeMutablePointer<Int32>) -> CInt {
return pipe(fds)
}
#endif

#if !os(Windows)
internal func system_mmap(_ fd: Int32, _ length: Int, _ prot: Int32, _ flags: Int32, _ offset: off_t) -> UnsafeMutableRawPointer {
#if ENABLE_MOCKING
if mockingEnabled {
let ptr = UnsafeMutableRawPointer.allocate(byteCount: 0, alignment: 0)
defer { ptr.deallocate() }
return _mock(valueOnFail: MAP_FAILED,
valueOnSuccess: ptr,
fd, length, prot, flags, offset)
}
#endif
return mmap(nil, length, prot, flags, fd, off_t(offset * Int64(sysconf(_SC_PAGESIZE))))
}

internal func system_munmap(_ addr: UnsafeMutableRawPointer, _ length: Int) -> CInt {
#if ENABLE_MOCKING
if mockingEnabled {
return _mock(addr, length)
}
#endif
return munmap(addr, length)
}

internal func system_msync(_ addr: UnsafeMutableRawPointer, _ length: Int, _ flags: Int32) -> CInt {
#if ENABLE_MOCKING
if mockingEnabled {
return _mock(addr, length, flags)
}
#endif
return msync(addr, length, flags)
}
#endif
20 changes: 20 additions & 0 deletions Sources/System/Util.swift
@@ -14,6 +14,11 @@ private func valueOrErrno<I: FixedWidthInteger>(
) -> Result<I, Errno> {
i == -1 ? .failure(Errno.current) : .success(i)
}
private func valueOrErrno<T: Equatable>(
valueOnFail: T, _ i: T
) -> Result<T, Errno> {
i == valueOnFail ? .failure(Errno.current) : .success(i)
}

// @available(macOS 10.16, iOS 14.0, watchOS 7.0, tvOS 14.0, *)
private func nothingOrErrno<I: FixedWidthInteger>(
@@ -36,6 +41,21 @@ internal func valueOrErrno<I: FixedWidthInteger>(
} while true
}

internal func valueOrErrno<T: Equatable>(
valueOnFail: T,
retryOnInterrupt: Bool,
_ f: () -> T
) -> Result<T, Errno> {
repeat {
switch valueOrErrno(valueOnFail: valueOnFail, f()) {
case .success(let r): return .success(r)
case .failure(let err):
guard retryOnInterrupt && err == .interrupted else { return .failure(err) }
break
}
} while true
}

// @available(macOS 10.16, iOS 14.0, watchOS 7.0, tvOS 14.0, *)
internal func nothingOrErrno<I: FixedWidthInteger>(
retryOnInterrupt: Bool, _ f: () -> I