Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Efficient (unboxed) tagged unions #388

Open
mneumann opened this issue Sep 17, 2022 · 5 comments
Open

Efficient (unboxed) tagged unions #388

mneumann opened this issue Sep 17, 2022 · 5 comments
Labels
BLOCKED We can't or shouldn't work on this yet, for reasons explained in the ticket and/or comments. complexity 2: significant This is non-trivial, but still not a major undertaking kind: feature: language Adding a new language feature

Comments

@mneumann
Copy link
Contributor

  • Think of :enum with some payload.
  • At the moment it is unclear whether returning or yielding U64 | I64 | F64 | Pair(U64, U64) will be boxed or not.
  • Boxing would forbid these kind of types in performance critical code.
  • Either we have to trust the compiler to optimize accordingly or introduce a new type Union(A | B | C) that would enforce un-boxing.

Would be nice to write code like this:


:fun some_values
  :yields (U64 | None | F64)
  yield U64[1]
  yield None
  yield F64[1.3]

:fun caller
  @some_values -> (|value|
    case value <: (
    | U64 |  ...
    | None | ...
    | F64 | ...
    )
  )
@jemc
Copy link
Contributor

jemc commented Sep 17, 2022

I think something like this would need to ride on the back of the "fat pointers"-related changes that need to happen, as both relate to keeping type information as part of the use site of the runtime value (rather than in pointed-to memory).

In other words, I think what you're describing is mostly just a "fat value" (no pointer).

@jemc jemc added kind: feature: language Adding a new language feature complexity 2: significant This is non-trivial, but still not a major undertaking BLOCKED We can't or shouldn't work on this yet, for reasons explained in the ticket and/or comments. labels Sep 17, 2022
@mneumann
Copy link
Contributor Author

Not sure if this is related to fat-pointers...This could also be just another built-in type :union that behaves like a Rust :enum. A Savi union type (MsgPack.Nil | MsgPack.Bool | ...) would be anonymous, so one would have to wrap it in a struct to make it a newtype.

A valid use-case is MessagePack.Token (https://github.com/mneumann/savi-MessagePack/blob/main/src/MessagePack.Token.savi) which tries to simulate a tagged union but requires lots of boiler-plate code in order to do so. In Rust the same would look like:

enum MessagePackToken {
  Nil,
  Bool(bool),
  UInt(u64),
  Int(i64),
  Float(f64),
  BinaryAhead {len: u32},
  StringAhead {len: u32},
  ArrayAhead {nelems: u32},
  MapAhead {nitems: u32},
  ExtAhead {type: u8, len: u32}
}

And the sizeof MessagePackToken would be the size of the tag + sizeof the largest "case". This, combined with structural pattern matching is very powerful, but is less OOP.

@jemc
Copy link
Contributor

jemc commented Sep 19, 2022

The approach I'm thinking of that piggybacks on "fat pointers", could be thought of as "fat values", but the result is the same as what you wrote here:

And the [size] would be the size of the tag + sizeof the largest "case"

The difference is that with the approach I'm suggesting we should be able to use anonymous type unions (like U64 | I64 | F64 | Pair(U64, U64) in your initial post) and have those get quietly converted into a tagged union / fat value under the hood, without boxing.

The advantage is that you need not declare all the cases ahead of time in a :union declaration - instead you can create arbitrary ad hoc unions using | that contain whatever types you need for your particular purpose.

@mneumann
Copy link
Contributor Author

That sounds good! So If I want:

enum Expr {
  Add(u32, u32),
  Sub(u32, u32)
}

We would just write in Savi:

:struct Add
  :let a U32
  :let b U32
  :new (@a, @b)
// maybe one day this can be shortened to just:
:data Add(a U32, b U32)

:struct Sub
  :let a U32
  :let b U32
  :new (@a, @b)

:alias Expr = Add | Sub

And If I would want a newtype, I'd write instead of the alias:

:struct Expr
  :let inner (Add | Sub)
  :new(@inner)

@jemc
Copy link
Contributor

jemc commented Sep 19, 2022

Yes, that's the idea.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
BLOCKED We can't or shouldn't work on this yet, for reasons explained in the ticket and/or comments. complexity 2: significant This is non-trivial, but still not a major undertaking kind: feature: language Adding a new language feature
Projects
None yet
Development

No branches or pull requests

2 participants