[red-knot] Move UnionBuilder tests to Markdown #15374

Merged 2 commits on Jan 9, 2025
125 changes: 125 additions & 0 deletions crates/red_knot_python_semantic/resources/mdtest/union_types.md
@@ -0,0 +1,125 @@
# Union types

This test suite covers certain basic properties and simplification strategies for union types.

## Basic unions

```py
from typing import Literal

def _(u1: int | str, u2: Literal[0] | Literal[1]) -> None:
    reveal_type(u1)  # revealed: int | str
    reveal_type(u2)  # revealed: Literal[0, 1]
```

## Duplicate elements are collapsed

```py
def _(u1: int | int | str, u2: int | str | int) -> None:
    reveal_type(u1)  # revealed: int | str
    reveal_type(u2)  # revealed: int | str
```

## `Never` is removed

`Never` is an empty set, a type with no inhabitants. Its presence in a union is always redundant,
and so we eagerly simplify it away. `NoReturn` is equivalent to `Never`.

```py
from typing_extensions import Never, NoReturn

def never(u1: int | Never, u2: int | Never | str) -> None:
    reveal_type(u1)  # revealed: int
    reveal_type(u2)  # revealed: int | str

def noreturn(u1: int | NoReturn, u2: int | NoReturn | str) -> None:
    reveal_type(u1)  # revealed: int
    reveal_type(u2)  # revealed: int | str
```

## Flattening of nested unions

```py
from typing import Literal

def _(
    u1: (int | str) | bytes,
    u2: int | (str | bytes),
    u3: int | (str | (bytes | complex)),
) -> None:
    reveal_type(u1)  # revealed: int | str | bytes
    reveal_type(u2)  # revealed: int | str | bytes
    reveal_type(u3)  # revealed: int | str | bytes | complex
```
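The three properties tested so far (duplicates collapse, `Never` is dropped, nested unions are flattened) can be pictured with a small toy model. This is only a sketch under stated assumptions, not red-knot's `UnionBuilder`: types are plain strings, and `NEVER`, `UnionOf`, and `build_union` are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the `Never` type in this toy model.
NEVER = "Never"


@dataclass(frozen=True)
class UnionOf:
    """Toy union: an ordered tuple of distinct, non-union elements."""

    elements: tuple


def build_union(*elements):
    """Build a toy union, flattening, de-duplicating and dropping NEVER."""
    flat = []

    def add(element):
        if isinstance(element, UnionOf):
            # Flatten nested unions by adding their elements one by one.
            for inner in element.elements:
                add(inner)
        elif element == NEVER:
            # `Never` has no inhabitants, so it contributes nothing.
            return
        elif element not in flat:
            # Keep only the first occurrence of each element.
            flat.append(element)

    for element in elements:
        add(element)

    if not flat:
        return NEVER  # an empty union is `Never`
    if len(flat) == 1:
        return flat[0]  # a one-element union is just that element
    return UnionOf(tuple(flat))
```

The two early returns mirror what the Rust unit tests in this PR assert: building a union from no elements gives `Never`, and a single-element union is the element itself.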

## Simplification using subtyping

The type `S | T` can be simplified to `T` if `S` is a subtype of `T`:

```py
from typing_extensions import Literal, LiteralString

def _(
    u1: str | LiteralString, u2: LiteralString | str, u3: Literal["a"] | str | LiteralString, u4: str | bytes | LiteralString
) -> None:
    reveal_type(u1)  # revealed: str
    reveal_type(u2)  # revealed: str
    reveal_type(u3)  # revealed: str
    reveal_type(u4)  # revealed: str | bytes
Comment on lines +62 to +68
Contributor Author

These tests would be more readable if we had something like C++'s std::declval<T>() (#13789) which pretends to create a value of type T. Using knot_extensions, we could now easily implement a function like the following (using PEP 747 language):

def value_of[T](typ: TypeForm[T]) -> T: ...

that would allow us to write these tests in a single line without having to introduce dummy parameters and without having the definitions a few lines away from the assertions:

Suggested change
-def _(
-    u1: str | LiteralString, u2: LiteralString | str, u3: Literal["a"] | str | LiteralString, u4: str | bytes | LiteralString
-) -> None:
-    reveal_type(u1)  # revealed: str
-    reveal_type(u2)  # revealed: str
-    reveal_type(u3)  # revealed: str
-    reveal_type(u4)  # revealed: str | bytes
+reveal_type(value_of(str | LiteralString))  # revealed: str
+reveal_type(value_of(LiteralString | str))  # revealed: str
+reveal_type(value_of(Literal["a"] | str | LiteralString))  # revealed: str
+reveal_type(value_of(str | bytes | LiteralString))  # revealed: str | bytes

A disadvantage of this approach would be that it makes more of these tests dependent on knot_extensions. @AlexWaygood is still "a little sceptical" about this idea, so I kept the usual strategy of using function parameters.

Contributor

Interestingly, typing.cast (with a dummy second argument), is effectively the value_of function. I wouldn't be opposed to adding value_of as a variant that doesn't require the dummy second argument, at the same time we add typing.cast. But I don't feel strongly either way, I think the function arguments are reasonable too.

Contributor

> A disadvantage of this approach would be that it makes more of these tests dependent on knot_extensions.

Also, could consider reveal_type_of_value(...) function which has same logic of reveal_type that doesn't need to explicitly import?

Contributor Author

> Also, could consider reveal_type_of_value(...) function which has same logic of reveal_type that doesn't need to explicitly import?

More like reveal_type_of_type_expression or something like that, but yes — that's a good idea! We could take it one step further and implement the analog to assert_type on that level. Basically assert_equal_type(type1, type2).

And now that I write it out, I realize that we sort-of already have this in knot_extensions. We can write

static_assert(is_equivalent_to(str | LiteralString, str))

Now that only works for fully static types, but once we have is_gradual_equivalent_to, that would be a (slightly more verbose) alternative to assert_equal_type:

static_assert(is_gradual_equivalent_to(Unknown | Unknown | str, Unknown | str))

Contributor Author

> Interestingly, typing.cast (with a dummy second argument), is effectively the value_of function.

"It's like typing.cast, just without having a value in the first place!" is exactly how I pitched it to Alex, and he still didn't want it 😄

```
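As a rough illustration of the rule "`S | T` simplifies to `T` if `S` is a subtype of `T`", the sketch below uses runtime classes and `issubclass` in place of the checker's subtype relation. That substitution is an assumption made for illustration only: `simplify` is a hypothetical helper, and the element ordering it produces is not claimed to match the type checker's.

```python
def simplify(classes):
    """Drop every class that is a subclass of another class in the list."""
    result = []
    for new in classes:
        # If an element already present is a supertype of `new`
        # (or equal to it), `new` adds nothing to the union.
        if any(issubclass(new, existing) for existing in result):
            continue
        # Otherwise `new` may subsume elements that were added earlier.
        result = [existing for existing in result if not issubclass(existing, new)]
        result.append(new)
    return result


# `bool` is a subclass of `int`, so it is redundant in either order.
print(simplify([bool, int]))
print(simplify([int, bool]))
```

Checking both directions (is the new element redundant, and does it make earlier elements redundant) is what makes the result independent of whether the subtype appears before or after its supertype, as in the `u1` and `u2` cases above.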

## Boolean literals

The union `Literal[True] | Literal[False]` is exactly equivalent to `bool`:

```py
from typing import Literal

def _(
    u1: Literal[True, False],
    u2: bool | Literal[True],
    u3: Literal[True] | bool,
    u4: Literal[True] | Literal[True, 17],
    u5: Literal[True, False, True, 17],
) -> None:
    reveal_type(u1)  # revealed: bool
    reveal_type(u2)  # revealed: bool
    reveal_type(u3)  # revealed: bool
    reveal_type(u4)  # revealed: Literal[True, 17]
    reveal_type(u5)  # revealed: bool | Literal[17]
```
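The literal cases above can be mimicked with a small sketch that works on runtime values. It is an assumption-laden toy, not the checker's implementation: `collapse_bool_literals` is a hypothetical helper, literal types are represented by their values, and the class `bool` stands in for the `bool` instance type.

```python
def collapse_bool_literals(literals):
    """De-duplicate literal values; merge True and False into `bool`."""
    unique = []
    for value in literals:
        # Compare the type as well as the value, because True == 1 in Python.
        if not any(type(value) is type(seen) and value == seen for seen in unique):
            unique.append(value)

    has_true = any(value is True for value in unique)
    has_false = any(value is False for value in unique)
    if not (has_true and has_false):
        # Only one of the two boolean literals is present: nothing to merge.
        return unique

    result = []
    for value in unique:
        if isinstance(value, bool):
            # Both boolean literals are present, so together they cover
            # all of `bool`; emit `bool` once, where the first one stood.
            if bool not in result:
                result.append(bool)
        else:
            result.append(value)
    return result
```

With this model, `[True, True, 17]` keeps the single literal `True`, while `[True, False, True, 17]` becomes `[bool, 17]`, matching the shapes of `u4` and `u5`.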

## Do not erase `Unknown`

```py
from knot_extensions import Unknown

def _(u1: Unknown | str, u2: str | Unknown) -> None:
    reveal_type(u1)  # revealed: Unknown | str
    reveal_type(u2)  # revealed: str | Unknown
```

## Collapse multiple `Unknown`s

Since `Unknown` is a gradual type, it is not a subtype of anything, but multiple `Unknown`s in a
union are still redundant:

```py
from knot_extensions import Unknown

def _(u1: Unknown | Unknown | str, u2: Unknown | str | Unknown, u3: str | Unknown | Unknown) -> None:
    reveal_type(u1)  # revealed: Unknown | str
    reveal_type(u2)  # revealed: Unknown | str
    reveal_type(u3)  # revealed: str | Unknown
```

## Subsume multiple elements

Simplifications still apply when `Unknown` is present.

```py
from knot_extensions import Unknown

def _(u1: str | Unknown | int | object):
    reveal_type(u1)  # revealed: Unknown | object
```
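The interaction between `Unknown` and subtype-based simplification can be sketched with a toy model. Everything here is an assumption for illustration: `UNKNOWN` is a hypothetical sentinel, runtime classes with `issubclass` stand in for static types, and `simplify` is not the `UnionBuilder` algorithm itself.

```python
# Hypothetical sentinel standing in for the gradual type `Unknown`.
UNKNOWN = object()


def is_subtype(left, right):
    """Toy subtype check: UNKNOWN is neither a subtype nor a supertype."""
    if left is UNKNOWN or right is UNKNOWN:
        return False
    return issubclass(left, right)


def simplify(elements):
    """Simplify a toy union while keeping at most one UNKNOWN."""
    result = []
    for new in elements:
        if new is UNKNOWN:
            # Never erase UNKNOWN, but do not keep it twice.
            if not any(existing is UNKNOWN for existing in result):
                result.append(new)
            continue
        # Skip `new` if something already present subsumes it.
        if any(is_subtype(new, existing) for existing in result):
            continue
        # Remove earlier elements that `new` subsumes.
        result = [existing for existing in result if not is_subtype(existing, new)]
        result.append(new)
    return result
```

In this model, `simplify([str, UNKNOWN, int, object])` leaves `[UNKNOWN, object]`: `object` subsumes `str` and `int`, while `UNKNOWN` survives because no subtype relation involves it.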
103 changes: 6 additions & 97 deletions crates/red_knot_python_semantic/src/types/builder.rs
@@ -396,119 +396,28 @@ mod tests {
     use test_case::test_case;

     #[test]
-    fn build_union() {
-        let db = setup_db();
-        let t0 = Type::IntLiteral(0);
-        let t1 = Type::IntLiteral(1);
-        let union = UnionType::from_elements(&db, [t0, t1]).expect_union();
-
-        assert_eq!(union.elements(&db), &[t0, t1]);
-    }
-
-    #[test]
-    fn build_union_single() {
-        let db = setup_db();
-        let t0 = Type::IntLiteral(0);
-        let ty = UnionType::from_elements(&db, [t0]);
-        assert_eq!(ty, t0);
-    }
-
-    #[test]
-    fn build_union_empty() {
+    fn build_union_no_elements() {
         let db = setup_db();
         let ty = UnionBuilder::new(&db).build();
         assert_eq!(ty, Type::Never);
     }

     #[test]
-    fn build_union_never() {
+    fn build_union_single_element() {
         let db = setup_db();
         let t0 = Type::IntLiteral(0);
-        let ty = UnionType::from_elements(&db, [t0, Type::Never]);
+        let ty = UnionType::from_elements(&db, [t0]);
         assert_eq!(ty, t0);
     }

     #[test]
-    fn build_union_bool() {
-        let db = setup_db();
-        let bool_instance_ty = KnownClass::Bool.to_instance(&db);
-
-        let t0 = Type::BooleanLiteral(true);
-        let t1 = Type::BooleanLiteral(true);
-        let t2 = Type::BooleanLiteral(false);
-        let t3 = Type::IntLiteral(17);
-
-        let union = UnionType::from_elements(&db, [t0, t1, t3]).expect_union();
-        assert_eq!(union.elements(&db), &[t0, t3]);
-
-        let union = UnionType::from_elements(&db, [t0, t1, t2, t3]).expect_union();
-        assert_eq!(union.elements(&db), &[bool_instance_ty, t3]);
-
-        let result_ty = UnionType::from_elements(&db, [bool_instance_ty, t0]);
-        assert_eq!(result_ty, bool_instance_ty);
-
-        let result_ty = UnionType::from_elements(&db, [t0, bool_instance_ty]);
-        assert_eq!(result_ty, bool_instance_ty);
-    }
-
-    #[test]
-    fn build_union_flatten() {
+    fn build_union_two_elements() {
         let db = setup_db();
         let t0 = Type::IntLiteral(0);
         let t1 = Type::IntLiteral(1);
-        let t2 = Type::IntLiteral(2);
-        let u1 = UnionType::from_elements(&db, [t0, t1]);
-        let union = UnionType::from_elements(&db, [u1, t2]).expect_union();
-
-        assert_eq!(union.elements(&db), &[t0, t1, t2]);
-    }
-
-    #[test]
-    fn build_union_simplify_subtype() {
-        let db = setup_db();
-        let t0 = KnownClass::Str.to_instance(&db);
-        let t1 = Type::LiteralString;
-        let u0 = UnionType::from_elements(&db, [t0, t1]);
-        let u1 = UnionType::from_elements(&db, [t1, t0]);
-
-        assert_eq!(u0, t0);
-        assert_eq!(u1, t0);
-    }
-
-    #[test]
-    fn build_union_no_simplify_unknown() {
-        let db = setup_db();
-        let t0 = KnownClass::Str.to_instance(&db);
-        let t1 = Type::Unknown;
-        let u0 = UnionType::from_elements(&db, [t0, t1]);
-        let u1 = UnionType::from_elements(&db, [t1, t0]);
-
-        assert_eq!(u0.expect_union().elements(&db), &[t0, t1]);
-        assert_eq!(u1.expect_union().elements(&db), &[t1, t0]);
-    }
-
-    #[test]
-    fn build_union_simplify_multiple_unknown() {
-        let db = setup_db();
-        let t0 = KnownClass::Str.to_instance(&db);
-        let t1 = Type::Unknown;
-
-        let u = UnionType::from_elements(&db, [t0, t1, t1]);
-
-        assert_eq!(u.expect_union().elements(&db), &[t0, t1]);
-    }
-
-    #[test]
-    fn build_union_subsume_multiple() {
-        let db = setup_db();
-        let str_ty = KnownClass::Str.to_instance(&db);
-        let int_ty = KnownClass::Int.to_instance(&db);
-        let object_ty = KnownClass::Object.to_instance(&db);
-        let unknown_ty = Type::Unknown;
-
-        let u0 = UnionType::from_elements(&db, [str_ty, unknown_ty, int_ty, object_ty]);
+        let union = UnionType::from_elements(&db, [t0, t1]).expect_union();

-        assert_eq!(u0.expect_union().elements(&db), &[unknown_ty, object_ty]);
+        assert_eq!(union.elements(&db), &[t0, t1]);
     }

     impl<'db> IntersectionType<'db> {