Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warning when comparing sequences #1387

Open
6 tasks done
esbenbjerre opened this issue Oct 2, 2024 · 9 comments
Open
6 tasks done

Warning when comparing sequences #1387

esbenbjerre opened this issue Oct 2, 2024 · 9 comments

Comments

@esbenbjerre
Copy link

esbenbjerre commented Oct 2, 2024

I propose we issue a warning when comparing sequences.

As of now the behaviour of equals/comparison on sequences depends on use of constants and/or what the sequence is backed by.

seq { 1 .. 3 } = seq { 1 .. 3 } // false
seq [ 1; 2; 3 ] = seq [ 1; 2; 3 ] // true

Another example is Seq.groupBy that in some cases will produce unexpected results.

open System

let n = 10

type A = {
  I: int
  Xs: int seq
}

let someSeq =
  Seq.init n (fun i ->
    {
      I = i
      Xs =
        if i >= n / 2 then
          seq [ 1; 2; 3 ]
        else
          seq [ 3; 2; 1 ]
    })

// 2
let ok =
  someSeq
  |> Seq.groupBy (fun x -> x.Xs |> Seq.map id |> List.ofSeq)
  |> Seq.map (fun (_, xs) -> Seq.length xs)
  |> Seq.length

// 10
let unexpected =
  someSeq
  |> Seq.groupBy (fun x -> x.Xs |> Seq.map id)
  |> Seq.map (fun (_, xs) -> Seq.length xs)
  |> Seq.length

Pros and Cons

The advantage of making this adjustment to F# is that it would be more clear that equality between two sequences does not always use structural comparison.

A disadvantage of making this adjustment to F# is that the source of the warning may be confusing (as in the case of Seq.groupBy.

Extra information

Estimated cost (XS, S, M, L, XL, XXL): S

Affidavit (please submit!)

Please tick these items by placing a cross in the box:

  • This is not a question (e.g. like one you might ask on StackOverflow) and I have searched StackOverflow for discussions of this issue
  • This is a language change and not purely a tooling change (e.g. compiler bug, editor support, warning/error messages, new warning, non-breaking optimisation) belonging to the compiler and tooling repository
  • This is not something which has obviously "already been decided" in previous versions of F#. If you're questioning a fundamental design decision that has obviously already been taken (e.g. "Make F# untyped") then please don't submit it
  • I have searched both open and closed suggestions on this site and believe this is not a duplicate

Please tick all that apply:

  • This is not a breaking change to the F# language design
  • I or my company would be willing to help implement and/or test this

For Readers

If you would like to see this issue implemented, please click the 👍 emoji on this issue. These counts are used to generally order the suggestions by engagement.

@esbenbjerre esbenbjerre changed the title Warning when comparing two seqs with equals operator Warning when comparing two seqs Oct 2, 2024
@esbenbjerre esbenbjerre changed the title Warning when comparing two seqs Warning when comparing seqs Oct 2, 2024
@esbenbjerre esbenbjerre changed the title Warning when comparing seqs Warning when comparing sequences Oct 2, 2024
@vzarytovskii
Copy link

Probably good case for analyzer

@charlesroddie
Copy link

What is the implementation of this? I thought that seq is a word for IEnumerable'<'T> and interfaces are equal if they are reference equal. If that is the case, an analyzer which warns when equality is reference equality would be very valuable and would include this.

However this does not explain why the following is true:

seq [ 1; 2; 3 ] = seq [ 1; 2; 3 ] // true

That is very surprising to me.

@brianrourkeboll
Copy link

However this does not explain why the following is true:

seq [ 1; 2; 3 ] = seq [ 1; 2; 3 ] // true

It's delegating to the concrete type's Equals implementation at runtime — in this case, that of FSharpList<'T>. The square brackets are creating F# lists, and the seq function is just upcasting them.

Here's a trivial example that doesn't involve the special dual-purpose `seq` value (which can be both a normal F# function and something like a computation expression builder).
type IFace = interface end

let iface x = x :> IFace

type A (x : int) =
    member _.X = x
    interface IFace

iface (A 3) = iface (A 3) // false

type B (x : int) =
    member _.X = x
    interface IFace
    override this.Equals obj = match obj with :? B as other -> this.X = other.X | _ -> false
    override this.GetHashCode () = this.X

iface (B 3) = iface (B 3) // true

Probably good case for analyzer

It would unfortunately probably be rather difficult to create a robust analyzer for this — you'd somehow need to track a value's concrete type from instantiation through to any usage that might call Equals or GetHashCode (e.g., where the equality constraint is applied). In the examples above, you'd need to:

  1. Know that the concrete type produced by seq { … } computation expressions and functions like Seq.init (or any others you cared about) do not implement structural equality.
  2. Track the values so produced through a potentially unbounded series of calls to invocations of =, Seq.groupBy, etc.

@Happypig375
Copy link
Contributor

Or rather have open StructuralComparison just like https://fsharp.github.io/fsharp-core-docs/reference/fsharp-core-operators-nonstructuralcomparison.html

which forces all operators to work with structural comparison semantics if collections are detected?

@Happypig375
Copy link
Contributor

One pitfall is infinite sequences; one call to the equality operator that considers all sequences might hang the program. It is possible to set a time limit (e.g. 0.1s) for testing sequence equality, but any limit is arbitrary.

@T-Gro
Copy link

T-Gro commented Jan 21, 2025

Re: Structural comparison for seq:
seq is about data as well as about computation. Apart from trivial cases, the correct content-based equality would need to enumerate the full contents. That can in the general case be a one-time operation only (incl. possible side effect such as DB reading), making this a bad option.

When one wants to do just that, it is better to be done explicitly or with a locally-provided overwrite:

let (==) (x:seq<_>) (y:seq<_>) = System.Linq.Enumerable.SequenceEqual(x,y)
let a = seq {1..3}
let b = seq {1;2;3}
printfn "%A" (a==b)

Re: Warning
A change that could be workable is an opt-in warning that would make seq<_> fail the equality constraint at F# compile-time.
It would not affect runtime behavior, and would not spot all usages of equality when C# libs are involved.
The motivating example with groupBy would be covered.

(i.e. treat seq similarily to a type that has NoEqualityAttribute on it)

@charlesroddie
Copy link

charlesroddie commented Jan 21, 2025

A change that could be workable is an opt-in warning that would make seq<_> fail the equality constraint at F# compile-time.

Can we make something more general/consistent out of this?

Applying this to all interfaces for example. (Not sure whether interfaces with some relevant IEquatable<'I> should be excluded.)

I am seeing a lot of things compile around equality which should at least warn:

type X() = class end
X() = X() // compiles and evaluates to false but very dangerous. A warning would force users to write explicitly `Object.ReferenceEquals` if that was really the intent.

type ITest = interface end
type Y =
    | YCase
    interface ITest
(YCase :> ITest) = (YCase :> ITest) // compiles and evaluates to true, but two `ITest`s should not be equatable.

@T-Gro
Copy link

T-Gro commented Jan 22, 2025

It would be great if common types with good structural equality were scanned first - if this warning is done, it should have good coverage for existing idioms (both in the framework as well as in generated code coming out of C# constructs - e.g. records).

For example, F# would emit something like X : IEquatable<X>, IStructuralEquatable - would that have a good recall if used as detection mechanism?

@charlesroddie
Copy link

It would be great if common types with good structural equality were scanned first

Of course a 'T which should satisfy equality (such as Y above) should continue to do so, only a interface (including the case where 'T satisfies the interface) (such as YCase :> I) should not.

A 'T where 'T :> IEquatable<'T> should satisfy equality (tracked in #1280) and that would include the (rare?) case where ``T` is an interface type. F# already emits this interface for DUs and records normally.

I wonder if something comprehensive needs to be written up in an equality RFC for 1280, including the current equality mechanism, adding IEquatable<'T> and possibly IStructuralEquatable, and removing/warning on some of the cases in this issue. I'm tied up for the next 1-2months but could contribute to this afterwards.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants