Circular References #204
Comments
IIRC, we improved the error message for self-referential zero tangent types in #144. Is that fundamentally different from circular references? I am asking because the error handling mechanism didn't work for circular references.
You are correct that we improved the error messages -- if you take a look at the example in this issue, you'll see that the error message is the improved one. The stack overflow (e.g. here) seems to have circumvented it though; I think it must be because it doesn't go via
Self-referential vs circular references is a mistake on my part -- I should have been calling them circular references the whole time.
Thanks for the clarification. The solution in #144 indeed does not seem to be general enough and deserves improvements.
We would just have to ensure that the error that gets generated via
A summary of the progress on this issue: after #210 and #228, the problem described in this issue should be fixed. However, I am not sure we want to close the issue yet, because the testing pipeline still needs to be adapted: many functions still assume that circular references are illegal, so we can't properly test the resolution of this issue yet (ref #228 (comment)).
Agreed -- loads of progress has been made, but we still need to get circular reference handling fully integrated in the test suite.
I've re-added the high-priority label because @mhauru has pointed out that recent non-breaking changes have exposed to users some bits of the tangent interface which do not yet support circular references. In particular, I'll make a start on addressing this after my talk on Wednesday.
This is blocked by JuliaLang/julia#56775
Properly writing this up is motivated by ongoing problems with circular references, highlighted in a PR linked to by #197. These circular references appear in the testing infrastructure for Turing.jl -- while they could in principle be removed, it's inconvenient to do so, and Tapir.jl ought to be able to handle them.
The Problem
You can construct circular references via seemingly straightforward types as follows:
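A minimal sketch of such a construction, assuming a hypothetical mutable type `Node` (the name and fields are illustrative and not taken from Tapir.jl):

```julia
# Hypothetical type for illustration only.
mutable struct Node
    value::Float64
    next::Any  # untyped so that it can hold `nothing` or another `Node`
    Node(value) = new(value, nothing)
end

# A cycle of length one: `a` refers to itself.
a = Node(1.0)
a.next = a

# A cycle of length two: `b` and `c` refer to each other.
b = Node(2.0)
c = Node(3.0)
b.next = c
c.next = b
```

Following `next` from any of these objects never reaches a terminating `nothing`.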
Tapir.jl can handle these if they appear inside functions that are being differentiated. For example,
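a function of roughly the following shape builds a cycle internally but only accepts and returns a `Float64`. This is a sketch using a hypothetical `Node` type and a hypothetical function name `internal_cycle`; only plain Julia evaluation is shown, not a call into Tapir.jl:

```julia
# Hypothetical type for illustration only.
mutable struct Node
    value::Float64
    next::Any
    Node(value) = new(value, nothing)
end

# The circular reference is created and discarded inside the function body;
# neither the argument nor the return value contains a cycle.
function internal_cycle(x::Float64)
    n = Node(x)
    n.next = n
    return 2 * n.next.value
end

y = internal_cycle(3.0)
```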
However, the same is not true if they occur as an argument to, or a value returned from, a function being differentiated:
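The failure mode can be illustrated with a naive recursive traversal. This is a sketch under stated assumptions: `Node` and `naive_zero` are hypothetical and are not Tapir.jl's implementation, and the depth limit exists only so that the example fails quickly instead of overflowing the stack:

```julia
# Hypothetical type for illustration only.
mutable struct Node
    value::Float64
    next::Any
    Node(value) = new(value, nothing)
end

# Naive traversal: recurse through every field without remembering
# which objects have already been visited.
naive_zero(::Nothing, depth=0) = nothing
function naive_zero(x::Node, depth=0)
    depth > 1_000 && error("recursion did not terminate")
    z = Node(0.0)
    z.next = naive_zero(x.next, depth + 1)
    return z
end

# Works for an acyclic chain of two nodes.
chain = Node(1.0)
chain.next = Node(2.0)
chain_zero = naive_zero(chain)

# Never terminates on its own for a cycle; here the depth limit is hit.
a = Node(1.0)
a.next = a
failed = try
    naive_zero(a)
    false
catch
    true
end
```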
The problem is that neither `zero_tangent` nor `randn_tangent` accounts for the possibility of circular references.
A Solution
Modify `zero_tangent` and `randn_tangent` to know about, and correctly deal with, circular references.
The simplest solution is to keep track of all memory addresses / objects with fixed memory addresses which have been allocated during the current call to `zero_tangent` or `randn_tangent`, and to avoid generating a new tangent if one already exists for the memory address associated to a primal.
For example, a sketch implementation for `zero_tangent`
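One way the idea could look, as a hedged sketch rather than Tapir.jl's actual code: `Node`, `NodeTangent` and `sketch_zero_tangent` are hypothetical names, and the tangent representation is deliberately simplified. The essential step is registering the new tangent in the `IdDict` before recursing into the fields.

```julia
# Hypothetical primal type.
mutable struct Node
    value::Float64
    next::Any
    Node(value) = new(value, nothing)
end

# Hypothetical stand-in for the tangent of a `Node`.
mutable struct NodeTangent
    value::Float64
    next::Any
    NodeTangent() = new(0.0, nothing)
end

# Entry point: allocate one dictionary per top-level call.
sketch_zero_tangent(x) = sketch_zero_tangent(x, IdDict{Any,Any}())

sketch_zero_tangent(::Float64, ::IdDict) = 0.0
sketch_zero_tangent(::Nothing, ::IdDict) = nothing
function sketch_zero_tangent(x::Node, seen::IdDict)
    # Reuse the tangent if this exact object has been visited already.
    haskey(seen, x) && return seen[x]::NodeTangent
    t = NodeTangent()
    seen[x] = t  # register before recursing, so that cycles terminate
    t.next = sketch_zero_tangent(x.next, seen)
    return t
end

a = Node(1.0)
a.next = a
ta = sketch_zero_tangent(a)  # terminates, and `ta.next === ta`
```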
This strategy exactly mirrors that of `deepcopy` -- see its docstring and `Base.deepcopy_internal`. A similar strategy is (I believe) used by Enzyme.jl.
A Side-Benefit: Aliasing
This will also ensure that if two arguments to a function alias each other, such as
that provided we call `zero_tangent` or `randn_tangent` once for all arguments, the correct result will emerge.
Performance Concerns
This additional work necessarily has some overhead associated to it in general. However, for bits types, there is no risk of circular referencing. For such types, the checks are not necessary, and the `IdDict` allocation can be avoided. In this case, the performance will be identical to how it currently is.
Moreover, the above `IdDict` implementation is naive. We should just use `pointer_from_objref` to obtain a `Ptr{Nothing}` which points to the address associated to a given object. We could therefore use a `Dict{Ptr{Nothing}, Any}` to store the tangents.
Furthermore, when querying an element from such a `Dict`, we can assert that the value returned is of type `tangent_type(P)`, where `P` is the primal type, thus avoiding propagating any type instabilities.
Types which refer to themselves are a different problem
This problem is distinct from that associated to types such as
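A hedged illustration of the distinction, assuming a hypothetical `LinkedList` type (the name and fields are illustrative): the type's definition mentions itself, yet a finite value of that type contains no cycle, so a plain recursive traversal terminates.

```julia
# Hypothetical type whose definition refers to itself.
struct LinkedList
    value::Float64
    next::Union{LinkedList, Nothing}
end

# A finite, acyclic value of the self-referential type.
list = LinkedList(1.0, LinkedList(2.0, nothing))

# Plain recursion over the value terminates at `nothing`.
list_length(::Nothing) = 0
list_length(l::LinkedList) = 1 + list_length(l.next)

n = list_length(list)
```

Here the recursion lives in the type definition, whereas a circular reference is a property of particular objects at run time.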