build based on 1152c08
Documenter.jl committed Jan 9, 2025
1 parent 8ecec65 commit 7b5067d
Showing 13 changed files with 52 additions and 52 deletions.
2 changes: 1 addition & 1 deletion previews/PR435/.documenter-siteinfo.json
@@ -1 +1 @@
{"documenter":{"julia_version":"1.11.2","generation_timestamp":"2025-01-09T08:53:48","documenter_version":"1.8.0"}}
{"documenter":{"julia_version":"1.11.2","generation_timestamp":"2025-01-09T10:27:48","documenter_version":"1.8.0"}}

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion previews/PR435/index.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion previews/PR435/known_limitations/index.html
@@ -38,4 +38,4 @@
Mooncake.value_and_gradient!!(rule, foo, [5.0, 4.0])

# output
(4.0, (NoTangent(), [0.0, 1.0]))</code></pre><p><em><strong>The Solution</strong></em></p><p>This is only really a problem for tangent / fdata / rdata generation functionality, such as <code>zero_tangent</code>. As a work-around, AD testing functionality permits users to pass in <code>CoDual</code>s. So if you are testing something involving a pointer, you will need to construct its tangent yourself, and pass a <code>CoDual</code> to e.g. <code>Mooncake.TestUtils.test_rule</code>.</p><p>While pointers tend to be a low-level implementation detail in Julia code, you could in principle actually be interested in differentiating a function of a pointer. In this case, you will not be able to use <code>Mooncake.value_and_gradient!!</code> as this requires the use of <code>zero_tangent</code>. Instead, you will need to use lower-level (internal) functionality, such as <code>Mooncake.__value_and_gradient!!</code>, or use the rule interface directly.</p><p>Honestly, your best bet is just to avoid differentiating functions whose arguments are pointers if you can.</p></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../developer_documentation/internal_docstrings/">« Internal Docstrings</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option 
value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.8.0 on <span class="colophon-date" title="Thursday 9 January 2025 08:53">Thursday 9 January 2025</span>. Using Julia version 1.11.2.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
(4.0, (NoTangent(), [0.0, 1.0]))</code></pre><p><em><strong>The Solution</strong></em></p><p>This is only really a problem for tangent / fdata / rdata generation functionality, such as <code>zero_tangent</code>. As a work-around, AD testing functionality permits users to pass in <code>CoDual</code>s. So if you are testing something involving a pointer, you will need to construct its tangent yourself, and pass a <code>CoDual</code> to e.g. <code>Mooncake.TestUtils.test_rule</code>.</p><p>While pointers tend to be a low-level implementation detail in Julia code, you could in principle actually be interested in differentiating a function of a pointer. In this case, you will not be able to use <code>Mooncake.value_and_gradient!!</code> as this requires the use of <code>zero_tangent</code>. Instead, you will need to use lower-level (internal) functionality, such as <code>Mooncake.__value_and_gradient!!</code>, or use the rule interface directly.</p><p>Honestly, your best bet is just to avoid differentiating functions whose arguments are pointers if you can.</p></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../developer_documentation/internal_docstrings/">« Internal Docstrings</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option 
value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.8.0 on <span class="colophon-date" title="Thursday 9 January 2025 10:27">Thursday 9 January 2025</span>. Using Julia version 1.11.2.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
@@ -23,4 +23,4 @@
D f [x] (\dot{x}) &amp;= [(D \mathcal{l} [g(x)]) \circ (D g [x])](\dot{x}) \nonumber \\
&amp;= \langle \bar{y}, D g [x] (\dot{x}) \rangle \nonumber \\
&amp;= \langle D g [x]^\ast (\bar{y}), \dot{x} \rangle, \nonumber
\end{align}\]</p><p>from which we conclude that <span>$D g [x]^\ast (\bar{y})$</span> is the gradient of the composition <span>$l \circ g$</span> at <span>$x$</span>.</p><p>The consequence is that we can always view the computation performed by reverse-mode AD as computing the gradient of the composition of the function in question and an inner product with the argument to the adjoint.</p><p>The above shows that if <span>$\mathcal{Y} = \RR$</span> and <span>$g$</span> is the function we wish to compute the gradient of, we can simply set <span>$\bar{y} = 1$</span> and compute <span>$D g [x]^\ast (\bar{y})$</span> to obtain the gradient of <span>$g$</span> at <span>$x$</span>.</p><h1 id="Summary"><a class="docs-heading-anchor" href="#Summary">Summary</a><a id="Summary-1"></a><a class="docs-heading-anchor-permalink" href="#Summary" title="Permalink"></a></h1><p>This document explains the core mathematical foundations of AD. It explains separately <em>what</em> it does, and <em>how</em> it goes about it. Some basic examples are given which show how these mathematical foundations can be applied to differentiate functions of matrices, and Julia <code>function</code>s.</p><p>Subsequent sections will build on these foundations, to provide a more general explanation of what AD looks like for a Julia programme.</p><h1 id="Asides"><a class="docs-heading-anchor" href="#Asides">Asides</a><a id="Asides-1"></a><a class="docs-heading-anchor-permalink" href="#Asides" title="Permalink"></a></h1><h3 id="*How*-does-Forwards-Mode-AD-work?"><a class="docs-heading-anchor" href="#*How*-does-Forwards-Mode-AD-work?"><em>How</em> does Forwards-Mode AD work?</a><a id="*How*-does-Forwards-Mode-AD-work?-1"></a><a class="docs-heading-anchor-permalink" href="#*How*-does-Forwards-Mode-AD-work?" 
title="Permalink"></a></h3><p>Forwards-mode AD achieves this by breaking down <span>$f$</span> into the composition <span>$f = f_N \circ \dots \circ f_1$</span>, where each <span>$f_n$</span> is a simple function whose derivative (function) <span>$D f_n [x_n]$</span> we know for any given <span>$x_n$</span>. By the chain rule, we have that</p><p class="math-container">\[D f [x] (\dot{x}) = D f_N [x_N] \circ \dots \circ D f_1 [x_1] (\dot{x})\]</p><p>which suggests the following algorithm:</p><ol><li>let <span>$x_1 = x$</span>, <span>$\dot{x}_1 = \dot{x}$</span>, and <span>$n = 1$</span></li><li>let <span>$\dot{x}_{n+1} = D f_n [x_n] (\dot{x}_n)$</span></li><li>let <span>$x_{n+1} = f_n(x_n)$</span></li><li>let <span>$n = n + 1$</span></li><li>if <span>$n = N+1$</span> then return <span>$\dot{x}_{N+1}$</span>, otherwise go to 2.</li></ol><p>When each function <span>$f_n$</span> maps between Euclidean spaces, the applications of derivatives <span>$D f_n [x_n] (\dot{x}_n)$</span> are given by <span>$J_n \dot{x}_n$</span> where <span>$J_n$</span> is the Jacobian of <span>$f_n$</span> at <span>$x_n$</span>.</p><div class="citation canonical"><dl><dt>[1]</dt><dd><div id="giles2008extended">M. Giles. <em>An extended collection of matrix derivative results for forward and reverse mode automatic differentiation</em>. Unpublished (2008).</div></dd><dt>[2]</dt><dd><div id="minka2000old">T. P. Minka. <em>Old and new matrix algebra useful for statistics</em>. See www. stat. cmu. edu/minka/papers/matrix. html <strong>4</strong> (2000).</div></dd></dl></div><section class="footnotes is-size-7"><ul><li class="footnote" id="footnote-note_for_geometers"><a class="tag is-link" href="#citeref-note_for_geometers">note_for_geometers</a>in AD we only really need to discuss differentiable functions between vector spaces that are isomorphic to Euclidean space. Consequently, a variety of considerations which are usually required in differential geometry are not required here. 
Notably, the tangent space is assumed to be the same everywhere, and to be the same as the domain of the function. Avoiding these additional considerations helps keep the mathematics as simple as possible.</li></ul></section></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../introduction/">« Introduction</a><a class="docs-footer-nextpage" href="../rule_system/">Mooncake.jl&#39;s Rule System »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.8.0 on <span class="colophon-date" title="Thursday 9 January 2025 08:53">Thursday 9 January 2025</span>. Using Julia version 1.11.2.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
\end{align}\]</p><p>from which we conclude that <span>$D g [x]^\ast (\bar{y})$</span> is the gradient of the composition <span>$l \circ g$</span> at <span>$x$</span>.</p><p>The consequence is that we can always view the computation performed by reverse-mode AD as computing the gradient of the composition of the function in question and an inner product with the argument to the adjoint.</p><p>The above shows that if <span>$\mathcal{Y} = \RR$</span> and <span>$g$</span> is the function we wish to compute the gradient of, we can simply set <span>$\bar{y} = 1$</span> and compute <span>$D g [x]^\ast (\bar{y})$</span> to obtain the gradient of <span>$g$</span> at <span>$x$</span>.</p><h1 id="Summary"><a class="docs-heading-anchor" href="#Summary">Summary</a><a id="Summary-1"></a><a class="docs-heading-anchor-permalink" href="#Summary" title="Permalink"></a></h1><p>This document explains the core mathematical foundations of AD. It explains separately <em>what</em> it does, and <em>how</em> it goes about it. Some basic examples are given which show how these mathematical foundations can be applied to differentiate functions of matrices, and Julia <code>function</code>s.</p><p>Subsequent sections will build on these foundations, to provide a more general explanation of what AD looks like for a Julia programme.</p><h1 id="Asides"><a class="docs-heading-anchor" href="#Asides">Asides</a><a id="Asides-1"></a><a class="docs-heading-anchor-permalink" href="#Asides" title="Permalink"></a></h1><h3 id="*How*-does-Forwards-Mode-AD-work?"><a class="docs-heading-anchor" href="#*How*-does-Forwards-Mode-AD-work?"><em>How</em> does Forwards-Mode AD work?</a><a id="*How*-does-Forwards-Mode-AD-work?-1"></a><a class="docs-heading-anchor-permalink" href="#*How*-does-Forwards-Mode-AD-work?" 
title="Permalink"></a></h3><p>Forwards-mode AD achieves this by breaking down <span>$f$</span> into the composition <span>$f = f_N \circ \dots \circ f_1$</span>, where each <span>$f_n$</span> is a simple function whose derivative (function) <span>$D f_n [x_n]$</span> we know for any given <span>$x_n$</span>. By the chain rule, we have that</p><p class="math-container">\[D f [x] (\dot{x}) = D f_N [x_N] \circ \dots \circ D f_1 [x_1] (\dot{x})\]</p><p>which suggests the following algorithm:</p><ol><li>let <span>$x_1 = x$</span>, <span>$\dot{x}_1 = \dot{x}$</span>, and <span>$n = 1$</span></li><li>let <span>$\dot{x}_{n+1} = D f_n [x_n] (\dot{x}_n)$</span></li><li>let <span>$x_{n+1} = f_n(x_n)$</span></li><li>let <span>$n = n + 1$</span></li><li>if <span>$n = N+1$</span> then return <span>$\dot{x}_{N+1}$</span>, otherwise go to 2.</li></ol><p>When each function <span>$f_n$</span> maps between Euclidean spaces, the applications of derivatives <span>$D f_n [x_n] (\dot{x}_n)$</span> are given by <span>$J_n \dot{x}_n$</span> where <span>$J_n$</span> is the Jacobian of <span>$f_n$</span> at <span>$x_n$</span>.</p><div class="citation canonical"><dl><dt>[1]</dt><dd><div id="giles2008extended">M. Giles. <em>An extended collection of matrix derivative results for forward and reverse mode automatic differentiation</em>. Unpublished (2008).</div></dd><dt>[2]</dt><dd><div id="minka2000old">T. P. Minka. <em>Old and new matrix algebra useful for statistics</em>. See www. stat. cmu. edu/minka/papers/matrix. html <strong>4</strong> (2000).</div></dd></dl></div><section class="footnotes is-size-7"><ul><li class="footnote" id="footnote-note_for_geometers"><a class="tag is-link" href="#citeref-note_for_geometers">note_for_geometers</a>in AD we only really need to discuss differentiable functions between vector spaces that are isomorphic to Euclidean space. Consequently, a variety of considerations which are usually required in differential geometry are not required here. 
Notably, the tangent space is assumed to be the same everywhere, and to be the same as the domain of the function. Avoiding these additional considerations helps keep the mathematics as simple as possible.</li></ul></section></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../introduction/">« Introduction</a><a class="docs-footer-nextpage" href="../rule_system/">Mooncake.jl&#39;s Rule System »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.8.0 on <span class="colophon-date" title="Thursday 9 January 2025 10:27">Thursday 9 January 2025</span>. Using Julia version 1.11.2.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
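
The forwards-mode algorithm listed in the "How does Forwards-Mode AD work?" aside of the diff above can be sketched in plain Julia. This is a minimal illustration under stated assumptions, not Mooncake.jl's implementation: `forwards_mode`, `steps`, and the convention of pairing each `f_n` with a function `Df_n(x_n, dx_n)` that applies its derivative are hypothetical names chosen for the example, and step 3 of the algorithm is read as `x_{n+1} = f_n(x_n)`.

```julia
# Hypothetical sketch of the five-step forwards-mode algorithm.
# `steps` is a collection of pairs (f_n, Df_n), where Df_n(x_n, dx_n)
# applies the derivative of f_n at x_n to the tangent dx_n.
function forwards_mode(steps, x, dx)
    for (f_n, Df_n) in steps
        dx = Df_n(x, dx)  # step 2: the tangent update needs x_n, so it runs first
        x = f_n(x)        # step 3: x_{n+1} = f_n(x_n)
    end
    return x, dx          # step 5: value and tangent after the final function
end

# Example: f(x) = sin(x^2), written as f_2 ∘ f_1 with f_1(x) = x^2, f_2(x) = sin(x).
steps = (
    (x -> x^2, (x, dx) -> 2x * dx),
    (sin, (x, dx) -> cos(x) * dx),
)

y, dy = forwards_mode(steps, 3.0, 1.0)
```

With the tangent seed `dx = 1.0`, `dy` is the ordinary derivative of `sin(x^2)` at `x = 3.0`, that is `cos(9.0) * 6.0`. In the scalar case each `Df_n` is multiplication by a 1×1 Jacobian, matching the `J_n \dot{x}_n` remark in the text.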