Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Excessive allocations for derivatives layered media in spatial domain #21

Open
heltonmc opened this issue Mar 8, 2022 · 3 comments
Open

Comments

@heltonmc
Copy link
Owner

heltonmc commented Mar 8, 2022

The number of allocations is dependent on the variable you take the derivative with respect to. It is ok with respect to z so the flux calculations are not affected by this however for gradient calculations this is majorly affected... This could be a regression from when we switched over to tuples but unsure

julia> @benchmark fluence_DA_Nlay_cylinder_CW(0.1, (0.1, 0.1), (10.0, 10.0), 1.0, (1.0, 1.0), (1.0, 9.0), 10.0, 0.0, 1000)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  101.500 μs  322.333 μs  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     105.875 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   105.726 μs ±   5.836 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▂▁▇▇▆      ▅▅▄█▇▄▂▁▁▁▁▂▃▃▂▂▂▃▂▁                               ▂
  ██████▇▆▇▇███████████████████████▆▇▆▆▇▅▅▆▅▇▆▇▅▇▆▅▆▅▅▃▆▃▁▆▅▅▄▅ █
  102 μs        Histogram: log(frequency) by time        120 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark ForwardDiff.derivative(dz -> fluence_DA_Nlay_cylinder_CW(0.1, (0.1, 0.1), (10.0, 10.0), 1.0, (1.0, 1.0), (1.0, 9.0), 10.0, dz, 1000), 0.0)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  103.500 μs  193.750 μs  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     104.209 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   104.654 μs ±   2.356 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

     ▃█▇▇▆▅▂                ▃▂▂▁                                ▂
  █▄▁███████▇▆▆▄▄▃▃▆▆▆▆▄▅▄▁██████▇▄▃▆▆▅▄▆▆▇█▇▆▆▅▄▄▄▄▃▁▃▃▄▃▄▃▄▅▅ █
  104 μs        Histogram: log(frequency) by time        111 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark ForwardDiff.derivative(dz -> fluence_DA_Nlay_cylinder_CW(0.1, (dz, 0.1), (10.0, 10.0), 1.0, (1.0, 1.0), (1.0, 9.0), 10.0, 0.0, 1000), 0.1)
BenchmarkTools.Trial: 7685 samples with 1 evaluation.
 Range (min  max):  593.875 μs    3.413 ms  ┊ GC (min  max): 0.00%  79.90%
 Time  (median):     607.083 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   649.008 μs ± 304.502 μs  ┊ GC (mean ± σ):  5.96% ± 10.01%

  █▃                                                            ▁
  ██▇▆▄▃▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅▅█ █
  594 μs        Histogram: log(frequency) by time       3.02 ms <

 Memory estimate: 765.67 KiB, allocs estimate: 29002.
@heltonmc
Copy link
Owner Author

heltonmc commented Mar 8, 2022

The problem seems to come within the kernel function but not green...

julia> @benchmark ForwardDiff.derivative(dz->LightPropagation._green_Nlaycylin_top(1.0, (0.1, 0.1), (0.4, 0.4), dz, 0.1, (1.2, 1.2), (1.0, 1.0), (1.0, 1.0), 2), 0.0)
BenchmarkTools.Trial: 10000 samples with 981 evaluations.
 Range (min  max):  61.714 ns  99.388 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     62.011 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   62.243 ns ±  1.413 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▁▃▅█▆▄▆▄▁                             ▁▂                    ▂
  █████████▄▃▃▄▄▁▃▄▁▃▃▃▃▄▃▄▃▄▁▁▁▄▅▄▄▃▃▅▇██▇▇█▅▃▃▄▁▃▃▁▄▁▁▄▄▄▅▅ █
  61.7 ns      Histogram: log(frequency) by time      66.3 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark ForwardDiff.derivative(dz->LightPropagation._green_Nlaycylin_top(1.0, (dz, 0.1), (0.4, 0.4), 0.0, 0.1, (1.2, 1.2), (1.0, 1.0), (1.0, 1.0), 2), 0.1)
BenchmarkTools.Trial: 10000 samples with 984 evaluations.
 Range (min  max):  58.604 ns  154.726 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     58.943 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   59.409 ns ±   2.693 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▇█▃         ▁▃                                               ▁
  ███▇▆▅▆▆▆▇▆▆███▆▄▆▇▇▇▆▆▅▅▄▁▄▄▁▁▄▃▄▄▅▄▄▅▅▅▅▅▆▅▄▅▅▆▆▅▄▄▄▅▅▅▄▆▅ █
  58.6 ns       Histogram: log(frequency) by time      72.7 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark ForwardDiff.derivative(dz->LightPropagation._kernel_fluence_DA_Nlay_cylinder(1.0, (0.5, 0.5), (0.1, 0.1), 20.0, (0.2, 0.2), dz, 0.1, (1.0, 9.0), (1.0, 1.0), J0_ROOTS[1:1000], LightPropagation._green_Nlaycylin_top, 2), 0.0)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  110.875 μs   1.196 ms  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     113.875 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   115.214 μs ± 13.444 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

   ▂▂▄    ▅█      ▁▃                                            
  ▂███▄▂▂▅███▄▃▂▂▆██▆▃▃▂▃▂▃▃▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  111 μs          Histogram: frequency by time          129 μs <

 Memory estimate: 7.94 KiB, allocs estimate: 1.

julia> @benchmark ForwardDiff.derivative(dz->LightPropagation._kernel_fluence_DA_Nlay_cylinder(1.0, (0.5, 0.5), (dz, 0.1), 20.0, (0.2, 0.2), 0.0, 0.1, (1.0, 9.0), (1.0, 1.0), J0_ROOTS[1:1000], LightPropagation._green_Nlaycylin_top, 2), 0.1)
BenchmarkTools.Trial: 7622 samples with 1 evaluation.
 Range (min  max):  596.375 μs    3.752 ms  ┊ GC (min  max): 0.00%  82.18%
 Time  (median):     609.312 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   654.571 μs ± 330.757 μs  ┊ GC (mean ± σ):  6.39% ± 10.12%

  █▃                                                            ▁
  ██▇▅▄▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅▅█ █
  596 μs        Histogram: log(frequency) by time       3.24 ms <

 Memory estimate: 773.62 KiB, allocs estimate: 29004.

@heltonmc
Copy link
Owner Author

heltonmc commented Mar 8, 2022

Culprit function attached... probably due to the preallocation of phi with eltype of mua

function _kernel_fluence_DA_Nlay_cylinder::AbstractFloat, D, μa, a, zb, z, z0, l, n_med, besselroots, green, N)
ϕ = zero(eltype(μa))
ϕ_tmp = zero(eltype(μa))
apzb = inv(a + zb[1])
@inbounds for ind in eachindex(besselroots)
ϕ_tmp = green(besselroots[ind] * apzb, μa, D, z, z0, zb, l, n_med, N)
ϕ_tmp *= besselj0(besselroots[ind] * apzb * ρ)
ϕ_tmp /= J1_J0ROOTS_2[ind] # replaces (besselj1(besselroots[ind]))^2
ϕ += ϕ_tmp
end
return ϕ /* (a + zb[1])^2)
end

probably no reason to preallocate phi_tmp here

Edit: It is not from the preallocation though that does create a type instability

@heltonmc
Copy link
Owner Author

heltonmc commented Mar 9, 2022

Specifically its this line

 ϕ_tmp = green(besselroots[ind] * apzb, μa, D, z, z0, zb, l, n_med, N) 

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant