TemplateExpression slowdown over time relative to Custom Loss Function #793

gm89uk · 2024-12-29T14:17:34Z

gm89uk
Dec 29, 2024

TemplateExpressions gives so much flexibility that in theory you can incorporate a custom loss function within a structure function.
However, for whatever reason, there seems to be a significant slow down after running for several hours compared to running otherwise functionally identical code to a custom loss function.

TemplateExpression code:

using Statistics #for mean
structure = TemplateStructure{(:f,)}(
  ((; f), (x1, x2, x3, x4, x5, x6, y, cat)) -> begin
    o = f(x1, x2, x3, x4, x5, x6)
    if !o.valid
        return ValidVector(Vector{Float32}(undef, length(o.x)), false)
    end
    for varidx in 1:3 #checking monotonicity for first 3 variables
    	grad_column = D(f, varidx)(x1, x2, x3, x4, x5, x6).x
    	if !(all(grad_column .>= 0) || all(grad_column .<= 0))
        	return ValidVector(fill(Float32(1e9), length(o.x)), true)
    	end
    end
    residuals = o - s
    unique_groups = unique(Int.(cat.x))
    mean_residuals_by_group = Dict(group => _mean(residuals.x[cat.x .== group]) for group in unique_groups)
    offsets = [mean_residuals_by_group[cat.x[i]] for i in eachindex(o.x)]
    return ValidVector(o.x .- offsets, o.valid && residuals.valid)
  end
)
model = SRRegressor(
    niterations=1000000,
    binary_operators=[+,-,*,/],
    maxsize=60,
    bumper=true,
    turbo=true,
    populations=18,
    expression_type = TemplateExpression,
    expression_options = (; structure),
    population_size=100,
    parsimony = 0.01,
    batching=true,
)

To achieve the same with a custom loss function:

using Statistics #for mean

function loss_fnc(tree, dataset::Dataset{T,L}, options, idx) where {T,L}    
    # Extract data for the given indices
    X = idx === nothing ? dataset.X : dataset.X[:, idx]
    y = idx === nothing ? dataset.y : view(dataset.y, idx)
    prediction, grad, complete = eval_grad_tree_array(tree, X, options; variable=true)
    if !complete
        return L(Inf)
    end
    if any(row -> !(all(row .>= 0) || all(row .<= 0)), eachrow(grad[1:3, :])) #checking monotonicity for first 3 variables
        return 1e09
    end
    # Calculate residuals
    residuals = prediction .- y
    weights = idx === nothing ? dataset.weights : view(dataset.weights, idx) #this carries the categorial variable
    mean_residuals_by_group = Dict{L, T}()
    unique_groups = unique(weights)
    for group in unique_groups
        group_residuals = residuals[weights .== group]
        mean_residuals_by_group[group] = mean(group_residuals)
    end
    for i in eachindex(y)
        group_type = weights[i]
        prediction[i] -= mean_residuals_by_group[group_type]
    end    

    # Calculate mean squared error (MSE) with adjusted predictions
    mse = mean((prediction .- y).^2)
    
    # Return final loss value (MSE + Penalty)
    return mse

end
model = SRRegressor(
    niterations=1000000,
    binary_operators=[+,-,*,/],
    maxsize=60,
    bumper=true,
    turbo=true,
    populations=18,
    population_size=100
    parsimony = 0.01,
    batching=true,
    loss_function = loss_fnc,
)

Both start with around 50-60 days remaining, which then rises to around 150 for the custom loss function (after 8 hours or so) and falls back to 50 days remaining.

TemplateStructure: After running for 8 hours, the ETA increases to 442 days after initially being 53. Total CPU usage drops to around 17-20%. This has consistently been the case after I tested both codes out over a few weeks.

Key difference in the code:
Custom_loss_function calculates grad for all variables at once and only checks monotonicity for the first 3.
Structure: Uses D() to check the grad for first 3 variables

However, both start off at a similar speed and convergence, (if anything TemplateExpression structure code is quicker initially), TE slows down dramatically. I've set all input variables to float32 to be consistent for both as I initially thought one might be using Float64. Any ideas why the TemplateExpression code starts quickly and then dramatically slows down would be very helpful. Personally I would much prefer to work with TemplateExpression if possible!

Answered by MilesCranmer

Jan 1, 2025

Likely fixed with MilesCranmer/SymbolicRegression.jl#399

View full answer

MilesCranmer · 2024-12-30T04:20:24Z

MilesCranmer
Dec 30, 2024
Maintainer

Thanks for the report! It could be related to the garbage collection issue I reported which should be fixed in Julia 1.11.3: JuliaLang/julia#56759. You could try Julia 1.10 to see if the problem gets better? On 1.10, that garbage collection issue still technically exists, it's just not getting hit as frequently, so the issue will be smaller.

By the way, note that 1e9 is technically a Float64 - so I would make the following fix to the loss function code:

    if any(row -> !(all(row .>= 0) || all(row .<= 0)), eachrow(grad[1:3, :])) #checking monotonicity for first 3 variables
-       return 1e09
+       return 1f09
    end

1f9 is a Float32 so the type stability will be preserved. I'm not sure this will impact things though.

Other tips:

Avoid the extra allocation:

    if !o.valid
-       return ValidVector(Vector{Float32}(undef, length(o.x)), false)
+      return o
    end

Also avoid allocation here:

-   	if !(all(grad_column .>= 0) || all(grad_column .<= 0))
+   	if !(all(g -> g >= 0, grad_column) || all(g -> g <= 0, grad_column))
        	return ValidVector(fill(Float32(1e9), length(o.x)), true)
    	end

(It will scan the >= 0 across the array and exit early, rather than computing >= 0 for the entire array)

0 replies

MilesCranmer · 2024-12-30T04:21:42Z

MilesCranmer
Dec 30, 2024
Maintainer

Ah, one other tricky thing... I realised that turbo=true, bumper=true will not be passed through to the evaluation when using TemplateExpression (I'll need to figure out how to do this... Maybe it requires embedding the turbo and bumper inside the TemplateExpression and ComposableExpressions types). I do know that bumper=true results in improved memory usage (and less garbage collection), so perhaps that is why the TemplateExpression is hitting issues after some number of hours while the custom loss function lasts a lot longer.

Also could explain why TemplateExpression is slower in general (?), simply because it doesn't use turbo evaluation!

0 replies

gm89uk · 2024-12-30T18:28:37Z

gm89uk
Dec 30, 2024
Author

Great, thank you very much for the tips. I've adjusted the code as you recommended and it's very interesting to hear that bumper and turbo potentially have such a cumulative effect on prolonged runs.

0 replies

MilesCranmer · 2025-01-01T10:11:20Z

MilesCranmer
Jan 1, 2025
Maintainer

Likely fixed with MilesCranmer/SymbolicRegression.jl#399

0 replies

gm89uk · 2025-01-05T22:46:39Z

gm89uk
Jan 5, 2025
Author

Thanks Miles,

This is resolved now, and I have TemplateExpressions running even faster than the custom loss function with v1.5.2. Thank you for the fix. I needed to update a lot of packages with ] up.

I also replaced this:

    unique_groups = unique(Int.(cat.x))
    mean_residuals_by_group = Dict(group => _mean(residuals.x[cat.x .== group]) for group in unique_groups)
    offsets = [mean_residuals_by_group[cat.x[i]] for i in eachindex(o.x)]
    return ValidVector(o.x .- offsets, o.valid && residuals.valid)

with:

    prediction = o.x
    mean_residuals_by_group = Dict{Float32,Float32}()
    unique_groups = unique(cat.x)
    for group in unique_groups
        group_residuals = residuals[cat.x .== group]
        mean_residuals_by_group[group] = mean(group_residuals)
    end
    for i in eachindex(prediction)
        group_type = cat.x[i]
        prediction[i] -= mean_residuals_by_group[group_type]
    end  
    return ValidVector(prediction, true) #o already checked before

This made it much faster (unsure why).
Also I realised D(...) gave me the flexibility to only check monotonicity exactly for the variable that was likely to lose it at higher complexity (which turned out to be faster than a single variable gradient check with eval_diff_tree_array in the custom loss function.

both those bits of code will be redundant with templateparametric expressions anyway.

Thanks again for your help and rapid turnaround for including bumper and turbo!

1 reply

MilesCranmer Jan 5, 2025
Maintainer

Awesome!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TemplateExpression slowdown over time relative to Custom Loss Function #793

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Replies: 5 comments 1 reply

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{title}}

{{title}}

Select a reply

TemplateExpression slowdown over time relative to Custom Loss Function #793

gm89uk Dec 29, 2024

Replies: 5 comments · 1 reply

MilesCranmer Dec 30, 2024 Maintainer

MilesCranmer Dec 30, 2024 Maintainer

gm89uk Dec 30, 2024 Author

MilesCranmer Jan 1, 2025 Maintainer

gm89uk Jan 5, 2025 Author

MilesCranmer Jan 5, 2025 Maintainer

gm89uk
Dec 29, 2024

Replies: 5 comments 1 reply

MilesCranmer
Dec 30, 2024
Maintainer

MilesCranmer
Dec 30, 2024
Maintainer

gm89uk
Dec 30, 2024
Author

MilesCranmer
Jan 1, 2025
Maintainer

gm89uk
Jan 5, 2025
Author

MilesCranmer Jan 5, 2025
Maintainer