-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Named tuples for parameters #42
Comments
Internally to the algorithms, as you say we treat parameter instances as columns in a single ensemble matrix. This is because once we first initialize prior mean and covariances, and have an initial ensemble, we work with Gaussian methods (completely determined by this information). Therefore everything is treated as vectors in R^d and actions are matrix-vector type operations. We have
In general applications parameters may be defined as multidimensional too e.g with values I'm sure the user interface could be cleaner as I certainly didnt optimize it. But internally I felt it was quite robust and doesnt require more functionality than Ensemble Kalman methods require (e.g matrices and vectors with consistently enforced dimensions) |
I think what I'm asking is what methods need to be defined for a parameter type. We can relax the type restrictions in the struct definitions, and design an interface based on methods. Then many types could be used by savvy users to store parameters, provided those types define the right methods. Flexible parameter types might be useful for complicated loss functions (such as ours). |
I'll experiment a bit and open a PR that will hopefully illustrate what I'm talking about! |
Just to provide a simple example, broadcasting works for any subtype of |
I think what I am saying is that there are many julia types that may be used to represent "vectors in R^d". I think it would help to write the software in a way that supports many vector-like types --- not just struct SquaresVector <: AbstractArray{Int, 1}
count::Int
end
Base.size(S::SquaresVector) = (S.count,)
Base.IndexStyle(::Type{<:SquaresVector}) = IndexLinear()
Base.getindex(S::SquaresVector, i::Int) = i*i Note that julia> s = SquaresVector(4)
4-element SquaresVector:
1
4
9
16
julia> s[s .> 8]
2-element Vector{Int64}:
9
16
julia> s + s
4-element Vector{Int64}:
2
8
18
32
julia> sin.(s)
4-element Vector{Float64}:
0.8414709848078965
-0.7568024953079282
0.4121184852417566
-0.2879033166650653 This shows why we don't want to overrestrict types in struct definitions --- allowing flexible types generates the possibility for powerful behaviors. "Alternative array types" like Essentially, we almost never should restrict the types of objects, except in very special cases (which probably do not occur in our software). |
@glwagner We should move away from deleting the typing in arguments or objects of structs. It adds flexibility at the cost of interpretability of the codebase. What we should do is substitute the types by more general (possibly abstract) types. |
I think changing the typing to more abstract types would really benefit the codebase. However, unit testing becomes slightly more difficult (there are a lot of corner cases). I am not familiar with this kind of typing in Julia, but we should implement types similar to those in the |
You cannot capture what @glwagner is promoting with abstract types because the code used here is a composition of required type traits / interfaces (ex. generic iteration, get/set index, etc.). No one abstract type hierarchy will necessarily capture the composition of traits that the package needs. In general that is why you rely on dynamic polymorphic behavior (ex. no contraint on types) and raise error messages when a particular type instance does not implement some method. This makes the code compositional and easier to extend. Because Julia doesn't have explicit declaration of type traits you rely on implicit runtime behavior (like all dynamic languages), comments, and a good test suite. |
I am not sure I understand why types cannot capture everything we need. If you need to say that an argument allows generic iteration, it should be an iterable. If it needs to be used as a dict key, it is a hashable. An example of how this is possible, while maintaining extreme generality of the inputs in a dynamically typed language is python's xarray, see some of the function handles here, for instance. Typing helps a lot with debugging, which is kind of the weakest point of Julia. It's either that or adding asserts in every function. |
For applications with large numbers of parameters input into complex models, it's useful (even necessary) to use objects like
NamedTuple
that associate parameter names with values to hold / transport lists of parameters.However, as far as I can tell, parameters are constrained to be vectors. Throughout the code in fact, many data structures seem constrained to
Array
orVector
. Eg, this code does not work:and I get
Is it necessary to constrain parameters to type
Vector
? It seems like it should be possible to generically support any iterable object (or objects that support broadcasting?) In the case that we need to performLinearAlgebra
operations, we can constructor the appropriate matrices or vectors from iterable containers as needed:Allocations could even be minimized, though I think this is usually unnecessary because typical applications of
EnsembleKalmanProcesses.jl
have computations dominated by the cost of evaluating a loss function.I'm happy to get to work on refactoring the internals to support more general parameter containers if someone can identify the minimal requirements for the parameters object. It's tough to figure out what methods would have to be defined without detailed knowledge of the code because the
struct
definitions are over-constrained, preventing experimentation.The text was updated successfully, but these errors were encountered: