
WIP: Refactor inference modules #143

Draft · kaiserls wants to merge 60 commits into main

Conversation

@kaiserls (Collaborator) commented Nov 14, 2024

Description

This pull request addresses architectural limitations highlighted in issues #134 and #42. While the current design is effective for a stable prototype and generating results, it lacks flexibility for future extensions. Building on the refactoring efforts in #141 and #142, this PR restructures the framework to allow more flexible and modular usage.

Key Changes

  1. New Subpackages:
  • evaluation: Handles kernel density estimation and the mapping of densities between data and parameter spaces.
  • inferences: Manages different approaches to selecting points for density evaluation:
    • sampling: Dynamically generates points based on the density of previous points.
    • grids: Uses a predefined set of points, enabling straightforward operations like chunking for multiprocessing.
  • samplers: Encapsulates dependencies on specific samplers for sampling-based inference.
  2. Implementation of the Dependency Inversion Principle:
  • The dense_grid no longer depends on InferenceTypes or calc_kernel_width. Instead, it relies on the abstract KDE class, which serves as a callable interface.
  • The evaluate_density function no longer requires direct access to data, kernel_width, or the hardcoded eval_gauss_kde function. Instead, it operates via the KDE interface.
  • Enums such as InferenceType and DenseGridType are no longer used globally but only locally, in inference and dense_grid respectively.
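To make the dependency inversion concrete, here is a minimal sketch of what a callable KDE interface could look like. GaussKDE and this version of evaluate_density are illustrative stand-ins, not the package's actual implementations:

```python
# Sketch of a callable KDE interface (illustrative, not the actual eulerpi API).
from abc import ABC, abstractmethod

import numpy as np


class KDE(ABC):
    """Abstract callable interface: maps a data-space point to a density estimate."""

    @abstractmethod
    def __call__(self, x: np.ndarray) -> float: ...


class GaussKDE(KDE):
    """Toy Gaussian KDE; a stand-in for a concrete implementation."""

    def __init__(self, data: np.ndarray, kernel_width: float):
        self.data = data  # shape (num_samples, data_dim)
        self.kernel_width = kernel_width

    def __call__(self, x: np.ndarray) -> float:
        diffs = (self.data - x) / self.kernel_width
        kernels = np.exp(-0.5 * np.sum(diffs**2, axis=1))
        dim = self.data.shape[1]
        norm = (2 * np.pi) ** (dim / 2) * self.kernel_width**dim
        return float(np.mean(kernels) / norm)


def evaluate_density(point: np.ndarray, kde: KDE) -> float:
    # Depends only on the abstract interface, not on data or kernel_width.
    return kde(point)
```

Any other density estimator can then be swapped in by subclassing KDE, without touching evaluate_density or the grid code.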

This pull request fixes #134 and allows #42 to be implemented quickly.

The dependency graph of eulerpi before the refactoring at commit 627ecab. Many of the dependencies are not visible because this graph contains only modules, not classes.
[image: eulerpi_old]
The current dependency graph; the larger number of nodes results from splitting modules with multiple classes into separate files.
[image: eulerpi]
An overview of the external dependencies:
[image: eulerpi_external]

Question

  • Do we want to / should we support evaluating multiple slices at once, directly?
  • Should we reintroduce num_runs in the sampler? The reason for the sub_runs was to have some fault tolerance. I am not sure whether our package should take care of that or whether the details should be handled by the sampler. The user can still loop over the sampling by using continue_sampling.
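For illustration, the user-side fault-tolerance loop mentioned above could look roughly like this. The function name, the continue_sampling callable, and its num_steps parameter are all hypothetical here, not the package's actual API:

```python
# Hypothetical sketch of user-side fault tolerance via repeated sub-runs.
# continue_sampling and its signature are assumptions, not the actual eulerpi API.
def run_with_sub_runs(continue_sampling, num_sub_runs: int, steps_per_run: int) -> int:
    """Split one long sampling run into sub-runs so a crash loses at most one chunk.

    Returns the number of sub-runs that completed successfully.
    """
    completed = 0
    for _ in range(num_sub_runs):
        try:
            continue_sampling(num_steps=steps_per_run)
            completed += 1
        except RuntimeError:
            # A failed sub-run is skipped; results of earlier sub-runs
            # are assumed to be persisted already (e.g. by a result manager).
            continue
    return completed
```

This keeps the fault-tolerance policy in user code, which is the trade-off the question is about.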

Help wanted

  • Update / Rewrite the very complex result manager
  • Update the usage of the result manager

TODOs:

  • Finish the refactoring:
    • Simplify and unify the different inference functions
    • Bring the dependency on the concrete KDE and Grid to the inference call.
  • Fix the documentation
  • Check for performance regression
  • Write the changelog

Type of change

  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe at least one test that you ran to verify your changes.

  • Local tests

Checklist For Contributor

  • My code follows the style guidelines of this project (I installed pre-commit)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I wrote my changes in the changelog in the [unreleased] section
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • New and existing unit tests pass locally with my changes

Checklist For Maintainers

  • Continuous Integration (CI) is successfully running
  • Do we want to release/tag a new version? [✔️ / ❌]
    • If yes, add a release to the changelog and set the new version in pyproject.toml. After the merge, tag the release!

@castellinilinguini (Collaborator) commented Nov 18, 2024

I believe this is a great idea, this PR should greatly improve readability, maintainability & separation of concerns. Just to make sure I got this right: In the end, dense grids, sparse grids, and MCMC samplers would be implemented in the sampling submodule, right?

Edit: Nevermind, I just saw you did two different submodules for that, which makes sense ofc!

@kaiserls kaiserls changed the title Refactor inference modules WIP: Refactor inference modules Nov 19, 2024
@kaiserls kaiserls added enhancement New feature or request help wanted Extra attention is needed question Further information is requested and removed help wanted Extra attention is needed labels Nov 22, 2024
@kaiserls kaiserls mentioned this pull request Dec 11, 2024
12 tasks
Comment on lines 59 to 84

    def combined_evaluation(param, model, data_transformation, kde, slice):
        param, sim_res, logdensity = evaluate_log_density(
            param, model, data_transformation, kde, slice
        )
        combined_result = np.concatenate([param, sim_res, np.array([logdensity])])
        return logdensity, combined_result


    def get_LogDensityEvaluator(
        model: BaseModel,
        data_transformation: DataTransformation,
        kde: KDE,
        slice: np.ndarray,
    ):
        logdensity_blob_function = partial(
            combined_evaluation,
            model=model,
            data_transformation=data_transformation,
            kde=kde,
            slice=slice,
        )

        param_dim = slice.shape[0]
        output_dim = (1, param_dim + model.data_dim + 1)
        return FunctionWithDimensions(
            logdensity_blob_function, param_dim, output_dim
        )

kaiserls (Collaborator, Author):

Discussion Result: Refactor as a class with name LogDensityEvaluator(FunctionWithDimensions)
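The agreed refactor could look roughly like the following sketch. FunctionWithDimensions and evaluate_log_density are minimal stand-ins here, since only their call sites are visible in the diff:

```python
# Sketch of the class-based refactor agreed above.
# FunctionWithDimensions and evaluate_log_density are toy stand-ins,
# since only their call sites appear in the diff.
import numpy as np


class FunctionWithDimensions:
    """Minimal stand-in: a callable bundled with its input/output dimensions."""

    def __init__(self, function, param_dim, output_dim):
        self.function = function
        self.param_dim = param_dim
        self.output_dim = output_dim

    def __call__(self, param):
        return self.function(param)


def evaluate_log_density(param, model, data_transformation, kde, slice):
    """Toy stand-in for the real evaluation function in the diff."""
    sim_res = np.zeros(model.data_dim)
    return param, sim_res, 0.0


class LogDensityEvaluator(FunctionWithDimensions):
    """Class replacing combined_evaluation + get_LogDensityEvaluator."""

    def __init__(self, model, data_transformation, kde, slice):
        self.model = model
        self.data_transformation = data_transformation
        self.kde = kde
        self.slice = slice
        param_dim = slice.shape[0]
        output_dim = (1, param_dim + model.data_dim + 1)
        super().__init__(self._evaluate, param_dim, output_dim)

    def _evaluate(self, param):
        # Same body as combined_evaluation in the diff, using stored state.
        param, sim_res, logdensity = evaluate_log_density(
            param, self.model, self.data_transformation, self.kde, self.slice
        )
        blob = np.concatenate([param, sim_res, np.array([logdensity])])
        return logdensity, blob
```

This removes the partial() plumbing and keeps the dimension bookkeeping next to the state it depends on.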

Comment on lines 20 to 46

    def combined_evaluation(param, model, data_transformation, kde, slice):
        param, sim_res, density = evaluate_density(
            param, model, data_transformation, kde, slice
        )
        combined_result = np.concatenate([param, sim_res, np.array([density])])
        return combined_result


    def get_DensityEvaluator(
        model: BaseModel,
        data_transformation: DataTransformation,
        kde: KDE,
        slice: np.ndarray,
    ):
        combined_evaluation_function = partial(
            combined_evaluation,
            model=model,
            data_transformation=data_transformation,
            kde=kde,
            slice=slice,
        )

        param_dim = slice.shape[0]
        output_dim = param_dim + model.data_dim + 1
        return FunctionWithDimensions(
            combined_evaluation_function, param_dim, output_dim
        )

kaiserls (Collaborator, Author):

Discussion Result: Refactor as a class with name DensityEvaluator(FunctionWithDimensions)

kaiserls (Collaborator, Author):
Discussion Result: Offer a functional interface for inference, but use an object-oriented approach internally
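The agreed pattern (functional facade over an object-oriented core) could be sketched as follows; all names here besides the general idea are hypothetical, not the package's actual API:

```python
# Hypothetical sketch: a functional entry point wrapping an internal class.
from dataclasses import dataclass, field


@dataclass
class _SamplingInference:
    """Internal object that owns the inference state; users never construct it."""

    num_steps: int = 100
    results: list = field(default_factory=list)

    def run(self, logdensity):
        # Placeholder for the real sampling loop.
        self.results = [logdensity(i) for i in range(self.num_steps)]
        return self.results


def inference(logdensity, num_steps: int = 100):
    """Public functional interface: wires up the internal object and runs it."""
    return _SamplingInference(num_steps=num_steps).run(logdensity)
```

Users keep the simple one-call API, while the internal class can hold state (samplers, result managers) and be extended without breaking that API.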

Comment on lines 101 to 113

    def sampling_inference(
        model: BaseModel,
        data_transformation: DataTransformation,
        kde: KDE,
        slice: np.ndarray,
        result_manager: ResultManager,
        num_processes: int,
        sampler: Sampler = None,
        num_walkers: int = 10,
        num_steps: int = 2500,
        num_burn_in_samples: Optional[int] = None,
        thinning_factor: Optional[int] = None,
        get_walker_acceptance: bool = False,

kaiserls (Collaborator, Author):

Discussion Result: Reintroduce subruns

@kaiserls kaiserls removed the enhancement New feature or request label Jan 8, 2025
@kaiserls kaiserls added enhancement New feature or request and removed question Further information is requested labels Jan 9, 2025
Labels: enhancement (New feature or request)
Successfully merging this pull request may close these issues.

Other density estimation possibilities