run multiple simulations in // from python #64

Open
wants to merge 1 commit into base: master
Conversation


@PhilipDeegan PhilipDeegan commented Nov 20, 2022

Summary by CodeRabbit

New Features:

  • Added a Simulators class to manage multiple simulations, supporting parallel execution and shared memory communication.
  • Introduced a finished() method in the simulator class to check if a simulation has completed.
  • Added a compare method in the FieldData class for data comparison and a meshgrid method for generating a meshgrid.
  • Enhanced GetAllAvailableQties method to handle exceptions and ensure continued execution.

Refactor:

  • Modified cpp_lib function to cache the imported module, improving performance.
  • Updated all_assert_sorted function in particles.py for better precision control.
  • Adjusted the getHierarchy function across multiple test files to include an additional parameter ndim.
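The module-caching refactor mentioned for `cpp_lib` can be sketched roughly as below. This is an illustration only: the `_globals` name comes from the summary, but the default module path and function body are assumptions, not the PR's actual code in `pyphare/pyphare/cpp/__init__.py`.

```python
import importlib

# cache shared across calls; the real module keeps similar state
_globals = dict(module=None)

def cpp_lib(module_name="pybindlibs.cpp"):  # module path assumed for illustration
    # import once, then reuse the cached module on every later call
    if _globals["module"] is None:
        _globals["module"] = importlib.import_module(module_name)
    return _globals["module"]
```

Repeated calls then return the same module object instead of re-resolving the import each time.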

Tests:

  • Renamed test classes for better clarity between 1D and 2D tests.
  • Added a unique_diag_dir_for_test_case method for generating unique diagnostic directory paths.

Chores:

  • Added scripts for building and comparing different branches or commits.

@PhilipDeegan force-pushed the simulators branch 2 times, most recently from 2f5d825 to 1dc320f on November 27, 2022 22:06
@PhilipDeegan force-pushed the simulators branch 2 times, most recently from 017173f to 50d8b6f on December 18, 2022 21:48
coderabbitai bot commented Oct 31, 2023

Walkthrough

The PyPHARE codebase has undergone several enhancements, including improvements to simulation management, comparison features, and build processes. Notably, the Simulator and Simulators classes have been updated with new methods for handling simulation states and shared memory. Additionally, the FieldData class now supports comparison operations, and particle velocity comparisons have been refined with new tolerance parameters. The build scripts have been improved to better handle branches and commits, and the build process now optionally uses Ninja. Furthermore, global variables have been introduced in test scripts to standardize simulation parameters.

Changes

| File(s) | Change Summary |
| --- | --- |
| .../cpp/__init__.py | Added `_globals` for module caching; updated `cpp_lib` function for the module loading message. |
| .../pharesee/hierarchy.py | Added `compare` and `__eq__` methods to `FieldData`; added `hierarchy_compare` function. |
| .../pharesee/particles.py | Replaced `atol` with `rtol` and `atol` in velocity comparison. |
| .../pharesee/run.py | Updated `GetAllAvailableQties` method for broader exception handling and loop usage. |
| .../simulator/simulator.py | Added `finished` method to `Simulator`. |
| .../simulator/simulators.py | Added `SharedSimulationStateEnum`, shared memory management, environment/path functions; introduced `Simulators` class. |
| tests/functional/alfven_wave/alfven_wave1d.py | Added global variables; modified `Simulation` function; added `init` to protons. |
| tools/build_compare_2_branches.py, tools/build_compare_top_2_commits.py | Improved build and simulation comparison scripts with new imports, parameter settings, and function calls. |
| tools/python3/cmake.py | Updated build functions to use Ninja and removed the `use_ninja` parameter. |
| tools/python3/git.py | Added `git_branch_reset_at_exit` function for branch cleanup. |
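The shared-memory coordination added in `simulators.py` (seen later in this review via `atomic_set` and `SharedSimulationStateEnum`) can be illustrated with a minimal sketch. The slot layout, names, and `atomic_set` signature here are assumptions for illustration, not the PR's actual code:

```python
from multiprocessing import shared_memory

# assumed per-simulation byte slots; the real enum may differ
CAN_ADVANCE, IS_BUSY, CURRENT_TIME = 0, 1, 2
SLOTS = 3

def atomic_set(sim_idx, shm_name, slot, value):
    # a single-byte write into the shared segment lets the coordinating
    # process flip a flag that simulation processes poll, without a lock
    shm = shared_memory.SharedMemory(name=shm_name)
    try:
        shm.buf[sim_idx * SLOTS + slot] = value
    finally:
        shm.close()  # detach this handle; the segment itself stays alive

# the coordinator creates one segment sized for all simulations
shm = shared_memory.SharedMemory(create=True, size=SLOTS * 4)  # e.g. 4 sims
atomic_set(0, shm.name, CAN_ADVANCE, 1)
assert shm.buf[0] == 1
shm.close()
shm.unlink()
```

Each simulation process attaches to the segment by name, so the flags behave as a tiny cross-process state machine.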




coderabbitai bot left a comment

Review Status

Actionable comments generated: 25

Configuration used: CodeRabbit UI

Commits reviewed: files that changed from the base of the PR, between 4bd2ae3 and 3e1349f.
Files selected for processing (24)
  • pyphare/pyphare/cpp/__init__.py (1 hunks)
  • pyphare/pyphare/pharesee/hierarchy.py (2 hunks)
  • pyphare/pyphare/pharesee/particles.py (1 hunks)
  • pyphare/pyphare/pharesee/run.py (1 hunks)
  • pyphare/pyphare/simulator/simulator.py (1 hunks)
  • pyphare/pyphare/simulator/simulators.py (1 hunks)
  • tests/functional/alfven_wave/alfven_wave1d.py (4 hunks)
  • tests/simulator/__init__.py (1 hunks)
  • tests/simulator/advance/test_fields_advance_1d.py (2 hunks)
  • tests/simulator/advance/test_fields_advance_2d.py (2 hunks)
  • tests/simulator/initialize/test_fields_init_1d.py (1 hunks)
  • tests/simulator/initialize/test_fields_init_2d.py (3 hunks)
  • tests/simulator/initialize/test_particles_init_1d.py (3 hunks)
  • tests/simulator/initialize/test_particles_init_2d.py (2 hunks)
  • tests/simulator/refinement/test_2d_10_core.py (1 hunks)
  • tests/simulator/test_advance.py (7 hunks)
  • tests/simulator/test_initialization.py (10 hunks)
  • tests/simulator/test_restarts.py (6 hunks)
  • tools/build_compare_2_branches.py (1 hunks)
  • tools/build_compare_2_branches.sh (1 hunks)
  • tools/build_compare_top_2_commits.py (1 hunks)
  • tools/build_compare_top_2_commits.sh (1 hunks)
  • tools/python3/cmake.py (1 hunks)
  • tools/python3/git.py (2 hunks)
Files skipped from review due to trivial changes (7)
  • pyphare/pyphare/cpp/__init__.py
  • pyphare/pyphare/simulator/simulator.py
  • tests/simulator/initialize/test_fields_init_1d.py
  • tests/simulator/initialize/test_particles_init_2d.py
  • tools/build_compare_2_branches.sh
  • tools/build_compare_top_2_commits.py
  • tools/build_compare_top_2_commits.sh
Additional comments: 44
pyphare/pyphare/pharesee/particles.py (1)
  • 194-201: The changes made to the all_assert_sorted function seem to be reasonable. The addition of the rtol parameter to the np.testing.assert_allclose calls for part1.deltas and part2.deltas with a value of 0 ensures that the relative tolerance between the two arrays is 0, meaning they must be exactly equal within the absolute tolerance. The change of the atol parameter for the np.testing.assert_allclose calls for part1.v and part2.v to 1e-12 makes the absolute tolerance stricter, which could be necessary depending on the precision required in your application. However, it's important to ensure that these changes don't cause any unintended side effects in your tests.
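To make the tolerance semantics concrete: with `rtol=0`, the `assert_allclose` criterion reduces to `|actual - desired| <= atol`, i.e. a purely absolute check.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])

# a tiny offset within atol=1e-12 passes when rtol=0
np.testing.assert_allclose(a, a + 5e-13, rtol=0, atol=1e-12)

# a larger offset fails the strict absolute check
try:
    np.testing.assert_allclose(a, a + 1e-6, rtol=0, atol=1e-12)
except AssertionError:
    print("difference of 1e-6 exceeds atol=1e-12")
```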
tests/simulator/__init__.py (3)
  • 240-241: The run method has been modified to set the _outcome attribute and then call the parent class's run method. This change seems to be fine, but ensure that the _outcome attribute is used correctly elsewhere in the code.

  • 243-247: The unique_diag_dir_for_test_case method has been added. This method generates a unique diagnostic directory path. This is a good practice for avoiding conflicts when running multiple tests in parallel.

  • 240-251: > Note: This review was outside of the patch, so it was mapped to the patch with the greatest overlap. Original lines [249-262]

The clean_up_diags_dirs method has been added. This method deletes the diagnostic directories after the tests have run. This is a good practice for cleaning up resources after tests. However, it's important to ensure that the directories are not needed for any post-test analysis or debugging.
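As a rough illustration of the per-test-case directory idea, a helper like this builds a distinct path from the test parameters; the actual `unique_diag_dir_for_test_case` signature and layout in `tests/simulator/__init__.py` may differ.

```python
from pathlib import Path

def unique_diag_dir_for_test_case(base_path, ndim, interp, post_path=None):
    # encoding dimension and interpolation order in the path keeps
    # concurrently running test cases from writing diagnostics into
    # the same directory
    p = Path(base_path) / str(ndim) / str(interp)
    return p / post_path if post_path else p
```

For example, `unique_diag_dir_for_test_case("phare_outputs/restarts", 2, 1)` yields `phare_outputs/restarts/2/1`.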

tests/simulator/initialize/test_particles_init_1d.py (3)
  • 26-32: The class name change from InitializationTest to Initialization1DTest is appropriate as it better reflects the purpose of the class. Ensure that all references to this class in the codebase have been updated.

  • 45-54: The ndim parameter has been removed from the getHierarchy function call. This change is acceptable if the getHierarchy function no longer requires this parameter. Ensure that the function definition has been updated accordingly and that all calls to this function throughout the codebase have been updated to match the new signature.

  • 85-98: The ndim parameter has been removed from the getHierarchy function call. This change is acceptable if the getHierarchy function no longer requires this parameter. Ensure that the function definition has been updated accordingly and that all calls to this function throughout the codebase have been updated to match the new signature.

tests/functional/alfven_wave/alfven_wave1d.py (1)
  • 37-44: The Simulation instantiation has been updated to use the global variables TIME_STEP_NBR and TIMESTEP. Ensure that these changes do not affect the behavior of the simulation. Also, consider adding comments to explain the purpose of these variables.

  • 128: The config function now returns gv.sim. Ensure that all calls to this function throughout the codebase have been updated to handle the returned value.

pyphare/pyphare/pharesee/run.py (1)
  • 339-343: The loop is trying to call each function with the time argument. If any of these function calls fail, it simply passes and continues with the next function. This could potentially hide errors and lead to unexpected behavior. If these functions are expected to raise specific exceptions under normal circumstances, those exceptions should be caught and handled explicitly. If they're not expected to raise exceptions, then allowing the exception to propagate might be the correct behavior.
tests/simulator/initialize/test_fields_init_2d.py (3)
  • 5-11: The import of numpy and matplotlib is fine. However, ensure that matplotlib is used in the code, otherwise, it's an unnecessary import.

  • 19-25: The renaming of the class InitializationTest to Initialization2DTest is fine. Ensure that all references to this class in the codebase have been updated.

  • 37-43: The modification of the method test_density_decreases_as_1overSqrtN to accept an additional argument cells is fine. Ensure that all calls to this method throughout the codebase have been updated to match the new signature.

tools/python3/cmake.py (3)
  • 13-26: The function make_config_str has been modified to use the global variable _SAMRAI_DIR. Ensure that this change does not affect other parts of the code that call this function.

  • 28-32: The function config has been modified to use the global variable _SAMRAI_DIR. Ensure that this change does not affect other parts of the code that call this function.

  • 35-36: The function build has been modified to use the global variable _USE_NINJA. Ensure that this change does not affect other parts of the code that call this function.

tools/python3/git.py (3)
  • 2-2: The atexit module is imported to register exit handlers. This is a good practice for cleaning up resources or resetting states when the script exits.

  • 32-37: The run() function calls now include the check=True argument. This means that if the subprocess call returns a non-zero exit code, a CalledProcessError will be raised. This is a good practice for error handling as it allows the script to fail fast when a subprocess call fails.

  • 39-44: The git_branch_reset_at_exit() function is a new addition. It registers an exit handler that checks out the current Git branch when the script exits. This is a good practice for scripts that change the Git branch during their execution, as it ensures that the Git branch is reset to its original state even if the script fails or is interrupted.
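The exit-handler pattern described above can be sketched as follows. This is a hedged sketch, not the PR's implementation: the real function in `tools/python3/git.py` may differ, and the `run` parameter exists here only as an illustrative injection point.

```python
import atexit
import subprocess

def git_branch_reset_at_exit(run=subprocess.run):
    # capture the branch we are on now, before any later checkouts
    branch = run(
        ["git", "branch", "--show-current"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    # check=True makes a failed checkout raise CalledProcessError loudly
    atexit.register(lambda: run(["git", "checkout", branch], check=True))
    return branch
```

Because `atexit` handlers run on normal interpreter shutdown (including after an unhandled exception), the original branch is restored even if a later step fails.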

tests/simulator/test_restarts.py (5)
  • 193-209: The variable dim has been renamed to ndim for clarity. Ensure that all references to this variable have been updated throughout the codebase.

  • 224-230: The function unique_diag_dir_for_test_case is now called with different arguments. Ensure that the function can handle these new arguments correctly.

  • 267-284: The variable dim has been renamed to ndim for clarity. Ensure that all references to this variable have been updated throughout the codebase.

  • 299-307: The function unique_diag_dir_for_test_case is now called with different arguments. Ensure that the function can handle these new arguments correctly.

  • 347-360: The function unique_diag_dir_for_test_case is now called with different arguments. Ensure that the function can handle these new arguments correctly.

tests/simulator/advance/test_fields_advance_2d.py (2)
  • 34-46: The ndim parameter is now passed to the getHierarchy function. Ensure that the function definition has been updated to accept this new parameter and that it is used correctly within the function.

  • 59-75: The ndim parameter is also passed to the getHierarchy function here. The same considerations apply as in the previous comment.

pyphare/pyphare/pharesee/hierarchy.py (4)
  • 171-185: The compare method in the FieldData class is well implemented. It checks if the other object is an instance of FieldData, asserts that their names are the same, and then compares their datasets. It handles NaN values correctly by creating a mask that excludes them. It uses np.testing.assert_allclose in debug mode to compare the datasets, and np.allclose otherwise. If the comparison fails, it raises an exception with a helpful error message.

  • 188-198: The meshgrid method is also well implemented. It creates a meshgrid based on the dimensions of the FieldData instance and returns a selected subset of the meshgrid if a selection is provided.

  • 201-202: The __eq__ method is overridden to use the compare method for equality comparison. This is a good practice as it allows for more complex comparison logic.

  • 1715-1750: The hierarchy_compare function is well implemented. It checks if the two objects are instances of PatchHierarchy, compares their dimensions and domain boxes, compares their time hierarchies, and then compares their patch levels and patch datas. If any of these comparisons fail, it returns False. If all comparisons pass, it returns True.
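The NaN-masked comparison described for `FieldData.compare` can be reduced to a minimal sketch; the real method additionally checks field names and uses `np.testing.assert_allclose` in debug mode, and the `atol` default here is illustrative.

```python
import numpy as np

def datasets_close(a, b, atol=1e-16):
    # exclude entries where either array holds NaN (e.g. unset ghost values)
    mask = ~(np.isnan(a) | np.isnan(b))
    return np.allclose(a[mask], b[mask], rtol=0, atol=atol)

a = np.array([1.0, np.nan, 3.0])
b = np.array([1.0, 2.0, 3.0])
print(datasets_close(a, b))  # the NaN slot is excluded from the comparison
```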

tests/simulator/test_advance.py (4)
  • 38-41: The getHierarchy function signature has been updated to include a new parameter ndim. Ensure that all calls to this function throughout the codebase have been updated to match the new signature.

  • 55-58: The diag_outputs parameter is now being set to the return value of the unique_diag_dir_for_test_case function. This change seems to be intended to ensure unique diagnostic directory paths for each test case. Ensure that this change does not affect any other parts of the codebase that rely on the previous value of diag_outputs.

  • 769-773: The getHierarchy function is being called with the updated signature. This is consistent with the changes made to the function signature.

  • 56-59: The global_vars.sim is being set to None after the diag_outputs is set. This seems to be a way to reset the simulation state before each test. Ensure that this does not have any unintended side effects on the tests.

tests/simulator/advance/test_fields_advance_1d.py (2)
  • 33-45: The ndim parameter is now passed to the getHierarchy function. Ensure that the function definition has been updated to accept this new argument and that all calls to this function throughout the codebase have been updated to match the new signature.

  • 57-73: The ndim parameter is now passed to the getHierarchy function. Ensure that the function definition has been updated to accept this new argument and that all calls to this function throughout the codebase have been updated to match the new signature.

tests/simulator/test_initialization.py (9)
  • 36-40: The getHierarchy function now accepts an additional argument ndim. Ensure that all calls to this function throughout the codebase have been updated to match the new signature.

  • 52-54: The diag_outputs argument in the getHierarchy function now has a default value of an empty string. This change is acceptable as long as the function can handle an empty string as a valid value for diag_outputs.

  • 266-269: The _test_B_is_as_provided_by_user function now accepts an additional argument dim. Ensure that all calls to this function throughout the codebase have been updated to match the new signature.

  • 331-336: The _test_bulkvel_is_as_provided_by_user function now accepts an additional argument dim. Ensure that all calls to this function throughout the codebase have been updated to match the new signature.

  • 459-464: The _test_density_decreases_as_1overSqrtN function now accepts additional arguments dim, nbr_particles, and cells. Ensure that all calls to this function throughout the codebase have been updated to match the new signature.

  • 529-554: The _test_density_decreases_as_1overSqrtN function has been significantly modified. It now includes a loop that iterates over nbr_particles and calls getHierarchy with different parameters. This change seems to be a performance optimization, but it's important to verify that the new logic is correct and that it doesn't introduce any new issues.

  • 612-619: The _test_nbr_particles_per_cell_is_as_provided function now accepts an additional argument dim. Ensure that all calls to this function throughout the codebase have been updated to match the new signature.

  • 652-656: The _test_domainparticles_have_correct_split_from_coarser_particle function now accepts an additional argument ndim. Ensure that all calls to this function throughout the codebase have been updated to match the new signature.

  • 702-708: The _test_patch_ghost_on_refined_level_case function now accepts an additional argument dim. Ensure that all calls to this function throughout the codebase have been updated to match the new signature.

Comment on lines 18 to +24
```python
import numpy as np
mpl.use('Agg')

# Override if you do want it seeded
MODEL_INIT={}
TIME_STEP_NBR=100000
TIMESTEP=.01
```

Global variables are generally discouraged in Python due to their mutable nature and the potential for unexpected behavior. Consider encapsulating these variables within a class or function, or passing them as arguments where necessary. However, if these variables are constants and their values do not change throughout the program, their usage can be justified.
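For illustration, one way to encapsulate such settings is a frozen dataclass; `WaveConfig` below is hypothetical and not part of the PR.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class WaveConfig:
    # frozen=True makes instances read-only, avoiding the mutable-global issue
    model_init: dict = field(default_factory=dict)
    time_step_nbr: int = 100000
    timestep: float = 0.01

default = WaveConfig()                                      # unseeded
seeded = WaveConfig(model_init={"seed": 1337}, time_step_nbr=10)
```

Callers then pass a config object around explicitly instead of mutating module attributes.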

Comment on lines 336 to 353
```diff
 def GetAllAvailableQties(self, time, pops):
     assert self.single_hier_for_all_quantities == True  # can't work otherwise

-    self.GetParticles(time, pops)
-    self.GetB(time)
-    self.GetE(time)
-    self.GetNi(time)
-    self.GetVi(time)
-    for pop in pops:
-        self.GetFlux(time, pop)
-        self.GetN(time, pop)
+    for fn in [self.GetB, self.GetE, self.GetNi, self.GetVi]:
+        try:
+            fn(time)
+        except:
+            pass
+    try:
+        self.GetParticles(time, pops)
+        for pop in pops:
+            self.GetFlux(time, pop)
+            self.GetN(time, pop)
+    except:
+        pass

     return self.hier
```

The use of a bare except: clause is generally discouraged as it can catch unexpected errors and make debugging more difficult. It's better to catch specific exceptions that you expect might be raised in the try block. If you're not sure what exceptions might be raised, you can catch Exception, which catches all non-exit exceptions. Also, it would be helpful to log or print the exception for debugging purposes.

```diff
-            except:
+            except Exception as e:
+                print(f"An error occurred: {e}")
```

Committable suggestion (Beta)

Suggested change:

```python
def GetAllAvailableQties(self, time, pops):
    assert self.single_hier_for_all_quantities == True  # can't work otherwise

    for fn in [self.GetB, self.GetE, self.GetNi, self.GetVi]:
        try:
            fn(time)
        except Exception as e:
            print(f"An error occurred: {e}")

    try:
        self.GetParticles(time, pops)
        for pop in pops:
            self.GetFlux(time, pop)
            self.GetN(time, pop)
    except Exception as e:
        print(f"An error occurred: {e}")

    return self.hier
```
Comment on lines +345 to +351
```python
try:
    self.GetParticles(time, pops)
    for pop in pops:
        self.GetFlux(time, pop)
        self.GetN(time, pop)
except:
    pass
```

The same comments apply to this try/except block. It's trying to call GetParticles, GetFlux, and GetN with the appropriate arguments, and if any of these calls fail, it simply passes and continues. This could potentially hide errors and lead to unexpected behavior.

Comment on lines +8 to +9
```python
_USE_NINJA=True
```

Global variables can make code harder to understand and maintain. Consider encapsulating these variables in a class or passing them as arguments to functions where necessary.

Comment on lines +30 to +29
```python
_globals = dict(servers=dict(), busy=0)
phare_runs_dir = Path(os.getcwd()) / "phare_runs"  # can't contain period "."
```

Global variables are generally discouraged in Python due to their mutable nature and the potential for unexpected side effects. Consider encapsulating these in a class or passing them as function arguments where necessary.

Comment on lines +261 to +278
```python
def advance(self, compare=False):
    if not self.states["init"]:
        self.init()

    atomic_set(0, self.shr.name, 0, int(compare))
    self._state_machine_set_per_simulation(SharedSimulationStateEnum.CAN_ADVANCE, 1)

    if compare:
        self._state_machine_set_per_simulation(SharedSimulationStateEnum.IS_BUSY, 1)
        # we have to wait for all simulations to finish the current timestep
        self.wait_for_simulations()
        self.compare()

def compare(self):
```

The advance method is doing a lot of work and could benefit from being broken down into smaller, more manageable methods. This would improve readability and maintainability.

Comment on lines 275 to 316
```python
for i, simulation in enumerate(self.simulations):
    assert (
        _globals["servers"][self.starting_sim_id + i].exitcode is None
    ), "or it crashed early"

sim_times = self._state_machine_list_for_value(SharedSimulationStateEnum.CURRENT_TIME)

times = {
    i: (0.0 + sim_times[i]) / 1e9  # :( it's an int so after decimal is dropped
    for i, simulation in enumerate(self.simulations)
}

if len(times) < 2:
    return

for i, simulation in enumerate(self.simulations):
    for j in range(i + 1, len(times)):
        if times[i] == times[j]:
            run0 = Run(
                build_diag_dir(self.starting_sim_id + i),
                single_hier_for_all_quantities=True,
            )
            run1 = Run(
                build_diag_dir(self.starting_sim_id + j),
                single_hier_for_all_quantities=True,
            )

            print(
                f"comparing {self.starting_sim_id + i} and {self.starting_sim_id + j} at time {times[i]}"
            )
            assert hierarchy_compare(
                run0.GetAllAvailableQties(times[i], []),
                run1.GetAllAvailableQties(times[i], []),
            )
            print("OK!")
```

The compare method is doing a lot of work and could benefit from being broken down into smaller, more manageable methods. This would improve readability and maintainability.

Comment on lines +333 to +354
```python
def __del__(self):
    self.kill()

def kill(self):
    time.sleep(2)

    if self.shr:
        self.shr.close()
        self.shr.unlink()
        self.shr = None

    stop_servers()
```

The kill method is missing error handling for potential exceptions that could occur during the shared memory and server termination.
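A hedged sketch of the error handling being asked for is below; the real `kill()` in `simulators.py` also sleeps and manages server processes, and `stop_servers` here is a stand-in for the module's actual teardown.

```python
from multiprocessing import shared_memory

def stop_servers():
    pass  # stand-in for the module's real server teardown

class SharedMemoryOwner:
    def __init__(self):
        self.shr = shared_memory.SharedMemory(create=True, size=64)

    def kill(self):
        try:
            if self.shr:
                try:
                    self.shr.close()
                finally:
                    self.shr.unlink()  # unlink even if close() raised
                self.shr = None
        except FileNotFoundError:
            pass  # segment already unlinked by another process
        finally:
            stop_servers()  # always attempt server teardown
```

Setting `self.shr = None` after cleanup also makes a repeated `kill()` (e.g. from `__del__`) a harmless no-op.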

Comment on lines +1 to +46
```python
import sys
import shutil
from pathlib import Path

from pyphare.simulator.simulators import Simulators
from tools.python3 import pushd, cmake, git
from tests.functional.alfven_wave import alfven_wave1d

# we want it seeded
alfven_wave1d.MODEL_INIT = {"seed": 1337}
alfven_wave1d.TIME_STEP_NBR = 10

if len(sys.argv) != 3:
    print("Incorrect input arguments, expects two branch names", sys.argv)
    exit(0)

# register exit handler
git.git_branch_reset_at_exit()

branches = sys.argv[1:]

# check branches exist
for branch in branches:
    git.checkout(branch)

build_dir = Path("build")

for branch in branches:
    b = build_dir / branch
    b.mkdir(parents=True, exist_ok=True)

    git.checkout(branch)
    with pushd(b):
        cmake.config("../..")

for branch in branches:
    git.checkout(branch)
    with pushd(build_dir / branch):
        cmake.build()

simulators = Simulators()
for branch in branches:
    # alfven_wave1d.config() will be identical here even if different on branches
    # as it is already parsed before we change branch
    simulators.register(alfven_wave1d.config(), build_dir=str(build_dir / branch))
simulators.run(compare=True)
```

The script seems to be well-structured and follows good practices. However, there are a few points to consider:

  1. Error handling: The script doesn't handle exceptions that might occur during the execution of the script. For example, if the git checkout, cmake configuration, or build fails, the script will continue to execute, which might lead to unexpected behavior or results. Consider adding try/except blocks around these operations to handle potential errors and exit the script gracefully.

  2. User feedback: The script could benefit from more user feedback. For example, printing a message when each major step begins and ends (e.g., checking out branches, building, running simulations) would make it easier for users to understand what the script is doing.

  3. Exit code: In line 15, the script exits with a status code of 0 even when the input arguments are incorrect. By convention, a non-zero exit status indicates that the program encountered an error. Consider changing this to a non-zero value.

Here's a diff with the suggested changes:

```diff
 if len(sys.argv) != 3:
     print("Incorrect input arguments, expects two branch names", sys.argv)
-    exit(0)
+    exit(1)

 # register exit handler
 git.git_branch_reset_at_exit()

 branches = sys.argv[1:]

 # check branches exist
 for branch in branches:
+    print(f"Checking out branch {branch}...")
     try:
         git.checkout(branch)
     except Exception as e:
         print(f"Failed to checkout branch {branch}: {e}")
         exit(1)
+    print(f"Successfully checked out branch {branch}")

 build_dir = Path("build")

 for branch in branches:
     b = (build_dir / branch)
     b.mkdir(parents=True, exist_ok=True)

     git.checkout(branch)
     with pushd(b):
+        print(f"Configuring CMake for branch {branch}...")
         try:
             cmake.config("../..")
         except Exception as e:
             print(f"Failed to configure CMake for branch {branch}: {e}")
             exit(1)
+        print(f"Successfully configured CMake for branch {branch}")

 for branch in branches:
     git.checkout(branch)
     with pushd(build_dir / branch):
+        print(f"Building project for branch {branch}...")
         try:
             cmake.build()
         except Exception as e:
             print(f"Failed to build project for branch {branch}: {e}")
             exit(1)
+        print(f"Successfully built project for branch {branch}")

 simulators = Simulators()
 for branch in branches:
     # alfven_wave1d.config() will be identical here even if different on branches
     # as it is already parsed before we change branch
+    print(f"Registering simulation for branch {branch}...")
     simulators.register(alfven_wave1d.config(), build_dir=str(build_dir / branch))
+    print(f"Successfully registered simulation for branch {branch}")
+print("Running simulations...")
 simulators.run(compare=True)
+print("Simulations completed")
```

Committable suggestion (Beta)

Suggested change:

```python
import sys
import shutil
from pathlib import Path

from pyphare.simulator.simulators import Simulators
from tools.python3 import pushd, cmake, git
from tests.functional.alfven_wave import alfven_wave1d

# we want it seeded
alfven_wave1d.MODEL_INIT = {"seed": 1337}
alfven_wave1d.TIME_STEP_NBR = 10

if len(sys.argv) != 3:
    print("Incorrect input arguments, expects two branch names", sys.argv)
    exit(1)

# register exit handler
git.git_branch_reset_at_exit()

branches = sys.argv[1:]

# check branches exist
for branch in branches:
    print(f"Checking out branch {branch}...")
    try:
        git.checkout(branch)
    except Exception as e:
        print(f"Failed to checkout branch {branch}: {e}")
        exit(1)
    print(f"Successfully checked out branch {branch}")

build_dir = Path("build")

for branch in branches:
    b = build_dir / branch
    b.mkdir(parents=True, exist_ok=True)

    git.checkout(branch)
    with pushd(b):
        print(f"Configuring CMake for branch {branch}...")
        try:
            cmake.config("../..")
        except Exception as e:
            print(f"Failed to configure CMake for branch {branch}: {e}")
            exit(1)
        print(f"Successfully configured CMake for branch {branch}")

for branch in branches:
    git.checkout(branch)
    with pushd(build_dir / branch):
        print(f"Building project for branch {branch}...")
        try:
            cmake.build()
        except Exception as e:
            print(f"Failed to build project for branch {branch}: {e}")
            exit(1)
        print(f"Successfully built project for branch {branch}")

simulators = Simulators()
for branch in branches:
    print(f"Registering simulation for branch {branch}...")
    simulators.register(alfven_wave1d.config(), build_dir=str(build_dir / branch))
    print(f"Successfully registered simulation for branch {branch}")

print("Running simulations...")
simulators.run(compare=True)
print("Simulations completed")
```

Comment on lines 182 to 190
L0L1_datahier = check_hier(get_hier(L0L1_diags))
extra_collections = []
errors = test.base_test_overlaped_fields_are_equal(L0L1_datahier, new_time)
# errors = test.base_test_field_level_ghosts_via_subcycles_and_coarser_interpolation(L0_datahier, L0L1_datahier)
errors = (
test.base_test_field_level_ghosts_via_subcycles_and_coarser_interpolation(
L0_datahier, L0L1_datahier
)
)
print(f"errors {type(errors)}")

The variable errors is reassigned on line 186, which makes the assignment on line 184 redundant. If the function base_test_overlaped_fields_are_equal has side effects that are needed, consider calling it without assigning the result to errors. If not, line 184 can be removed.

- errors = test.base_test_overlaped_fields_are_equal(L0L1_datahier, new_time)
  errors = (
      test.base_test_field_level_ghosts_via_subcycles_and_coarser_interpolation(
          L0_datahier, L0L1_datahier
      )
  )

Committable suggestion (Beta)
Suggested change
L0L1_datahier = check_hier(get_hier(L0L1_diags))
extra_collections = []
errors = test.base_test_overlaped_fields_are_equal(L0L1_datahier, new_time)
# errors = test.base_test_field_level_ghosts_via_subcycles_and_coarser_interpolation(L0_datahier, L0L1_datahier)
errors = (
test.base_test_field_level_ghosts_via_subcycles_and_coarser_interpolation(
L0_datahier, L0L1_datahier
)
)
print(f"errors {type(errors)}")
L0L1_datahier = check_hier(get_hier(L0L1_diags))
extra_collections = []
errors = (
test.base_test_field_level_ghosts_via_subcycles_and_coarser_interpolation(
L0_datahier, L0L1_datahier
)
)
print(f"errors {type(errors)}")

@coderabbitai coderabbitai bot left a comment


Review Status

Actionable comments generated: 6

Configuration used: .coderabbit.yaml

Commits
Files that changed from the base of the PR and between 2e9c825 and 4181929.
Files selected for processing (13)
  • pyphare/pyphare/cpp/__init__.py (1 hunks)
  • pyphare/pyphare/pharesee/hierarchy.py (3 hunks)
  • pyphare/pyphare/pharesee/particles.py (1 hunks)
  • pyphare/pyphare/pharesee/run.py (1 hunks)
  • pyphare/pyphare/simulator/simulator.py (1 hunks)
  • pyphare/pyphare/simulator/simulators.py (1 hunks)
  • tests/functional/alfven_wave/alfven_wave1d.py (4 hunks)
  • tools/build_compare_2_branches.py (1 hunks)
  • tools/build_compare_2_branches.sh (1 hunks)
  • tools/build_compare_top_2_commits.py (1 hunks)
  • tools/build_compare_top_2_commits.sh (1 hunks)
  • tools/python3/cmake.py (1 hunks)
  • tools/python3/git.py (2 hunks)
Files skipped from review due to trivial changes (1)
  • tools/build_compare_top_2_commits.sh
Files skipped from review as they are similar to previous changes (9)
  • pyphare/pyphare/pharesee/hierarchy.py
  • pyphare/pyphare/pharesee/particles.py
  • pyphare/pyphare/pharesee/run.py
  • pyphare/pyphare/simulator/simulator.py
  • tests/functional/alfven_wave/alfven_wave1d.py
  • tools/build_compare_2_branches.py
  • tools/build_compare_2_branches.sh
  • tools/build_compare_top_2_commits.py
  • tools/python3/git.py
Additional comments: 12
pyphare/pyphare/cpp/__init__.py (1)
  • 4-26: The introduction of a caching mechanism for the C++ module import using the _globals dictionary is a good performance optimization. It reduces the overhead of repeatedly importing the same module. However, ensure that this change does not introduce any side effects in a multi-threaded environment where the import might be expected to be fresh or different across threads.
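A minimal sketch of the cached-import pattern, with a lock added for the multi-threaded concern raised above; the lock and the stand-in module name are illustrative, not part of the PR:

```python
import importlib
import threading

_globals = {"module": None}
_module_lock = threading.Lock()  # hypothetical guard, not in the PR


def cpp_lib():
    # Double-checked locking: the fast path skips the lock once cached.
    if _globals["module"] is None:
        with _module_lock:
            if _globals["module"] is None:
                # "json" stands in for pybindlibs.cpp here.
                _globals["module"] = importlib.import_module("json")
    return _globals["module"]
```

Repeated calls return the same module object, so the import cost is paid at most once per process.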
pyphare/pyphare/simulator/simulators.py (8)
  • 28-29: The previous comment about global variables is still valid. Consider encapsulating these in a class or passing them as function arguments where necessary.

  • 43-47: The previous comment about wrapping the creation of shared memory blocks in a try/except block is still valid to handle potential exceptions, such as insufficient memory.

  • 53-57: The previous comment about thread safety is still valid. Consider using a context manager (with lock:) to ensure the lock is properly released even if an error occurs.

  • 67-97: The previous comment about breaking down the state_machine function into smaller, more manageable functions is still valid to improve readability and maintainability.

  • 113-119: The previous comment about error handling for the case where the Simulator initialization fails is still valid. Consider adding a try/except block to handle potential exceptions.

  • 122-125: The previous comment about breaking down the advance method into smaller, more manageable methods is still valid to improve readability and maintainability.

  • 346-354: The previous comment about missing error handling for potential exceptions that could occur during the shared memory and server termination in the kill method is still valid.

  • 146-146: It's good to see an assertion here to ensure that a server for a given sim_id does not already exist before starting a new process. This helps prevent accidental overwrites or conflicts.
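Two of the points above (guarded shared-memory creation, context-managed locking) can be sketched together; the function names and the attach-on-exists policy are assumptions, not the PR's behavior:

```python
import multiprocessing
from multiprocessing import shared_memory


def create_block(name, size):
    """Allocate a named shared-memory block, attaching if it already exists."""
    try:
        return shared_memory.SharedMemory(name=name, create=True, size=size)
    except FileExistsError:
        return shared_memory.SharedMemory(name=name)  # attach to existing block
    except OSError as err:
        raise RuntimeError(f"cannot allocate shared memory '{name}': {err}") from err


lock = multiprocessing.Lock()


def atomic_set(shm, offset, value):
    # "with lock:" guarantees release even if the body raises.
    with lock:
        shm.buf[offset] = value
```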

tools/python3/cmake.py (3)
  • 8-11:
    The introduction of _USE_NINJA and _SAMRAI_DIR as global variables centralizes the build configuration settings, which can simplify the build process and ensure consistency across the build functions.

  • 13-25:
    The make_config_str function now uses the _USE_NINJA and _SAMRAI_DIR global variables. Verify that all calls to this function have been updated to reflect the removal of the use_ninja parameter and the new behavior of the samrai_dir parameter.

  • 28-32:
    The config function now uses the _USE_NINJA and _SAMRAI_DIR global variables. Verify that all calls to this function have been updated to reflect the removal of the use_ninja parameter.
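A sketch of the module-level configuration pattern this change introduces: options are set once, and the build helpers read the globals instead of taking a `use_ninja` parameter per call. Function names here are illustrative:

```python
_USE_NINJA = False
_SAMRAI_DIR = None


def configure(use_ninja=False, samrai_dir=None):
    """Set module-wide build options once, instead of per call."""
    global _USE_NINJA, _SAMRAI_DIR
    _USE_NINJA = use_ninja
    _SAMRAI_DIR = samrai_dir


def build_command(threads=1):
    # All build helpers consult the module globals set by configure().
    return "ninja" if _USE_NINJA else f"make -j{threads}"
```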

Comment on lines +157 to +164
for i in range(1, try_count + 1):
time.sleep(0.5)
try:
assert servers[sim_id].exitcode is None # or it crashed/exited early
return
except Exception as e:
if i == try_count:
raise e

The retry logic here is a good addition for robustness, but it should include a backoff strategy to avoid hammering the system with rapid retries. Also, consider logging the exception details for better debugging.

+ for i in range(1, try_count + 1):
+     time.sleep(0.5 * i)  # backoff strategy
+     try:
+         assert servers[sim_id].exitcode is None  # or it crashed/exited early
+         return
+     except Exception as e:
+         if i == try_count:
+             logging.exception(f"Server process for sim_id {sim_id} failed to start.")
+             raise e

Committable suggestion

IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
for i in range(1, try_count + 1):
time.sleep(0.5)
try:
assert servers[sim_id].exitcode is None # or it crashed/exited early
return
except Exception as e:
if i == try_count:
raise e
for i in range(1, try_count + 1):
time.sleep(0.5 * i) # backoff strategy
try:
assert servers[sim_id].exitcode is None # or it crashed/exited early
return
except Exception as e:
if i == try_count:
logging.exception(f"Server process for sim_id {sim_id} failed to start.")
raise e
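The linear `0.5 * i` delay above can be generalized into a reusable helper with exponential backoff; this is a sketch, not code from the PR:

```python
import time


def retry(check, tries=5, base_delay=0.01, factor=2.0):
    """Call `check` until it returns truthy or `tries` attempts are exhausted.

    Sleeps base_delay, then base_delay*factor, ... between attempts,
    so repeated failures back off instead of hammering the system.
    """
    delay = base_delay
    for attempt in range(1, tries + 1):
        if check():
            return True
        if attempt == tries:
            return False
        time.sleep(delay)
        delay *= factor
    return False
```

The caller can then raise (and log) once `retry` returns `False`, keeping the polling policy in one place.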

Comment on lines +180 to +346

lock.release()
existing_shm.close()

def wait_for_simulations(self):
# we have to wait for all simulations to finish the current timestep
while True:
if (
np.sum(
self._state_machine_list_for_value(
SharedSimulationStateEnum.IS_BUSY
)
)
== 0
):
return
time.sleep(1)

def advance(self, compare=False):
if not self.states["init"]:
self.init()

atomic_set(0, self.shr.name, 0, int(compare))
self._state_machine_set_per_simulation(SharedSimulationStateEnum.CAN_ADVANCE, 1)

if compare:
self._state_machine_set_per_simulation(SharedSimulationStateEnum.IS_BUSY, 1)
# we have to wait for all simulations to finish the current timestep
self.wait_for_simulations()
self.compare()

def compare(self):
for i, simulation in enumerate(self.simulations):
assert (
_globals["servers"][self.starting_sim_id + i].exitcode is None
), "or it crashed early"

sim_times = self._state_machine_list_for_value(
SharedSimulationStateEnum.CURRENT_TIME
)

times = {
i: (0.0 + sim_times[i]) / 1e9 # :( it's an int so after decimal is dropped
for i, simulation in enumerate(self.simulations)
}

if len(times) < 2:
return

for i, simulation in enumerate(self.simulations):
for j in range(i + 1, len(times)):
if times[i] == times[j]:
run0 = Run(
build_diag_dir(self.starting_sim_id + i),
single_hier_for_all_quantities=True,
)
run1 = Run(
build_diag_dir(self.starting_sim_id + j),
single_hier_for_all_quantities=True,
)

print(
f"comparing {self.starting_sim_id + i} and {self.starting_sim_id + j} at time {times[i]}"
)
assert hierarchy_compare(
run0.GetAllAvailableQties(times[i], []),
run1.GetAllAvailableQties(times[i], []),
)
print("OK!")

def run(self, compare=False):
if not self.states["init"]:
self.init()

while self.simulations:
self.advance(compare=compare)

is_busy = self._state_machine_list_for_value(
SharedSimulationStateEnum.IS_BUSY
)
is_finished = self._state_machine_list_for_value(
SharedSimulationStateEnum.IS_FINISHED
)

# trim finished simulations
self.simulations = [
sim
for i, sim in enumerate(self.simulations)
if is_busy[i] or not is_finished[i]
]

print("running simulations", len(self.simulations))
time.sleep(1)

self.kill()

def __del__(self):
self.kill()

def kill(self):

The Simulators class encapsulates the management of multiple simulations, which is a good design choice for maintainability and modularity. However, the class is quite large and does a lot of work. Consider breaking down some of its responsibilities into smaller, more focused classes or helper functions.
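As one way to shrink the class, the shared-memory bookkeeping could be extracted into a small wrapper; this sketch (class and method names hypothetical) is illustrative only:

```python
from multiprocessing import shared_memory


class SharedState:
    """Hypothetical extraction: wraps per-simulation integer state so the
    Simulators class no longer computes shared-memory offsets inline."""

    def __init__(self, n_sims, fields_per_sim=4):
        self.n_sims = n_sims
        self.fields = fields_per_sim
        size = n_sims * fields_per_sim * 8  # 8 bytes per int64 slot
        self.shm = shared_memory.SharedMemory(create=True, size=size)
        self.view = self.shm.buf.cast("q")  # signed 64-bit ints
        for i in range(n_sims * fields_per_sim):
            self.view[i] = 0

    def _index(self, sim, field):
        return sim * self.fields + field

    def set(self, sim, field, value):
        self.view[self._index(sim, field)] = value

    def column(self, field):
        # One value per simulation for a given field, e.g. IS_BUSY.
        return [self.view[self._index(s, field)] for s in range(self.n_sims)]

    def close(self):
        self.view.release()
        self.shm.close()
        self.shm.unlink()
```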

Comment on lines +347 to +352
time.sleep(2)

if self.shr:
self.shr.close()
self.shr.unlink()
self.shr = None

The kill method should handle exceptions that may occur when closing or unlinking the shared memory. This is important to ensure that resources are cleaned up properly even if an error occurs.

+ def kill(self):
+     time.sleep(2)
+     try:
+         if self.shr:
+             self.shr.close()
+             self.shr.unlink()
+             self.shr = None
+     except Exception as e:
+         logging.exception("An error occurred while cleaning up shared memory.")
+         raise e

Committable suggestion


Suggested change
time.sleep(2)
if self.shr:
self.shr.close()
self.shr.unlink()
self.shr = None
def kill(self):
time.sleep(2)
try:
if self.shr:
self.shr.close()
self.shr.unlink()
self.shr = None
except Exception as e:
logging.exception("An error occurred while cleaning up shared memory.")
raise e

Comment on lines +35 to +36
def build(threads=1):
run("ninja" if _USE_NINJA else f"make -j{threads}", capture_output=False)

💡 NOTE
Codebase verification is a beta feature.

The search results show that the build function is called in several scripts. The calls in tools/build_compare_top_2_commits.py and tools/build_compare_2_branches.py have been updated correctly and do not pass any parameters. However, the call in tools/bench/functional/run_gen_plots.py still uses the use_ninja parameter, which is no longer valid after the update to the build function. This needs to be addressed.

  • The call to cmake.build() in tools/bench/functional/run_gen_plots.py still uses the use_ninja parameter, which has been removed from the build function definition. This call should be updated to remove the use_ninja parameter.
Analysis chain

The build function now uses the _USE_NINJA global variable directly. Verify that all calls to this function have been updated to reflect the removal of the use_ninja parameter.


The ast-grep command used to search for calls to the build function did not produce any output. This could indicate that there are no calls to this function, or the pattern used might not have been able to match the function calls correctly. To ensure that we have the correct information, I will generate a new shell script with a revised pattern to search for the build function calls again.


The rg command failed due to the use of an invalid --python flag. I will correct the command to search for calls to the build function without specifying the language, as ripgrep searches through text files by default and will include Python files in its search.

Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Search for calls to `build` and ensure they have been updated.
ast-grep --lang python --pattern $'build($_)'

Length of output: 46



Script:

#!/bin/bash
# Search for calls to `build` function in the codebase with a more inclusive pattern.
rg --python 'build\('

Length of output: 434



Script:

#!/bin/bash
# Search for calls to `build` function in the codebase.
rg 'build\('

Length of output: 426


if _globals["module"] is None:
_globals["module"] = get_module()
print("Loaded C++ module", _globals["module"].__file__)

Consider if the print statement indicating the loading of the C++ module is appropriate for production code. It may be beneficial for debugging but could clutter the output in a production environment. If it's intended for debugging purposes only, it might be better to use a logging framework with appropriate log levels.
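A sketch of the logging-based alternative: the message is emitted at DEBUG level and stays silent unless the application opts in. The logger name is an assumption:

```python
import logging

logger = logging.getLogger("pyphare.cpp")  # hypothetical logger name


def announce_module(module_file):
    # Hidden by default; visible once the app enables DEBUG output.
    logger.debug("Loaded C++ module %s", module_file)
```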

Comment on lines +19 to +21
return importlib.import_module("pybindlibs.cpp_dbg")
except ImportError as err:
return importlib.import_module("pybindlibs.cpp")

The get_module function swallows the exception message from ImportError. It might be useful to log the error message for debugging purposes, especially if the fallback import fails as well. Consider adding logging for the exception or re-raising it with additional context.

  try:
      return importlib.import_module("pybindlibs.cpp_dbg")
+ except ImportError as err:
+     logging.error(f"Failed to import pybindlibs.cpp_dbg: {err}")
      return importlib.import_module("pybindlibs.cpp")

Committable suggestion


Suggested change
return importlib.import_module("pybindlibs.cpp_dbg")
except ImportError as err:
return importlib.import_module("pybindlibs.cpp")
try:
return importlib.import_module("pybindlibs.cpp_dbg")
except ImportError as err:
logging.error(f"Failed to import pybindlibs.cpp_dbg: {err}")
return importlib.import_module("pybindlibs.cpp")
