Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move to Remote BMI & containerized model #12

Merged
merged 10 commits into from
Jan 8, 2025
Merged

Move to Remote BMI & containerized model #12

merged 10 commits into from
Jan 8, 2025

Conversation

BSchilperoort
Copy link
Member

@BSchilperoort BSchilperoort commented Nov 27, 2024

This PR moves the plugin to use the new Remote BMI, and uses a containerized model.

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@BSchilperoort
Copy link
Member Author

@pauwiersma would you be willing/able to try out this PR? Wflow.jl can now run containerized :)

Do note that ewatercycle 2.4.0 might not be released yet so you'll have to install that from the branch. If that PR is merged the new eWaterCycle should be available!

@BSchilperoort BSchilperoort marked this pull request as ready for review December 2, 2024 13:01
@pauwiersma
Copy link

Hi @BSchilperoort , that's great! I've created a conda environment with this branch + ewatercycle2.4. First there was a problem with irisapi which complained about python3.12 but when I tried again it worked somehow.
When I try to initialize the model I get error 104: Connection reset by peer, without any useful indiciations of where it goes wrong. ChatGPT is telling me it's a docker problem, so I made sure to point to docker root_dir (/var/snap/docker/common/var-lib-docker) in the ewatercycle config file, but that doesn't solve the problem. I can continue searching later this week, but perhaps you have an idea already.

@BSchilperoort
Copy link
Member Author

I made sure to point to docker root_dir (/var/snap/docker/common/var-lib-docker) in the ewatercycle config file,

The only config option related to docker in the ewatercycle.yaml file is to specify the container engine: container_engine: docker.

Can you run the image manually in your terminal?

docker run ghcr.io/ewatercycle/wflowjl-remotebmi:0.2.0

@BSchilperoort
Copy link
Member Author

I've created a conda environment with this branch + ewatercycle2.4. First there was a problem with irisapi which complained about python3.12 but when I tried again it worked somehow.

Next time you can use the conda lock file available on the ewatercycle repo to create an environment. That should ensure everything installs correctly.

@sverhoeven sverhoeven self-requested a review January 7, 2025 08:49
Dockerfile Outdated

# Expose port and start server
EXPOSE 50051
CMD ["julia" , "-e", "using Wflow; import RemoteBMI.Server: run_bmi_server; port = parse(Int, get(ENV, \"BMI_PORT\", \"50051\")); run_bmi_server(Wflow.Model, \"0.0.0.0\", port)"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I run the image with

docker run -ti --rm ghcr.io/ewatercycle/wflowjl-remotebmi:0.2.0
Precompiling Wflow...
  116 dependencies successfully precompiled in 48 seconds
Precompiling RemoteBMI...
  23 dependencies successfully precompiled in 16 seconds. 30 already precompiled.
[ Info: Listening on: 0.0.0.0:50051, thread id: 1

I did not expect the precompiling to happen here, that should be done during image build

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instantly works for me

>>> docker run -it --rm ghcr.io/ewatercycle/wflowjl-remotebmi:0.2.0
[ Info: Listening on: 0.0.0.0:50051, thread id: 1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to model.setup use

docker --version
Docker version 27.4.1, build b9d17ea
docker run -ti --rm -u 1000  ghcr.io/ewatercycle/wflowjl-remotebmi:0.2.0 
Precompiling Wflow...

  Progress [>                                        ]  0/116
  ◑ Unicode
  ◐ OrderedCollections
  ◒ NetworkOptions
  ◑ UnPack
  ◒ FieldMetadata
  ◐ Serialization
  ◑ UUIDs
  ◓ LRUCache
  ◓ Logging
  ◐ SIMDTypes
  ◒ Zlib_jll
  ◑ MbedTLS_jll
  ◑ Inflate
  ◑ AbstractTrees
  ◐ Glob
  ◑ ArgTools
  ◐ Statistics
  ◑ IfElse
  ◒ Base64
  ◐ CommonWorldInvalidations
  ◐ CompilerSupportLibraries_jll
  ◑ MozillaCACerts_jll
  ◒ Unicode
  ◓ OrderedCollections
  ◐ NetworkOptions
  ◒ UnPack
  ◐ FieldMetadata
  ◓ Serialization
  ◒ UUIDs
  ◑ LRUCache
  ◑ Logging
  ◓ SIMDTypes
  ◐ Zlib_jll
  ◒ MbedTLS_jll
  ◒ Inflate
  ◒ AbstractTrees
  ◓ Glob
  ◒ ArgTools
  ◓ Statistics
  ◒ IfElse
  ◐ Base64
  ◓ CommonWorldInvalidations
  ◓ CompilerSupportLibraries_jll
  ◒ MozillaCACerts_jll
  ◒ Mmap
  ◓ SuiteSparse_jll
  ◐ nghttp2_jll
  ◐ BasicModelInterface
  ◑ p7zip_jll
  ◓ StaticArraysCore
  ◓ OffsetArrays
  ◒ ManualMemory
  ◒ Markdown
  ◑ DelimitedFiles
  ◑ ThreadingUtilities
  ◓ CpuId
  ◒ MacroTools
  ◒ InteractiveUtils
  ◓ SimpleTraits
ERROR: IOError: open("/julia/pkg/compiled/v1.11/Wflow/hSkFR_yCiv6.ji.pidfile", 194, 292): permission denied (EACCES)
Stacktrace:
  [1] uv_error
    @ ./libuv.jl:106 [inlined]
  [2] open(path::String, flags::UInt16, mode::UInt16)
    @ Base.Filesystem ./filesystem.jl:190
  [3] tryopen_exclusive(path::String, mode::UInt16)
    @ FileWatching.Pidfile ~/usr/local/julia/share/julia/stdlib/v1.11/FileWatching/src/pidfile.jl:217
  [4] open_exclusive(path::String; mode::UInt16, poll_interval::Int64, wait::Bool, stale_age::Int64, refresh::Float64)
    @ FileWatching.Pidfile ~/usr/local/julia/share/julia/stdlib/v1.11/FileWatching/src/pidfile.jl:244
  [5] open_exclusive
    @ ~/usr/local/julia/share/julia/stdlib/v1.11/FileWatching/src/pidfile.jl:237 [inlined]
  [6] mkpidlock(at::String, pid::Int32; stale_age::Int64, refresh::Float64, kwopts::@Kwargs{wait::Bool})
    @ FileWatching.Pidfile ~/usr/local/julia/share/julia/stdlib/v1.11/FileWatching/src/pidfile.jl:68
  [7] mkpidlock
    @ ~/usr/local/julia/share/julia/stdlib/v1.11/FileWatching/src/pidfile.jl:63 [inlined]
  [8] mkpidlock(f::Base.var"#1082#1083"{Base.PkgId}, at::String, pid::Int32; kwopts::@Kwargs{stale_age::Int64, wait::Bool})
    @ FileWatching.Pidfile ~/usr/local/julia/share/julia/stdlib/v1.11/FileWatching/src/pidfile.jl:93
  [9] #mkpidlock#6
    @ ~/usr/local/julia/share/julia/stdlib/v1.11/FileWatching/src/pidfile.jl:90 [inlined]
 [10] trymkpidlock(::Function, ::Vararg{Any}; kwargs::@Kwargs{stale_age::Int64})
    @ FileWatching.Pidfile ~/usr/local/julia/share/julia/stdlib/v1.11/FileWatching/src/pidfile.jl:116
 [11] #invokelatest#2
    @ ./essentials.jl:1057 [inlined]
 [12] invokelatest
    @ ./essentials.jl:1052 [inlined]
 [13] maybe_cachefile_lock(f::Base.var"#1082#1083"{Base.PkgId}, pkg::Base.PkgId, srcpath::String; stale_age::Int64)
    @ Base ./loading.jl:3609
 [14] maybe_cachefile_lock
    @ ./loading.jl:3606 [inlined]
 [15] _require(pkg::Base.PkgId, env::String)
    @ Base ./loading.jl:2488
 [16] __require_prelocked(uuidkey::Base.PkgId, env::String)
    @ Base ./loading.jl:2315
 [17] #invoke_in_world#3
    @ ./essentials.jl:1089 [inlined]
 [18] invoke_in_world
    @ ./essentials.jl:1086 [inlined]
 [19] _require_prelocked(uuidkey::Base.PkgId, env::String)
    @ Base ./loading.jl:2302
 [20] macro expansion
    @ ./loading.jl:2241 [inlined]
 [21] macro expansion
    @ ./lock.jl:273 [inlined]
 [22] __require(into::Module, mod::Symbol)
    @ Base ./loading.jl:2198
 [23] #invoke_in_world#3
    @ ./essentials.jl:1089 [inlined]
 [24] invoke_in_world
    @ ./essentials.jl:1086 [inlined]
 [25] require(into::Module, mod::Symbol)
    @ Base ./loading.jl:2191

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I build the image locally it works.

docker build -t ghcr.io/ewatercycle/wflowjl-remotebmi:0.2.0 .

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding

ENV JULIA_CPU_TARGET="generic;sandybridge,-xsaveopt,clone_all;haswell,-rdrnd,base(1)"

and creating a sysimage with PackageCompiler is supposed to fix having to recompile the julia packages.

However, it gives Stefan the following error:

ERROR: Unable to find compatible target in cached code image.
Target 0 (tigerlake): Rejecting this target due to use of runtime-disabled features

Running docker inside WSL.

Copy link
Member

@sverhoeven sverhoeven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me after local docker build.
Was able to run demo notebook.

"metadata": {},
"outputs": [],
"source": [
"from ewatercycle.base.parameter_set import ParameterSet\n",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where did this come from. Would nice to add Markdown cell with something like

The `wflow_sbm + kinematic wave` parameter set can be downloaded from https://deltares.github.io/Wflow.jl/dev/getting_started/download_example_models.html .

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added link in 4d5b0b7

bmi_image: ContainerImage = ContainerImage(
"ghcr.io/ewatercycle/wflowjl-remotebmi:0.2.0"
)
protocol: Literal["grpc", "openapi"] = "openapi"

def get_latlon_grid(self, name: str) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the methods below still needed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep they are, unstructured grid has x,y as lat,lon and start_time = 0 and time units is 's'.

@BSchilperoort
Copy link
Member Author

Thanks for the review and testing @sverhoeven !

@BSchilperoort BSchilperoort merged commit 07adbef into main Jan 8, 2025
1 check passed
@BSchilperoort BSchilperoort deleted the remotebmi branch January 8, 2025 14:38
This was referenced Jan 8, 2025
@pauwiersma
Copy link

Thanks both, and sorry I couldn't be of more help. I will try to apply this PR soon to my environment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants