-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document how to support different distributed worker backends #11
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #11 +/- ##
==========================================
- Coverage 87.83% 87.78% -0.05%
==========================================
Files 11 11
Lines 2055 2055
==========================================
- Hits 1805 1804 -1
- Misses 250 251 +1 ☔ View full report in Codecov by Sentry. |
I realized a downside of not tying the preference to a package is that it won't affect precompilation... New approach is to make the packages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
```julia | ||
module Foo | ||
|
||
import Dependency |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
import Dependency | |
# Load a dependency which also supports Distributed/DistributedNext | |
import Dependency |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 03bb8db.
docs/src/support-preferences.md
Outdated
|
||
Users will then be able to call | ||
e.g. `Foo.set_distributed_package!("DistributedNext")`. Note that | ||
`Foo.set_distributed_package!()` should also set the preferences of any dependencies |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
`Foo.set_distributed_package!()` should also set the preferences of any dependencies | |
`Foo.set_distributed_package!` should also set the preferences of any dependencies |
To reduce confusion about whether one can actually call Foo.set_distributed_package!()
like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah good point, fixed in 03bb8db.
3c28c5e
to
03bb8db
Compare
According to a stacktrace from a hung CI job this task was causing the process to hang before exiting: ```julia InterruptException() _jl_mutex_unlock at C:/workdir/src\threading.c:1012 jl_mutex_unlock at C:/workdir/src\julia_locks.h:80 [inlined] ijl_task_get_next at C:/workdir/src\scheduler.c:458 poptask at .\task.jl:1163 wait at .\task.jl:1172 task_done_hook at .\task.jl:839 jfptr_task_done_hook_98752.1 at C:\hostedtoolcache\windows\julia\nightly\x64\lib\julia\sys.dll (unknown line) jl_apply at C:/workdir/src\julia.h:2233 [inlined] jl_finish_task at C:/workdir/src\task.c:338 start_task at C:/workdir/src\task.c:1274 From worker 82: fatal: error thrown and no exception handler available.Unhandled Task ERROR: InterruptException: Stacktrace: [1] poptask(W::Base.IntrusiveLinkedListSynchronized{Task}) @ Base .\task.jl:1163 [2] wait() @ Base .\task.jl:1172 [3] wait(c::Base.GenericCondition{ReentrantLock}; first::Bool) @ Base .\condition.jl:141 [4] wait @ .\condition.jl:136 [inlined] [5] put_buffered(c::Channel{Any}, v::Int64) @ Base .\channels.jl:420 [6] put!(c::Channel{Any}, v::Int64) @ Base .\channels.jl:398 [7] put!(rv::DistributedNext.RemoteValue, args::Int64) @ DistributedNext D:\a\DistributedNext.jl\DistributedNext.jl\src\remotecall.jl:703 [8] (::DistributedNext.var"#create_worker##11#create_worker##12"{DistributedNext.RemoteValue, Float64})() @ DistributedNext D:\a\DistributedNext.jl\DistributedNext.jl\src\cluster.jl:721 ``` Replaced it with a call to `timedwait()`, which has the advantage of being a lot simpler than an extra task.
According to a stacktrace from a hung CI job this task was causing the process to hang before exiting: ```julia InterruptException() _jl_mutex_unlock at C:/workdir/src\threading.c:1012 jl_mutex_unlock at C:/workdir/src\julia_locks.h:80 [inlined] ijl_task_get_next at C:/workdir/src\scheduler.c:458 poptask at .\task.jl:1163 wait at .\task.jl:1172 task_done_hook at .\task.jl:839 jfptr_task_done_hook_98752.1 at C:\hostedtoolcache\windows\julia\nightly\x64\lib\julia\sys.dll (unknown line) jl_apply at C:/workdir/src\julia.h:2233 [inlined] jl_finish_task at C:/workdir/src\task.c:338 start_task at C:/workdir/src\task.c:1274 From worker 82: fatal: error thrown and no exception handler available.Unhandled Task ERROR: InterruptException: Stacktrace: [1] poptask(W::Base.IntrusiveLinkedListSynchronized{Task}) @ Base .\task.jl:1163 [2] wait() @ Base .\task.jl:1172 [3] wait(c::Base.GenericCondition{ReentrantLock}; first::Bool) @ Base .\condition.jl:141 [4] wait @ .\condition.jl:136 [inlined] [5] put_buffered(c::Channel{Any}, v::Int64) @ Base .\channels.jl:420 [6] put!(c::Channel{Any}, v::Int64) @ Base .\channels.jl:398 [7] put!(rv::DistributedNext.RemoteValue, args::Int64) @ DistributedNext D:\a\DistributedNext.jl\DistributedNext.jl\src\remotecall.jl:703 [8] (::DistributedNext.var"#create_worker##11#create_worker##12"{DistributedNext.RemoteValue, Float64})() @ DistributedNext D:\a\DistributedNext.jl\DistributedNext.jl\src\cluster.jl:721 ``` Replaced it with a call to `timedwait()`, which has the advantage of being a lot simpler than an extra task.
This adds some docs about how packages can support both Distributed and DistributedNext. After some playing around, I think I lean towards having a global preference rather than a per-project preference so that users don't have to worry about setting preferences for their dependencies (motivating example: Dagger and MemPool). I think it's ok to standardize on a single top-level name (
distributed-library
) but another alternative would be tying it to a package like MPI.jl does with MPIPreferences.jl.I went for a more flexible
name
preference rather than a booleanuse-distributednext
because I figure that this could be used for other distributed packages like Malt.jl (CC @fonsp). e.g. if a package wants to support Distributed, DistributedNext, and Malt, they could do something like: