
Mention {mirai} in {future} docs #111

Draft
wants to merge 1 commit into
base: main

Conversation

jcheng5 (Member) commented Nov 25, 2024

Sorry @shikokuchuo, I just realized I had these changes sitting locally, uncommitted. I thought I had published this ages ago.

@shikokuchuo (Member)

Thanks Joe!

My first thoughts on those bullets are:

  1. mirai actually has a general interface for launching remote workers and is already being used on Slurm clusters (see, for instance, this community-contributed example). I know that future is designed to be a general API, but I'm not sure that actually translates into it being usable in more places.
  2. I think the key here is that mirai is queued, so nested topologies automatically work without having to carefully manage them like future - so this is definitely a positive for mirai, but doesn't quite read as such!

I also think it's definitely worth mentioning that:

  • mirai promises are event-driven rather than polling (much more efficient, lower latency)

The following may tie into the point above about nested topologies, namely that mirai is inherently queued:

  • mirai is always non-blocking and doesn't require the use of future_promise()

Hope the above makes sense for you. I'd be happy to suggest wording, but I also don't want to put words in your mouth! Let me know what you prefer or if you have any questions on the above.

jcheng5 (Member, Author) commented Nov 26, 2024

Please feel free to edit. I don't even know what I wrote, this was months ago!

@shikokuchuo (Member)

> Please feel free to edit. I don't even know what I wrote, this was months ago!

Thanks Joe, I've gone ahead and added suggestions. Please feel free to re-word any of them. If there's anything I can make clearer for you do feel free to ask!

While this article and others on this site focus on the `future` package, there's an up and coming package called [`mirai`](https://shikokuchuo.net/mirai/) that you may want to consider instead.
Here are some factors to consider as you choose between the two.

1. The `future` package tries hard to automatically infer what variables and packages you need from the main R package, and makes those available to the child process. `mirai` doesn't try to do this for you; you need to pass in whatever data you need explicitly, and make `library()` calls explicitly inside of your inner code.

Suggested change
- 1. The `future` package tries hard to automatically infer what variables and packages you need from the main R package, and makes those available to the child process. `mirai` doesn't try to do this for you; you need to pass in whatever data you need explicitly, and make `library()` calls explicitly inside of your inner code.
+ 1. The `future` package tries hard to automatically infer what variables and packages you need from the main R package, and makes those available to the child process. `mirai` doesn't try to do this for you; you need to pass in whatever data you need explicitly, and make package-namespaced calls explicitly inside of your inner code.


I know library() is probably clearer as a concept, but it's not good practice for users to be actually doing that inside a mirai() call!

2. `mirai` is very fast; it's much faster than `future` at starting up and has less per-task overhead.

Suggested change
- 2. `mirai` is very fast; it's much faster than `future` at starting up and has less per-task overhead.
+ 2. `mirai` is very fast; it's much faster than `future` at starting up and has less per-task overhead. `mirai` creates event-driven promises, whereas promises using `future` time-poll every 0.1 seconds. This makes `mirai` ideal where response times and latency are critical.


3. `future` is designed to be a general API for managing distributed programming across many types of computing backends. `mirai` is more narrowly scoped: it does support both local and distributed execution, but in the latter case, it only supports `mirai` workers.

Suggested change
- 3. `future` is designed to be a general API for managing distributed programming across many types of computing backends. `mirai` is more narrowly scoped: it does support both local and distributed execution, but in the latter case, it only supports `mirai` workers.
+ 3. `future` is designed to be a general API supporting many types of distributed computing backends, and potentially offers more options. `mirai` on the other hand is its own system, whilst it does support both local and distributed execution.

4. `future` puts a big emphasis on complicated scenarios where futures launch other futures ("evaluation topologies"); mirai doesn't care about this.

Suggested change
- 4. `future` puts a big emphasis on complicated scenarios where futures launch other futures ("evaluation topologies"); mirai doesn't care about this.
+ 4. `mirai` is inherently queued, meaning it readily accepts more tasks than workers. This means you don’t need an equivalent of `future_promise()`. With `future` you need to manage cases where futures launch other futures ("evaluation topologies") upfront, whereas with `mirai` they will just work.

```

<div class="alert alert-success">
While this article and others on this site focus on the `future` package, there's an up and coming package called [`mirai`](https://shikokuchuo.net/mirai/) that you may want to consider instead.

Suggested change
- While this article and others on this site focus on the `future` package, there's an up and coming package called [`mirai`](https://shikokuchuo.net/mirai/) that you may want to consider instead.
+ While this article and others on this site focus on the `future` package, there's a much newer package called [`mirai`](https://shikokuchuo.net/mirai/) that you may want to consider instead.

</div>

Suggested change
- </div>
+ 5. `mirai` supports task cancellation and the ability to interrupt ongoing tasks on the worker.
+ </div>

@shikokuchuo (Member)

@jcheng5 I've added a final point about task cancellation. It would be good to get this merged. Thanks!

2 participants