Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core] Fix gcs server using shared pointer #48888

Merged

Conversation

dentiny
Copy link
Contributor

@dentiny dentiny commented Nov 23, 2024

Current implementation for shared pointer usage is completely a mess;
Shared pointer should ONLY be used where lifecycle is impossible to judge, for example, async operations or multi-threaded cases; abusing would lead to memory leak (i.e. circular dependency), and makes code hard to justify.

@dentiny dentiny requested review from jjyao and rynewang November 23, 2024 01:12
@dentiny dentiny requested a review from a team as a code owner November 23, 2024 01:12
@dentiny dentiny requested a review from dayshah November 23, 2024 01:12
/// The gcs resource manager.
std::shared_ptr<GcsResourceManager> gcs_resource_manager_;
std::unique_ptr<GcsResourceManager> gcs_resource_manager_;
/// The cluster resource scheduler.
std::shared_ptr<ClusterResourceScheduler> cluster_resource_scheduler_;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There're still a few shared pointer usage, which I find hard to adapt, leave it to next cleanup.

@dentiny dentiny added the go add ONLY when ready to merge, run all tests label Nov 23, 2024
Copy link
Contributor

@rynewang rynewang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with nits

@@ -248,7 +248,7 @@ class GcsNodeManager : public rpc::NodeInfoHandler {
/// Storage for GCS tables.
std::shared_ptr<gcs::GcsTableStorage> gcs_table_storage_;
/// Raylet client pool.
std::shared_ptr<rpc::NodeManagerClientPool> raylet_client_pool_;
rpc::NodeManagerClientPool *raylet_client_pool_ = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is never nullptr, so use&, not * ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

afaik most style guides suggest pointers over references because of flexibility, moving, copying, it doesn't apply to this, but not bad to be consistent, but can we do absl::NotNull, I don't remember if we need cpp upgrade or absl upgrade for it?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for using reference if it's guaranteed to be not null.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO our use case here does not require flexibility or so: we won't reassign, copy or move them. I'd say we like the fact that it can't (easiliy) be moved by using reference. The only caution is that we need to make sure the references must outlive the objects holding those refs, which is the case since all these classes share a lifetime with a GcsServer (up until a reset call).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we have use cases passing nullptr:

gcs_node_manager =
std::make_unique<GcsNodeManager>(nullptr, nullptr, nullptr, ClusterID::Nil());

@@ -181,15 +183,15 @@ rpc::PlacementGroupStats *GcsPlacementGroup::GetMutableStats() {

GcsPlacementGroupManager::GcsPlacementGroupManager(
instrumented_io_context &io_context,
std::shared_ptr<GcsPlacementGroupSchedulerInterface> scheduler,
GcsPlacementGroupSchedulerInterface *scheduler,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is never nullptr, so use&, not * ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, we have places passing in nullptr.

@dentiny dentiny requested a review from rynewang November 25, 2024 08:34
@dentiny
Copy link
Contributor Author

dentiny commented Nov 25, 2024

@rynewang / @dayshah / @jjyao My take / preference on reference vs pointer:

  • Prefer reference over pointer for data member, since it's easy to reason nullability, with no difference on lifecycle semantics
  • Prefer pointer (maybe absl::NotNull<>) over reference for mutable function arguments; const T& -> immutable, T* -> mutable provides a unified and clear semantics.

@dayshah On coding practice,

  • In 2020, google coding style discourages using T& for mutable reference, with the reason I illustrate above;
  • The item has been removed starting from 2022, IIRC

Signed-off-by: hjiang <[email protected]>
@jjyao
Copy link
Collaborator

jjyao commented Nov 25, 2024

There are test failures.

@jjyao
Copy link
Collaborator

jjyao commented Nov 25, 2024

Let's merge this first and we can have the reference vs pointer discussion separately.

@dentiny
Copy link
Contributor Author

dentiny commented Nov 25, 2024

@jjyao I adjust the member order based on the dependency, seems to fix the previously failed unit test. Could you please help me merge the PR? Thank you!

@dentiny dentiny requested a review from aslonnie November 26, 2024 00:53
@aslonnie aslonnie merged commit d0ae2d3 into ray-project:master Nov 26, 2024
5 checks passed
jecsand838 pushed a commit to jecsand838/ray that referenced this pull request Dec 4, 2024
Current implementation for shared pointer usage is completely a mess;
Shared pointer should **ONLY** be used where lifecycle is impossible to
judge, for example, async operations or multi-threaded cases; abusing
would lead to memory leak (i.e. circular dependency), and makes code
hard to justify.

---------

Signed-off-by: hjiang <[email protected]>
Signed-off-by: Connor Sanders <[email protected]>
dentiny added a commit to dentiny/ray that referenced this pull request Dec 7, 2024
Current implementation for shared pointer usage is completely a mess;
Shared pointer should **ONLY** be used where lifecycle is impossible to
judge, for example, async operations or multi-threaded cases; abusing
would lead to memory leak (i.e. circular dependency), and makes code
hard to justify.

---------

Signed-off-by: hjiang <[email protected]>
Signed-off-by: hjiang <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
go add ONLY when ready to merge, run all tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants