Add timeout fundamentals for daemon client communication #886

cottsay · 2024-03-05T22:22:28Z

This PR is more of a proof-of-concept than a concrete proposal.

If there is a broken ROS 2 daemon process or another completely unrelated TCP server listening on the corresponding XMLRPC port, it's possible for calls like is_daemon_running to hang for a VERY long time.

For example:

1. In one shell, start a simple TCP server on port 11511: nc -k -l 11511
2. In another shell, run ros2 daemon status

You can see the XMLRPC request arrive on the server, but because no response is ever sent, the call to ros2 daemon status will just sit there. I'm not sure how long it takes for the "global default" timeout to kick in; I haven't waited long enough to see it.
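The same situation can be reproduced without nc, which may be handy in tests. The following is a minimal sketch, not code from this PR: start_silent_server and probe are hypothetical helper names, and the request bytes are a placeholder rather than a real XMLRPC payload. It shows that a client with an explicit socket timeout gives up quickly where an unbounded one would keep waiting.

```python
import socket
import threading


def start_silent_server(host='127.0.0.1', port=0):
    """Accept TCP connections and never reply (hypothetical stand-in for nc)."""
    # port=0 lets the OS pick a free port, so tests do not collide on 11511.
    listener = socket.create_server((host, port))
    accepted = []  # Hold references so accepted connections stay open.

    def accept_loop():
        while True:
            try:
                conn, _ = listener.accept()
            except OSError:
                return  # Listener was closed.
            accepted.append(conn)

    threading.Thread(target=accept_loop, daemon=True).start()
    return listener, listener.getsockname()[1]


def probe(port, timeout, host='127.0.0.1'):
    """Send a request and wait at most `timeout` seconds for any reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # Placeholder request line; the server never answers it anyway.
        sock.sendall(b'POST / HTTP/1.1\r\n\r\n')
        try:
            sock.recv(1)
        except socket.timeout:
            return 'timed out'
    return 'responded or closed'
```

With a silent server running, probe(port, 0.2) returns 'timed out' after roughly 0.2 seconds instead of blocking indefinitely.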

This can be a particularly bad problem in this package's tests, many of which connect to and sometimes create and destroy daemon processes. It would be nice if those tests didn't hang.

Possible mitigation for problems like #610 and #737

Signed-off-by: Scott K Logan <[email protected]>

fujitatomoya

having HTTPConnection.timeout set looks reasonable to me.
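For readers unfamiliar with the mechanism, here is a rough sketch of how a timeout can be applied to an XMLRPC client through HTTPConnection.timeout. This is an illustration built on the standard library only, not the code in this PR: TimeoutTransport and list_methods_or_none are hypothetical names, and the actual change in ros2cli may be structured differently.

```python
import socket
import xmlrpc.client


class TimeoutTransport(xmlrpc.client.Transport):
    """XMLRPC transport that applies a socket timeout (hypothetical name)."""

    def __init__(self, timeout=None, **kwargs):
        super().__init__(**kwargs)
        self._timeout = timeout

    def make_connection(self, host):
        connection = super().make_connection(host)
        # HTTPConnection passes its `timeout` attribute to the socket when it
        # connects; None here means "leave the global default in place".
        if self._timeout is not None:
            connection.timeout = self._timeout
        return connection


def list_methods_or_none(url, timeout):
    """Return the server's method list, or None if it does not answer in time."""
    transport = TimeoutTransport(timeout=timeout)
    with xmlrpc.client.ServerProxy(url, transport=transport) as proxy:
        try:
            return proxy.system.listMethods()
        except socket.timeout:
            return None
```

Against a server that accepts the connection but never responds, list_methods_or_none returns None once the timeout elapses rather than hanging.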

fujitatomoya · 2024-03-06T06:40:24Z

ros2cli/ros2cli/node/daemon.py

+      for the daemon node to respond. If it is not given,
+      the global default timeout setting is used.


Using the global default timeout would never time out in my environment... Since this only accesses a server on localhost to list its methods, it is not expected to take more than 10 seconds. Maybe we can set a specific default timeout here?

cottsay · 2024-03-06

I can't think of any reasonable circumstances where it would take longer than that either. I'm in favor of a default timeout as well, but I think we need more feedback on what value might be appropriate.

I also think we need to improve the error message if it becomes the default behavior. Right now, the exception is unhandled.
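One possible shape for a friendlier error, sketched under assumptions: call_daemon is a hypothetical wrapper name and the message wording is only an example. It converts the raw socket timeout into an exception with a message that a command-line tool could print directly.

```python
import socket


def call_daemon(call, timeout):
    """Run `call()` and translate a socket timeout into a readable error.

    `call` is any zero-argument callable that talks to the daemon; both the
    wrapper and the message text are illustrative, not part of ros2cli.
    """
    try:
        return call()
    except socket.timeout:
        raise RuntimeError(
            f'Timed out after {timeout} seconds waiting for the daemon '
            'to respond') from None
```

A caller could catch RuntimeError at the top level and print the message instead of a traceback.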

Add timeout fundamentals for daemon client communication

1a0f95e

Signed-off-by: Scott K Logan <[email protected]>

cottsay force-pushed the cottsay/daemon-client-timeout branch from 22bd616 to 1a0f95e Compare March 5, 2024 22:22

fujitatomoya reviewed Mar 6, 2024
