Add 4 second timeout for each join attempt in region list. #605
How is this library tested before being released?
lgtm!
// succeeded.
callCtx, cancelCallCtx := context.WithTimeout(ctx, 4*time.Second)
joinRes, err = r.engine.JoinContext(callCtx, bestURL, token, params)
cancelCallCtx()
I'm guessing that with the exponential backoff in room.go, it could still time out. I think it retries several times. Should we also limit that to about 15 seconds total?
I think that's a separate issue, and it should be fixed by threading a context.Context through the upstream parts of the call stack. I didn't want to combine the two fixes, though; I'm just trying to target the outage action item here.
I usually test it with
That sounds reasonable because I can blackhole clusters by modifying Are there instructions or examples anywhere I should follow for doing what you do with
The instructions in the repo are good: https://github.com/livekit/livekit-cli. The application's help menu is useful too.
Once I have
ok, manually tested by blackholing in
Oh shoot, forgot about that. My bad. Good catch
	return e.JoinContext(context.TODO(), url, token, params)
}

func (e *RTCEngine) JoinContext(ctx context.Context, url string, token string, params *SignalClientConnectParams) (*livekit.JoinResponse, error) {
👍
@@ -147,7 +158,7 @@ func (c *SignalClient) connect(urlPrefix string, token string, params SignalClie
 }

 header := newHeaderWithToken(token)
-conn, hresp, err := websocket.DefaultDialer.Dial(u.String(), header)
+conn, hresp, err := websocket.DefaultDialer.DialContext(ctx, u.String(), header)
 if err != nil {
If the err is due to the context being cancelled, then we should skip checking /validate.
ah just read the above comment stream. feel free to ignore :)
No description provided.