Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve RCON packet Parsing and Cleanup on connection close to prevent deadlocks #267

Closed
wants to merge 3 commits into from

Conversation

ect0s
Copy link
Contributor

@ect0s ect0s commented Jan 16, 2023

Restructure Packet Decode
Add counter for packets/callbacks
onClose() cleanups
unknown packet should call onClose().

This isn't production ready, but I felt the need to pull request it and solicit feedback.

This is primarily to fix the issues as noted in #200 and #249

This should run as is, I have tested to above 200k RCON command requests, and across server restarts. Before these changes I could cause an error within the first few thousand requests.

The error handling, in particular, could be improved, but I was trying to keep the code broadly similar to the current release.

Longer term, a rewrite might incorporate the new 'count/id' field/callback tracking/retries into a single object, thus bringing some relevant code together

…prevent deadlocks, call onClose on Unknown packets

Restructure Packet Decode
Add counter for packets/callbacks
onClose() cleanups
unknown packet should call decode.
@ect0s ect0s linked an issue Jan 16, 2023 that may be closed by this pull request
@lbzepoqo
Copy link
Contributor

I've been using these changes without restarting my SquadJS and haven't encountered a single unknown packet since then.

Before, we were experiencing an issue where Admin Requests plugin, JetDave's map vote and Whitelister, along with other commands doesn't get read by SquadJS, but the process is still running and broadcasts were running fine. As I was investigating recently, we usually get 5 or more unknown packets within a single day and this is just when we notice that things doesn't work any more than we restart. As I discovered process.abort(), I included it in the code to crash our SquadJS instead of me manually restarting SquadJS and actively monitoring all day. In a span of 7-8 hours of active time, we get around 5+ of these unknown packets per day, and we had one even if the server wasn't full yet at 6/100.

My first encounter with this issue can be found in a Discord message here. Since then, we've been running SquadJS to restart every ONE hour as suggested by Forrest Gump who also had an instance in Southnode where the SquadJS is located on a different server and is using FTP. They haven't experienced this when they went dedicated. I even requested for JetDave's AutomaticSubsystemRestarter as a workaround.

Also to mention, when an unknown packet appears in our log, the DiscordSubsystemRestarter is enough to fix the issue but still we had to do it manually. The problem with this remedy is that it doesn't restore the functionality of plugins using the websocket (for example: Whitelister's !profile command). There's also an issue where using the DiscordSubsystemRestarter or JetDave's AutomaticSubsystemRestarter sometimes causes this issue #262 which lead me to the option to just abort the process once the issue occurs.

I've been running with this for almost 3 days straight already without any unknown packets and restarts so far. I had to restart SquadJS earlier due to a bug fix for the map vote and will continue to test it and run it the whole week.

This is the first time we reached more than 24 hours without restarting and any issues. It's so weird to look at our directory only containing one log file.

@ect0s
Copy link
Contributor Author

ect0s commented Jan 19, 2023

Small update above;

Found that commands run on timers (such as the updatePlayer lists etc) can sometimes run before we have been authorized, added a flag.

I've let this run for 3 days, including during/over several restarts (including a multihour restart from a server migration).

So far, no issues.

Copy link
Collaborator

@werewolfboy13 werewolfboy13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Appears to look good, I don't have a server to test on since my RCON is heavily customized.

@ect0s ect0s added the core bug Bug related to the core SquadJS API label Jan 21, 2023
@werewolfboy13
Copy link
Collaborator

@fantinodavide can you verify that I pushed this correctly?

@steelskillet
Copy link
Contributor

That did not work correctly. Please revert.

@fantinodavide
Copy link
Contributor

@werewolfboy13 lots of changes after the merge and even if it works, all the new changes made by ect0s have been removed

@werewolfboy13
Copy link
Collaborator

Broke the merge, new PR to fix. Closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core bug Bug related to the core SquadJS API
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SquadRcon.onData Crashing Recently
5 participants