Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mavproxy.py: Clean up log_writer exit #1496

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

Ryanf55
Copy link
Contributor

@Ryanf55 Ryanf55 commented Dec 16, 2024

Purpose

Closes #1414

Details

  • The log thread was marked as daemon=True
  • daemon threads are not recommended anymore by the python maintainers
  • The threading library recommends using Event for thread synchronization
  • The Queue.get() call blocks by default, which may was the reason this was put as a daemon thread
  • By making Queue.get() non-blocking, and checking size before access, because it raises an exception if there is nothing to read, the log writing cleanup can continue as normal. Without this, Queue..get can block forever. When daemon=False, this means the thread stays hung and mavproxy is never closed unless you sigterm it with a 2nd ctrl+C (nasty)
  • There are no longer exceptions when using ctrl+C to stop MAVProxy
  • I use an Event as a signaling mechanism to stop the open threads

Tests Performed

Linux:

  • Start mavproxy by itself (no vehicle), and kill it
  • Start mavproxy with a vehicle, ensure it exits cleanly
  • Start mavproxy and observe logs are not corrupt
  • Test noninteractive mode
  • Test exit command : MANUAL> Unknown command 'exit' => exit doesn't work anymore, even on master
  • Test lack of disk space exits mavproxy
  • Test permissions error opening logfile has mavproxy exit cleanly -> Mavproxy shows "AP: PreArm: Logging failed" in console, and keeps running. I assume that's ok

MacOS:

  • Queue imports work in this workaround
  • Start mavproxy by itself (no vehicle), and kill it
  • Start mavproxy with a vehicle, ensure it exits cleanly
  • Start mavproxy and observe logs are not corrupt
  • Test noninteractive mode
  • Test exit command
  • Test lack of disk space exits mavproxy
  • Test permissions error opening logfile has mavproxy exit cleanly

Windows

  • Start mavproxy by itself (no vehicle), and kill it
  • Start mavproxy with a vehicle, ensure it exits cleanly
  • Start mavproxy and observe logs are not corrupt
  • Test noninteractive mode
  • Test exit command
  • Test lack of disk space exits mavproxy
  • Test permissions error opening logfile has mavproxy exit cleanly

@Huibean
Copy link
Member

Huibean commented Dec 16, 2024

queue imports is fine on macos, can you make a to do list like linux with mac

@Ryanf55
Copy link
Contributor Author

Ryanf55 commented Dec 16, 2024

queue imports is fine on macos, can you make a to do list like linux with mac

Added!

MAVProxy/mavproxy.py Outdated Show resolved Hide resolved
MAVProxy/mavproxy.py Outdated Show resolved Hide resolved
MAVProxy/mavproxy.py Outdated Show resolved Hide resolved
MAVProxy/mavproxy.py Outdated Show resolved Hide resolved
MAVProxy/mavproxy.py Outdated Show resolved Hide resolved
MAVProxy/mavproxy.py Outdated Show resolved Hide resolved
@Ryanf55 Ryanf55 force-pushed the 1414-log-writer-thread-cleanup branch 3 times, most recently from dc3384d to 049288a Compare December 20, 2024 06:03
* Use threading.Event instead of daemon thread to avoid uncaught
  exception on ctrl+C exit
* Daemon threads are not recommended anymore

Signed-off-by: Ryan Friedman <[email protected]>
@Ryanf55 Ryanf55 force-pushed the 1414-log-writer-thread-cleanup branch from 049288a to c146c8f Compare December 20, 2024 06:07
@Ryanf55 Ryanf55 marked this pull request as ready for review December 20, 2024 06:15
@Ryanf55
Copy link
Contributor Author

Ryanf55 commented Dec 20, 2024

queue imports is fine on macos, can you make a to do list like linux with mac

Can you please test this PR on your mac?

@Ryanf55 Ryanf55 requested a review from peterbarker December 20, 2024 06:26
@peterbarker
Copy link
Contributor

... lack of disk space should definitely not cause MAVProxy to exit... that would be a serious bug if it did

@peterbarker
Copy link
Contributor

This looks good to me. I'll test it this weekend.

@Huibean
Copy link
Member

Huibean commented Dec 20, 2024

queue imports is fine on macos, can you make a to do list like linux with mac

Can you please test this PR on your mac?

Yes, I will give feeback later

@Ryanf55
Copy link
Contributor Author

Ryanf55 commented Dec 20, 2024

... lack of disk space should definitely not cause MAVProxy to exit... that would be a serious bug if it did

It's a feature, not a bug:)

MAVProxy/MAVProxy/mavproxy.py

Lines 1006 to 1013 in b39a31c

# make sure there's enough free disk space for the logfile (>200Mb)
# statvfs doesn't work in Windows
if platform.system() != 'Windows':
stat = os.statvfs(logpath_telem)
if stat.f_bfree*stat.f_bsize < 209715200:
print("ERROR: Not enough free disk space for logfile")
mpstate.status.exit = True
return

Thank you both, much appreciated.

@peterbarker
Copy link
Contributor

... lack of disk space should definitely not cause MAVProxy to exit... that would be a serious bug if it did

It's a feature, not a bug:)

MAVProxy/MAVProxy/mavproxy.py

Lines 1006 to 1013 in b39a31c

# make sure there's enough free disk space for the logfile (>200Mb)
# statvfs doesn't work in Windows
if platform.system() != 'Windows':
stat = os.statvfs(logpath_telem)
if stat.f_bfree*stat.f_bsize < 209715200:
print("ERROR: Not enough free disk space for logfile")
mpstate.status.exit = True
return

Thank you both, much appreciated.

OK, that's on startup. That's OK.

@Huibean
Copy link
Member

Huibean commented Dec 22, 2024

MANUAL> Got command: exit
mpstate.settings.requireexit False
Unknown command 'exit'

exit command still shows unknow in mac

@Ryanf55
Copy link
Contributor Author

Ryanf55 commented Dec 22, 2024

MANUAL> Got command: exit
mpstate.settings.requireexit False
Unknown command 'exit'

exit command still shows unknow in mac

Yea, exit is broke on master too, not a regression causes by this PR.

@peterbarker
Copy link
Contributor

  • check CPU usage on all three platforms
  • fix performance regression on Linux with 100% CPU on one core

@Ryanf55
Copy link
Contributor Author

Ryanf55 commented Dec 23, 2024

The CPU wait is caused by the now hot loop in the log writer exit because it's no longer blocking.

In Queue.get(), we're having to deal with Prior to 3.0 on POSIX systems, and for all versions on Windows, if block is true and timeout is None, this operation goes into an uninterruptible wait on an underlying lock. This means that no exceptions can occur, and in particular a SIGINT will not trigger a KeyboardInterrupt..

The API for the multiproc queue in MAVproxy does not support the same arguments, but if it did, we could set a timeout time, and catch the timeout exception to reduce the loop rate.

I could also add a small delay in the loop, which will be easier.

Signed-off-by: Ryan Friedman <[email protected]>
@Ryanf55
Copy link
Contributor Author

Ryanf55 commented Dec 30, 2024

I fixed the hot loop with the simplest option. a 1ms delay. Here's htop results, filtered for mavproxy on my desktop:
image

And, comparsion to master:
image

If you are happy with the change, I will squash the commit

while not mpstate.status.exit:
mpstate.logfile_raw.write(bytearray(mpstate.logqueue_raw.get()))
while not mpstate.status.stop_event.is_set():
if not mpstate.logqueue_raw.empty():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still appear to be looking at the raw queue later in this function - conflict resolution error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The event is the exit event. This statement should read

while we are not supposed to exit:
   use the queue for logging operations

Perhaps you missed the not when reading the code? It's the same behavior as before. Run logging if we aren't exiting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MAVproxy >= 1.8.66 causes exception in thread log_writer on exit
3 participants