Only run backend autodetection once per process #83
Conversation
Force-pushed from 53ae0d9 to 064a29d
Awesome!! Thank you for doing this rearrangement—the speedup sounds great.
I understand the problem with API layering you described in #81, but I think we can solve it a different way without introducing the `_backends` global variable. I'm worried about introducing this implicit, hidden global state—this kind of thing can make the library unpredictable in corner cases, possibly problematic in multithreaded programs, and sometimes inconvenient in testing suites, which need to be careful to reset the state between tests.
Can we start first with a more explicit design? A straightforward way might be to keep the new `backends` parameter to `audio_open` but *not* to automatically save away the result of `available_backends`. It will be a little more work, but then we can also let pyacoustid clients specify their backends. I think it's OK for the beets chroma plugin to directly use audioread to save this backend information.
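To make that explicit design concrete, here is a minimal sketch of what client code might look like, assuming the `backends` keyword argument to `audio_open` and the `available_backends()` function land as described in this PR:

```python
import audioread

# Run the (potentially slow) backend detection exactly once, up front.
backends = audioread.available_backends()

# Reuse the saved list for every subsequent decode; no re-detection happens.
for path in ["one.mp3", "two.mp3"]:
    with audioread.audio_open(path, backends=backends) as f:
        print(path, f.channels, f.samplerate, f.duration)
```

Nothing is cached inside audioread itself here; the caller decides where, and for how long, to keep the detected list.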
audioread/ffdec.py (outdated)
```python
if proc.returncode == 0:
    return True
else:
    return False
```
Just `return proc.returncode == 0` or `return not bool(proc.returncode)` would be a little more concise than this `if`/`else`.
Agreed
audioread/__init__.py (outdated)
```python
result.append(maddec.MadAudioFile)

# FFmpeg.
if _ffmpeg_available():
```
It's probably fine to just call `ffdec.available()` here instead of making a separate `_ffmpeg_available` function. The only reason the import is conditional for the MAD backend, for example, is that the import will fail if pymad isn't installed. The ffmpeg backend can be imported anyway (and must be imported to run the check!).
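As a rough sketch of what that suggestion implies for `available_backends` (reusing the existing backend class names `maddec.MadAudioFile` and `ffdec.FFmpegAudioFile`), only backends whose imports can fail need a try/except, while ffdec is imported unconditionally and queried at runtime:

```python
def available_backends():
    """Return a list of backend classes usable in this environment (sketch)."""
    result = []

    # MAD: the import itself fails when pymad isn't installed, so the
    # import has to be conditional.
    try:
        from audioread import maddec
        result.append(maddec.MadAudioFile)
    except ImportError:
        pass

    # FFmpeg: the module always imports; whether the ffmpeg command exists
    # is a runtime question, so just ask ffdec.available().
    from audioread import ffdec
    if ffdec.available():
        result.append(ffdec.FFmpegAudioFile)

    return result
```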
It's also worth noting (but not worth worrying about) that this is introducing a small overhead in the one-call case: we now have to invoke ffmpeg twice, once as a backend availability check and once to actually do the decoding. In the old world, we just ran the ffmpeg command once and let it fail if the command isn't available.
OK, I'll change it.
Thanks for the review. Yes, it's a good idea to avoid global state. We might need to rationalize the situation in pyacoustid and the beets/chroma plugin though:
Yes, sounds good!
Oh right; I completely forgot about that feature. This makes the situation a little more complicated, for both pyacoustid and beets. It is actually quite convenient for pyacoustid to support the command-line tool, which is sometimes easier to install (e.g., on Windows) than the library. On the other hand, talking directly to the library is the "right way" to do things on systems with a proper package manager. Here's a strategy that might be nice to have, even if we weren't doing this whole "persistent backend detection" thing: empower pyacoustid with flags that control which backend is used. The appropriate function could behave as it does now when nothing specific is supplied (preserving backwards compatibility), or it could:
Having that additional level of control would be nice, and it wouldn't impact the configuration-free use case either.
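Purely to illustrate that kind of control, a dispatch along these lines could work; note that `force_fpcalc`, `force_audioread`, and the two helper functions below are hypothetical names for this sketch, not pyacoustid's actual API:

```python
def fingerprint_file(path, maxlength=120,
                     force_fpcalc=False, force_audioread=False):
    # Hypothetical sketch: the flags and helper functions are illustrative only.
    if force_fpcalc:
        return _fingerprint_with_fpcalc(path, maxlength)
    if force_audioread:
        return _fingerprint_with_audioread(path, maxlength)

    # Default: behave roughly as today, preserving backwards compatibility --
    # use the library when possible, otherwise the command-line tool.
    try:
        return _fingerprint_with_audioread(path, maxlength)
    except Exception:
        return _fingerprint_with_fpcalc(path, maxlength)
```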
We were reading audio data with the Gst.Buffer.extract_dup() method. This allocates new memory using g_malloc() and returns it to the caller. The memory needs to be freed with g_free(); however, the PyGObject bindings do not do this. We can avoid the problem by reading the audio data directly from the underlying Gst.Memory object. In this case the Python interpreter is responsible for copying the data, so it is able to correctly free the memory after it's no longer needed.

I tested this by calling pyacoustid.fingerprint() on 34 .MP3 files in sequence, and I saw the following difference:

- memory usage without the patch: 557052 KB
- memory usage with the patch: 52752 KB

The generated acoustid fingerprints were identical with and without the patch.
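For context, here is a rough sketch of the shape of that change (not necessarily the exact patch, and the function name is illustrative): the buffer's underlying memory is mapped for reading and Python makes its own copy of the bytes, so no g_malloc'd block is left unfreed.

```python
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

def read_buffer_bytes(buf):
    """Copy a Gst.Buffer's samples into Python-owned bytes (sketch only)."""
    mem = buf.get_all_memory()            # the buffer's underlying Gst.Memory
    success, info = mem.map(Gst.MapFlags.READ)
    if not success:
        return b""
    try:
        return bytes(info.data)           # the interpreter owns this copy
    finally:
        mem.unmap(info)
```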
Backend detection can be inefficient. For example, detecting CoreAudio may lead to ctypes invoking a C compiler in order to search for the library. If a process calls audio_open() many times, it will be more efficient if backend detection only happens once. This is now possible: there is a new available_backends() function which you can call once and then save the result. The list of backends can be passed to audio_open() in a new 'backends' keyword argument. Fixes beetbox#81
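Because the commit deliberately leaves caching to the caller, a client that wants detection to happen only once per process can memoize the result itself; a minimal sketch (the wrapper name and module-level cache are illustrative):

```python
import audioread

_BACKENDS = None  # detected lazily, at most once per process

def open_audio(path):
    """Wrapper around audio_open() that reuses one backend-detection result."""
    global _BACKENDS
    if _BACKENDS is None:
        _BACKENDS = audioread.available_backends()
    return audioread.audio_open(path, backends=_BACKENDS)
```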
This avoids a circular dependency between __init__.py and the decoder modules.
Force-pushed from 064a29d to 774af29
Since #84 I've weirdly not seen any performance issues related to backend autodetection. I'm not sure why. However, I think it's still a good idea to add a separate
Looks great; thank you! This will be useful to have in case we find performance problems due to backend detection in the future. On that note, thanks for all your attention to this project over the last few weeks! I've sent you an org invitation in case you'd like to commit more work in the future. PRs are of course always still an option if you want a code review, but you should now have permission to merge your own if that ever seems appropriate.
OK, thanks! I promise not to sneak in any bitcoin mining code ;)
This is a performance improvement. See #81 for details.