Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CI/wheel builder for the free-threaded build #190

Merged
merged 6 commits into from
Dec 9, 2024

Conversation

lysnikolaou
Copy link
Contributor

No description provided.

Copy link

codecov bot commented Nov 27, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.46%. Comparing base (24d3b2e) to head (1123323).
Report is 7 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #190      +/-   ##
==========================================
+ Coverage   79.81%   81.46%   +1.64%     
==========================================
  Files          24       24              
  Lines        1635     1737     +102     
  Branches      241      239       -2     
==========================================
+ Hits         1305     1415     +110     
+ Misses        184      178       -6     
+ Partials      146      144       -2     

@lysnikolaou
Copy link
Contributor Author

lysnikolaou commented Dec 2, 2024

The Windows failure is related to pypa/setuptools#4662 and should be fixed in the next setuptools release.

@MatthieuDartiailh
Copy link
Member

Thanks. Do you plan to add the global lock as part of this PR ?

@lysnikolaou
Copy link
Contributor Author

lysnikolaou commented Dec 2, 2024

Whatever you prefer @MatthieuDartiailh. I was thinking of adding the global lock as part of another PR so that one is not blocked by the other. Does that work?

Although, if we decide to open a new PR for the global lock, this one should probably be merged without Windows support so there's some kind of CI coverage.

@MatthieuDartiailh
Copy link
Member

I think I would prefer to have everything in that PR because I won't publish wheels without it I think.

@lysnikolaou
Copy link
Contributor Author

I think I would prefer to have everything in that PR because I won't publish wheels without it I think.

Alright, I'll push everything here then.

@lysnikolaou
Copy link
Contributor Author

@MatthieuDartiailh I just completed doing the implementation of the global lock, where locking happens every time we switch from the Python bindings into C++ land. This does not solve all possible threading issues, cause there's still a few of them in the Python binding implementation (eg switching pointers around in shared PyObjects).

I actually also did a somewhat deeper dive into the code and I think that the following would work nicely:

  • Instead of using SharedData and SharedDataPtr, use std::shared_ptr under the free-threaded build. That'll solve any reference counting issues in the C++ implementation.
  • Acquire PyObject-specific locks using Py_BEGIN_CRITICAL_SECTION and friends. Since there's no global state in the C++ implementation (as far as I can see, maybe I'm missing something), and every PyObject holds their own kiwi instances, this would solve any reentrancy issues. The same approach should be used for any threading issues in the Python part of the code.

What do you think?

@MatthieuDartiailh
Copy link
Member

@MatthieuDartiailh I just completed doing the implementation of the global lock, where locking happens every time we switch from the Python bindings into C++ land.

Thanks @lysnikolaou !

This does not solve all possible threading issues, cause there's still a few of them in the Python binding implementation (eg switching pointers around in shared PyObjects).

I am not sure what you refer to here. Could you point me to an example ? Also do you have an idea how this could be addressed ?

I actually also did a somewhat deeper dive into the code and I think that the following would work nicely:

* Instead of using `SharedData` and `SharedDataPtr`, use `std::shared_ptr` under the free-threaded build. That'll solve any reference counting issues in the C++ implementation.

We do not use an atomic reference counted pointer for performance (the C++ implementation is document as not being thread safe because of it). I had hoped we could enforce enough control on the binding to preserve this property of the C++ code. I think having a global lock on creation and destruction of reference counted C++ resource should be sufficient to guarantee the safety of the code (I am conscious doing it right may not be so easy, but do you agree on the principle ?)
I will also try to run the benchmark when using std_shared_ptr to see what would the penalty be.

* Acquire `PyObject`-specific locks using `Py_BEGIN_CRITICAL_SECTION` and friends. Since there's no global state in the C++ implementation (as far as I can see, maybe I'm missing something), and every `PyObject` holds their own kiwi instances, this would solve any reentrancy issues. The same approach should be used for any threading issues in the Python part of the code.

The C++ code does not have any global state as far as I remember (and I will try to check again).

What do you think?

If the current solution is safe, I will likely merge as is to unblock the situation.

In a second time I will try to look at benchmarks using std::shared_ptr to see how much performance penalty we would hit because of it, to see if it worth moving away from a global library lock.

@lysnikolaou
Copy link
Contributor Author

lysnikolaou commented Dec 6, 2024

I am not sure what you refer to here. Could you point me to an example ? Also do you have an idea how this could be addressed ?

I'm referring to the fact that some objects are not immutable. For example, users can change variable contexts in place with setContext, which does this (after a change of mine to use Py_XSETREF):

PyObject*
Variable_setContext( Variable* self, PyObject* value )
{
	if( value != self->context )
	{
		Py_XSETREF(self->context, cppy::incref( value ));
	}
	Py_RETURN_NONE;
}

This is not thread-safe if Variable objects are shared across threads. In order for it to become safe, a lock needs to be held every time context is read/written. I'm not sure how many more instances of things like this exist, but I'll check.

I think having a global lock on creation and destruction of reference counted C++ resource should be sufficient to guarantee the safety of the code (I am conscious doing it right may not be so easy, but do you agree on the principle ?)

I agree with that in principle, yes, but I'm still not clear on whether more localized locks + std::shared_ptr will be a performance boost over one global lock. My assumption is that it will be, so we need to check it out.

If the current solution is safe, I will likely merge as is to unblock the situation.

I'll check whether there's any more instances of thread-unsafe stuff happening on the Python side and let you know today, so that we can merge this. Thanks a lot for the help!

@lysnikolaou
Copy link
Contributor Author

@MatthieuDartiailh I could find any more thread unsafe stuff in the Python bindings (other than the one already mentioned above). I just pushed two commits:

  • Lock around getting and setting variable context so that there's no thread safety issues when sharing variable objects
  • Disable Windows for now (I'll follow up on this once setuptools works correctly)

With these, I think we're good to go!

@lysnikolaou lysnikolaou force-pushed the support-free-threading branch from 64a147d to 01763db Compare December 6, 2024 13:26
@MatthieuDartiailh
Copy link
Member

Looks like you have some include issues.

@lysnikolaou lysnikolaou force-pushed the support-free-threading branch from a1b1005 to 1123323 Compare December 7, 2024 00:35
@MatthieuDartiailh
Copy link
Member

I will merge as is to be able to test the release workflow but I will wait for the setuptools fix to cut a new release.

@MatthieuDartiailh MatthieuDartiailh merged commit 0048efc into nucleic:main Dec 9, 2024
23 checks passed
@MatthieuDartiailh
Copy link
Member

@lysnikolaou I ran the build wheels pipeline and it failed https://github.com/nucleic/kiwi/actions/runs/12267343774/job/34227260381
I will try to look into it but I am interested in your opinion on the matter.

@greglucas
Copy link

It looks like it might need to be cpython-freethreading to the enable env var (notice the c prefix).
https://github.com/pypa/cibuildwheel/blob/9c75ea15c2f31a77e6043b80b1b7081372319d85/docs/options.md?plain=1#L617

pypa/cibuildwheel#2045

@lysnikolaou
Copy link
Contributor Author

It looks like it might need to be cpython-freethreading to the enable env var (notice the c prefix). pypa/cibuildwheel@9c75ea1/docs/options.md?plain=1#L617

pypa/cibuildwheel#2045

Exactly! Just had a look at that as well and opened #191 for it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants