Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support multiple interpreter instance? #302

Open
Gnimuc opened this issue Jun 2, 2024 · 15 comments
Open

Support multiple interpreter instance? #302

Gnimuc opened this issue Jun 2, 2024 · 15 comments
Labels

Comments

@Gnimuc
Copy link
Contributor

Gnimuc commented Jun 2, 2024

static std::unique_ptr<compat::Interpreter> sInterpreter;
// Valgrind complains about __cxa_pure_virtual called when deleting
// llvm::SectionMemoryManager::~SectionMemoryManager as part of the dtor chain
// of the Interpreter.
// This might fix the issue https://reviews.llvm.org/D107087
// FIXME: For now we just leak the Interpreter.
struct InterpDeleter {
~InterpDeleter() { sInterpreter.release(); }
} Deleter;
static compat::Interpreter& getInterp() {
assert(sInterpreter.get() && "Must be set before calling this!");
return *sInterpreter.get();
}
static clang::Sema& getSema() { return getInterp().getCI()->getSema(); }
static clang::ASTContext& getASTContext() { return getSema().getASTContext(); }

The interpreter is global and unique at the moment. However, clients do not always agree on the same compiler configuration. I think we need to add more APIs that take compat::Interpreter as argument. For example,

const char* GetResourceDirFromInterpreter(const compat::Interpreter* interp) {
    return interp->getCI()->getHeaderSearchOpts().ResourceDir.c_str();
}
@vgvassilev
Copy link
Contributor

vgvassilev commented Jun 2, 2024

We have thought about that and initially this was part of the design. However, this makes the interfaces quite ugly and we moved away from that approach.

Perhaps, we could have some list of interpreters and we could change the "active" interpreter. Would that work for your use case?

@Gnimuc
Copy link
Contributor Author

Gnimuc commented Jun 2, 2024

That could be a solution as well. At least, we need a setInterp() API to allow clients to switch the current interpreter. But it's tricky to do it right in a multi-threading context...

@vgvassilev
Copy link
Contributor

I agree... Do you have an implementation in mind?

@vgvassilev
Copy link
Contributor

I suspect we want to replace the static interpreter instance with an user-provided callback or something like that where the user is responsible for maintaining the top of the interpreter stack?

@Gnimuc
Copy link
Contributor Author

Gnimuc commented Jun 3, 2024

I agree... Do you have an implementation in mind?

No. I'm still experimenting with the library. I'm wondering how cppyy solves this problem.

Another way is to expose a C wrapper over the internal CppInterOp Interpreter, so users can manage those instances on their own. We only need to maintain a stable C API. The CppInterOpInterpreter.h can still be internal.

The following C++ APIs are just helper functions or wrappers over CppInterOpInterpreter.h's public methods. They are simple and short, so the C API version(with the ugly interpreter argument) can be easy to reimplement.

void AddSearchPath(const char *dir, bool isUser, bool prepend);
const char* GetResourceDir();
void AddIncludePath(const char *dir);
int Declare(const char* code, bool silent);
int Process(const char *code);
intptr_t Evaluate(const char *code, bool *HadError/*=nullptr*/);
std::string LookupLibrary(const char* lib_name);
bool LoadLibrary(const char* lib_stem, bool lookup);
void UnloadLibrary(const char* lib_stem);
std::string SearchLibrariesForSymbol(const char* mangled_name, bool search_system /*true*/);
std::string ObjToString(const char *type, void *obj);
void Destruct(TCppObject_t This, TCppScope_t scope, bool withFree /*=true*/);
TCppFuncAddr_t GetFunctionAddress(const char* mangled_name);
CPPINTEROP_API JitCall MakeFunctionCallable(TCppConstFunction_t func);

However, these two APIs are not trivial to reimplement.

intptr_t GetVariableOffset(TCppScope_t var); 
bool InsertOrReplaceJitSymbol(const char* linker_mangled_name, uint64_t address);

I think we could do what MakeFunctionCallable did here:

static JitCall::DestructorCall make_dtor_wrapper(compat::Interpreter& interp,

In this way, the C++ interfaces can be kept concise and clean. The new C interfaces are born to be ugly. :)

@vgvassilev
Copy link
Contributor

I agree... Do you have an implementation in mind?

No. I'm still experimenting with the library. I'm wondering how cppyy solves this problem.

Cppyy is not know to have good interfaces to communicate with the underlying infrastructure. It wraps them into multiple layers via the CPyCppyy and the cling-backend. Let's ping @wlav for a better take on this.

Another way is to expose a C wrapper over the internal CppInterOp Interpreter, so users can manage those instances on their own. We only need to maintain a stable C API. The CppInterOpInterpreter.h can still be internal.

Yes, right now CppInterOp is not in a good place in that respect. The intent has been that these interfaces to be C compatible and to not change a lot. However, it is really challenging to provide C-compatible interfaces when we deal with collections (std::vectors) or provide storage.

The following C++ APIs are just helper functions or wrappers over CppInterOpInterpreter.h's public methods. They are simple and short, so the C API version(with the ugly interpreter argument) can be easy to reimplement.

void AddSearchPath(const char *dir, bool isUser, bool prepend);
const char* GetResourceDir();
void AddIncludePath(const char *dir);
int Declare(const char* code, bool silent);
int Process(const char *code);
intptr_t Evaluate(const char *code, bool *HadError/*=nullptr*/);
std::string LookupLibrary(const char* lib_name);
bool LoadLibrary(const char* lib_stem, bool lookup);
void UnloadLibrary(const char* lib_stem);
std::string SearchLibrariesForSymbol(const char* mangled_name, bool search_system /*true*/);
std::string ObjToString(const char *type, void *obj);
void Destruct(TCppObject_t This, TCppScope_t scope, bool withFree /*=true*/);
TCppFuncAddr_t GetFunctionAddress(const char* mangled_name);
CPPINTEROP_API JitCall MakeFunctionCallable(TCppConstFunction_t func);

I think it is really better to provide a C interface. That's why we will be safe.

However, these two APIs are not trivial to reimplement.

intptr_t GetVariableOffset(TCppScope_t var); 
bool InsertOrReplaceJitSymbol(const char* linker_mangled_name, uint64_t address);

I think we could do what MakeFunctionCallable did here:

static JitCall::DestructorCall make_dtor_wrapper(compat::Interpreter& interp,

In this way, the C++ interfaces can be kept concise and clean. The new C interfaces are born to be ugly. :)

I agree. Once you are beyond the experimental part of this we can discuss about what are the needs. One thing I'd like to point out is that in an ideal world a lot of this code will land in llvm-project/clang/lib/Interpreter and perhaps there is where we could use C style interfaces, too...

@wlav
Copy link

wlav commented Jun 3, 2024

Preface with a note that the multi-interpreter case is a common one, especially if they could be spun up quickly, or "forked" from the state of a parent interpreter. The latter would be awesome for writing C++ unit tests.

Cppyy is not know to have good interfaces to communicate with the underlying infrastructure.

Well, it worked for 3 applications so far. :) CPyCppyy, PyPy/_cppyy, and the D language bindings. There have been a few others that I'm aware of, e.g. an NLP application, but AFAIK, these were one-offs grad student projects.

Just that I have misgivings about it b/c it's been grown organically and even as that hasn't limited functionality, there are several inefficiencies that I don't think are necessary.

It wraps them into multiple layers via the CPyCppyy and the cling-backend. Let's ping @wlav for a better take on this.

No, CPyCppyy isn't part of it. The layerings are clingwrapper and ROOT/meta. For the former, the above message applies, the latter, well ... :}

    We only need to maintain a stable C API.
    [..]
    it is really challenging to provide C-compatible interfaces when we deal
    with collections (std::vectors) or provide storage.

Right, that's how clingwrapper started as well: to bootstrap C++ bindings into PyPy, we had to start with C bindings and that's where several of the inefficiencies stem from. Eg., as you say, it's hard to share a container with C (esp. since the C binding in PyPy is ABI-based), so that's in the current implementation, there are several places where it loops with an
external index over an internal data structure.

    In this way, the C++ interfaces can be kept concise and clean. The new C interfaces are born to be ugly. :)

I did the same later: re-implemented the C interfaces on top of the C++ ones. However, it's still mostly 1-to-1, the C wrappers taking care of some amount of memory handling.

Copy link
Contributor

github-actions bot commented Dec 4, 2024

This issue is stale because it has been open for 90 days with no activity.

@github-actions github-actions bot added the stale label Dec 4, 2024
Copy link
Contributor

This issue was closed because it has been inactive for 14 days since being marked as stale.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Dec 25, 2024
@aaronj0 aaronj0 reopened this Dec 25, 2024
@github-actions github-actions bot removed the stale label Dec 26, 2024
@wlav
Copy link

wlav commented Dec 30, 2024

You know, being able to have multiple interpreters and in particular being able to start/run/discard would be a killer feature and if the cleanup/ClangREPL is what enables it by removing all the ROOT/meta crud, it'd be a major reason to move. Maybe reconsider that "not planned"?

@vgvassilev
Copy link
Contributor

That's the annoying bot that comes after us if something is stale...

@mcbarton
Copy link
Collaborator

@vgvassilev if you want the bot to ignore this issue then we can add a to-fix label

@wlav
Copy link

wlav commented Jan 8, 2025

The principle is sound:

>>> import cppyy, libcppyy
>>> libcppyy._interpreter_enter()
>>> cppyy.cppdef("int AaAa = 42;")
True
>>> cppyy.cppdef("int BbBb = 17;")
True
>>> print(cppyy.gbl.AaAa)
42
>>> libcppyy._interpreter_exit()
>>> print(cppyy.gbl.BbBb)
Traceback (most recent call last):
  File "<python-input-6>", line 1, in <module>
    print(cppyy.gbl.BbBb)
          ^^^^^^^^^^^^^^
AttributeError: <namespace cppyy.gbl at 0x3a992f70> has no attribute 'BbBb'. Full details:
  type object '' has no attribute 'BbBb'
  'BbBb' is not a known C++ class
  'BbBb' is not a known C++ template
  'BbBb' is not a known C++ enum
>>> libcppyy._interpreter_enter()
>>> print(cppyy.gbl.BbBb)
17
>>> 

So that in theory at least should allow me to have different C++ project be bound within their own interpreter, each with their own PCH, no?

@vgvassilev
Copy link
Contributor

Sure. We can create a stack of interpreters and you can work with by changing the active one. Do you have a minimal motivating example that we can start from?

@wlav
Copy link

wlav commented Jan 8, 2025

Making a stack may not work; at least as-is, the callbacks would be mighty confused. I was planning to compartmentalize completely all JITing tasks in clingwrapper, by tracking a handle on the Python side (you can't bootstrap this; above, I'm working through the C API).

A good example to start would be to have 2 independent Python modules, each with their own interpreter, for Eigen/Dense and Eigen/Sparse. This should make the performance of bindings creation of each individual module faster. If something like that works, NWChemEx may be interested in moving back to cppyy. (They dropped it, for now, b/c they always loaded all their modules right off the bat, and that took 7s just in parsing time, making it virtually impossible to run unit tests.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants