Minimal changes to allow manual re-initialization of NodeJS process #131

Tales-Carvalho · 2024-03-27T23:22:09Z

This PR aims to solve issue #130 by creating an interface to terminate and initialize the bridge's underlying NodeJS process and threads responsible for the communication.

To do this, I moved the calls that reference the bridge class instances into the initialization function, refactored the constructors of EventLoop and EventExecutorThread to re-initialize their properties, changed the connection logic so it does not stop upon exception or process termination, and implemented a terminate() function in __init__.py.

With this change, the following behaviour is possible:

import javascript

javascript.init()
javascript.eval_js('console.log("Hello from 1st NodeJS process!")')
javascript.terminate()

javascript.init()
javascript.eval_js('console.log("Hello from 2nd NodeJS process!")')
javascript.terminate()

Upon running this code, each terminate() followed by init() spawns a new NodeJS process, and the bridge switches to communicating with it. Note that the first eval_js call can be made without calling init(), because init() is already called when the module is imported. The function itself checks whether the bridge has already been initialized, so calling it again to ensure the bridge is up is fine.
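The "check before spawning" behaviour described above can be sketched roughly as follows. This is a toy model under stated assumptions, not the library's code: the Bridge class and its proc and generation attributes are invented for illustration, and a Python child process stands in for NodeJS so the sketch runs without Node installed.

```python
import subprocess
import sys


class Bridge:
    """Toy stand-in for the bridge; all names here are hypothetical."""

    def __init__(self):
        self.proc = None      # handle of the current child process, if any
        self.generation = 0   # number of child processes spawned so far

    def init(self):
        # Idempotent: do nothing if a child process is already running.
        if self.proc is not None and self.proc.poll() is None:
            return
        # A Python child that blocks on stdin stands in for the NodeJS process.
        self.proc = subprocess.Popen(
            [sys.executable, "-c", "import sys; sys.stdin.read()"],
            stdin=subprocess.PIPE,
        )
        self.generation += 1

    def terminate(self):
        # Safe to call when nothing is running.
        if self.proc is None:
            return
        self.proc.terminate()
        self.proc.wait()
        self.proc.stdin.close()
        self.proc = None
```

Calling init() twice in a row leaves the same child running, while a terminate() followed by init() produces a child with a different handle.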

Most importantly, these changes should not break the current usage of the bridge. I have verified this by running test.py and test_general.py, and no exception is raised.

I have only noticed one caveat about these changes regarding the global variables exposed by the module. When running the bridge interface after re-initializing the NodeJS process at least once, the following code does NOT work:

from javascript import globalThis
# ...
print(globalThis.Date().toLocaleString())

However, this DOES work:

import javascript
# ...
print(javascript.globalThis.Date().toLocaleString())

This is because the variable we get from "from javascript import globalThis" can only point to the global context of the first process, as that is the reference that exists at the time of the module import. On the other hand, accessing javascript.globalThis returns the updated reference after the bridge re-initializes. I suspect this is also the case for console and RegExp, as they are exposed with the same logic.
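The difference comes from how Python binds names on import, and it can be reproduced without the bridge at all. The sketch below uses a throwaway module named fake_bridge (a made-up name) and plain strings in place of the JS global context:

```python
import sys
import types

# Build a throwaway module that stands in for the bridge package.
fake_bridge = types.ModuleType("fake_bridge")
fake_bridge.globalThis = "context of process 1"
sys.modules["fake_bridge"] = fake_bridge

# Copies the *current* value of the attribute into a local name.
from fake_bridge import globalThis

# Simulate re-initialization: the module attribute is rebound to a new object.
fake_bridge.globalThis = "context of process 2"

stale = globalThis               # still the object captured at import time
fresh = fake_bridge.globalThis   # attribute lookup happens at access time

print(stale)
print(fresh)
```

The name bound by the from-import keeps pointing at the first object, whereas the attribute lookup on the module sees the rebinding.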

A cleaner way to solve this would be to write a getter for each of these variables and expose that instead of the references to variables themselves. For example, we can write in __init__.py:

def get_global_this():
    return config.global_jsi.globalThis

This, however, would change the interface to the bridge (i.e., we would have to call get_global_this().Date() in the code above), so I did not commit it in this PR. Please let me know if it is desirable, though.
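For comparison, a different technique from the getter above is a small forwarding proxy: the exported name stays the same, but every attribute access is resolved against the current reference. This is only a sketch of that alternative, not something this PR implements. LiveProxy and the state dictionary are hypothetical, and a real version would also have to forward calls and other operations that the JS proxy objects support.

```python
import types


class LiveProxy:
    """Forwards attribute reads to whatever resolve() returns at access time."""

    def __init__(self, resolve):
        self._resolve = resolve

    def __getattr__(self, name):
        # __getattr__ only runs for names not found on the proxy itself.
        # Guard against recursion if _resolve is ever missing.
        if name == "_resolve":
            raise AttributeError(name)
        return getattr(self._resolve(), name)


# Hypothetical stand-in for the bridge's current global context.
state = {"current": types.SimpleNamespace(label="process 1")}

# The proxy would be what the module exports in place of the raw reference.
global_this = LiveProxy(lambda: state["current"])

before = global_this.label
state["current"] = types.SimpleNamespace(label="process 2")  # re-initialization
after = global_this.label
```

With this pattern, a name imported once keeps working after a restart because the lookup is deferred, at the cost of an extra indirection on every access.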

Tales-Carvalho · 2024-03-27T23:25:20Z

src/javascript/events.py

@@ -163,7 +164,7 @@ def loop(self):
            self.threads = [x for x in self.threads if x[2].is_alive()]

            if len(self.freeable) > 40:
-                self.queue_payload({"r": r, "action": "free", "ffid": "", "args": self.freeable})
+                self.queue_payload({"r": 0, "action": "free", "ffid": "", "args": self.freeable})


About this change: the current code does not work at all because r is not defined. Since this call into JS does not wait for a return value, the request id is ignored, so passing "r": 0 works.

extremeheat · 2024-03-28T18:12:25Z

Thanks for working on this. As the lib was not designed around the Node.js process being stopped, all variables and all state depending on the previous node process will stop working and throw errors. Not just globalThis, but require(), and any other function currently stored on the Python side depending on a reference in JS land. This may or may not be handled by the user but if they intend to use this new API call, I think it's OK to assume the user will have written their code in a way to handle this (re-doing imports, using import javascript, etc.)

However, I don't understand the removal of the error handling when the Node process is killed or terminated. This is intended to prevent later errors and other undefined behavior. If the Node process crashes, this normally means the exception happened outside of a call made by Python, as we handle most errors on objects during property accesses and calls. Can you explain this change?

Tales-Carvalho · 2024-03-28T19:56:39Z

> However I don't understand the removal of the error handling when the Node process is killed or terminated. This is intended to prevent later errors and other undefined behavior. If the Node process crashes this normally means the exception happened outside of a call done by Python as we handle most errors on object doing property accesses and calls. Can you explain this change?

Thank you for the review! I initially removed the error handling because it was inadvertently terminating the new Node process. However, I took a second look and found a better way to handle this: checking the current Node process status and adding calls to sendQ, as is done in other cases. With this, I restored the stop() call in the error handling. Please let me know what you think about it!

antont · 2024-03-30T07:26:48Z

Maybe this could be used in a mechanism to restart NodeJS in case it crashes or the JS app there ends up in an invalid state?

I don't think I need multiple instances, but I am using the bridge in a long-running Python server, to use a JS/TS lib that makes websocket calls to a service. We haven't had problems, but I was just thinking that something like this could help if we ever need to restart the bridge / node.

I will need multiple worker processes, though I think those end up with separate Python interpreters, and thus separate bridge instances and node processes, anyhow.

extremeheat · 2024-03-31T07:53:00Z

src/javascript/__init__.py

@@ -20,6 +27,11 @@ def init():
 init()


+def terminate():


Can you add a note for this and the other new API method to the docs, along with an explanation of how to use them?

Tales-Carvalho · 2024-04-01T18:04:44Z

> Maybe this could be used in a mechanism to restart NodeJS in case it crashes or the JS app there ends up in an invalid state?

Yes, this is exactly the use case for this change. Note that this does not add support for parallel NodeJS processes, but it allows the process to be re-initialized.

With this change, it is now possible to wrap every (independent) JS call with javascript.init() and javascript.terminate(), so the process is always re-initialized when needed. I assume it's also possible to catch bridge exceptions and handle them with terminate() followed by init(), to restore the process for subsequent JS calls.
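Both patterns can be sketched as small helpers. This is a minimal sketch under stated assumptions: fresh_process and call_with_restart are hypothetical names, not part of the bridge, and init and terminate are passed in as callables so the helpers can be exercised without a Node process. In real use one would presumably pass javascript.init and javascript.terminate.

```python
from contextlib import contextmanager


@contextmanager
def fresh_process(init, terminate):
    """Run the body against a freshly initialized process, then shut it down."""
    init()  # a no-op if the bridge is already up, per the description above
    try:
        yield
    finally:
        terminate()


def call_with_restart(fn, init, terminate, retries=1, exc_type=Exception):
    """Call fn(); on failure, restart the process and try again."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except exc_type:
            if attempt == retries:
                raise  # out of retries: surface the original error
            terminate()
            init()
```

As noted earlier in this thread, references obtained from the old process stop working after a restart, so fn should re-acquire whatever it needs (for example by calling require() again) rather than close over objects from a previous process.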

Tales-Carvalho · 2024-04-01T18:07:02Z

@extremeheat I have just added two paragraphs to docs/python.md about controlling the node process and the new API calls, as well as describing the caveat I mentioned in this PR. Please feel free to edit my text and to bring some of it into README.md if you see fit.

extremeheat · 2024-04-05T06:30:38Z

Thanks for the contribution! I agree with the comment about improving error handling but that can be handled outside the PR and may require some additional thought (as we'd have the same problem of existing state being unrecoverable, so something would need to be done to deal with this).

Tales-Carvalho added 4 commits March 26, 2024 16:28

Minimal changes to allow manual init/terminate

45ce76f

Init EventLoop queue in constructor

2bea51e

Move other EventLoop and EventExecutorThread variables to constructor

3c4e5c2

Set global references in init()

a06ee46

Restore error checking and manage sendQ

972c781

Adds docs about controlling NodeJS process in python

e2a0780

extremeheat approved these changes Apr 4, 2024

extremeheat merged commit adab7cf into extremeheat:master Apr 5, 2024
6 checks passed
