Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Butterfly classloader module isolation causes Jackson problems #15

Open
tfmorris opened this issue Jul 13, 2020 · 12 comments
Open

Butterfly classloader module isolation causes Jackson problems #15

tfmorris opened this issue Jul 13, 2020 · 12 comments

Comments

@tfmorris
Copy link
Member

As reported in OpenRefine/OpenRefine#1882
with analysis in OpenRefine/OpenRefine#1889

Butterfly implements its own ClassLoader called ButterflyClassLoader, with some custom logic to reload classes on the fly when they are modified.

To do so, the method loadClass from the parent class URLClassLoader was overridden, with a home-made interpretation of the delegation model which was incorrect: the problem was that it tried to load a class first by itself before trying the parent loader (whereas the parent loader should be tried first).

This lead to Jackson annotations being marked as loaded by ButterflyClassLoader instead of WebappClassLoader, so the hashes of the classes would differ and the Jackson introspection code would not find them when generating serializers and deserializers for POJOs coming from extensions.

@tfmorris
Copy link
Member Author

tfmorris commented Jul 13, 2020

I stumbled across the commented out classloader code (which is not associated with any PR, but is in 0de01cb) the other day and, independently stumbled across the unlinked commentary in OpenRefine/OpenRefine#1889 just now.

I'm not sure what other workarounds are possible for Jackson's particular troubles, but the Butterfly classloader organization was intentional. It wasn't a "misinterpretation" of the delegation model, but, as I interpret it, an intentional firewalling mechanism to keep modules from interfering with each other. Many extension models have similar protection mechanisms, so OpenRefine may need a different mechanism than insisting that all code share a common classloader.

@Override
protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
Class<?> clazz = findLoadedClass(name);
if (clazz == null) {
try {
_logger.trace("Loading: {}", name);
clazz = findClass(name);
} catch (ClassNotFoundException cnfe) {
try {
_logger.trace("Parent loading: {}", name);
ClassLoader parent = getParent();
clazz = parent.loadClass(name);
} catch (ClassNotFoundException cnfe2) {
try {
_logger.trace("Current loading: {}", name);
ClassLoader current = this.getClass().getClassLoader();
clazz = current.loadClass(name);
} catch (ClassNotFoundException cnfe3) {
_logger.trace("System loading: {}", name);
ClassLoader system = ClassLoader.getSystemClassLoader();
clazz = system.loadClass(name);
}
}
}
}
if (resolve) {
resolveClass(clazz);
}
return clazz;
}

@wetneb
Copy link
Member

wetneb commented Jul 14, 2020

Yes I understand that this custom class-loading was intentional. Yes it could be useful to have something that enables extensions to use different versions of libraries, so provide the same sort of isolation. The solutions I have in mind are:

  • start a big refactoring to provide our own interface to handle JSON serialization / deserialization. Extensions would need to rely on that instead of Jackson (for me, the cost/benefit ratio is way too high)
  • convince Jackson to index classes not by hash of class object but by class name (it looks very hard to me too)
  • find a way to enforce that the same classloader is used for classes whose versions do not differ between main and extensions. That would amount to requiring that extensions use the exact same Jackson version, but would still be able to override other dependency versions thanks to the custom classloading code (looks hacky to me)
  • migrate out of butterfly into another framework which provides a similar extension mechanism and has this all figured out in the framework already (expensive too)
  • anything else?

Of course I found out about this problem extremely late in the migration process, and that did not really help find a satisfying solution. The problem only occurs with real-world testing of extensions (units tests are not affected as the class loading mechanisms are different when running tests).

@tfmorris
Copy link
Member Author

Yes I understand that this custom class-loading was intentional.

Thanks for the clarification. That wasn't clear to me from what I read before.

I'm still not clear on what the exact problem is/was. Can you expand with a little more detail (perhaps an example)? Is it to do with preferences or communication between the core and extensions or the front end and backend or something else entirely?

The purpose in creating this issue wasn't to do something different immediately or even consider the issue now. It was just to get the existing history captured since we use Github as our principal form of institutional memory.

@wetneb
Copy link
Member

wetneb commented Jul 14, 2020

Here is a summary of the issue:

We use Jackson to serialize / deserialize our classes in JSON. Jackson makes it possible to specify how a class is (de)serialized by annotating it with its own set of annotations. Given a class to (de)serialize, Jackson generates a (de)serializer on the fly by looking at the annotations on the class. In Java, annotations are classes themselves, so they are associated to the classloader that loaded them. This classloader is taken into account when checking for equality of classes (and hashCode), so two annotations classes loaded by different classloaders are considered different.

If an extension declares a new operation, or overlay model, or anything else that will be serialized to JSON by the main application, this is a problem. With Butterfly's custom classloader, Jackson annotations are marked as loaded by ButterflyClassLoader instead of WebappClassLoader, so the hashes of the classes would differ and the Jackson introspection code would not find them when generating serializers and deserializers for POJOs coming from extensions.

The issue on Jackson's side is here: FasterXML/jackson-databind#542

That being said, one low-hanging solution for this comes to mind as I write this: adding jackson as a provided dependency to the extensions could ensure that the ButterflyClassloader fails to find them in the extension and falls back on the main classloader. Perhaps it is that simple - I suspect I have tried that back in the days, but there is a slight chance that I did not. If it does work it would be by far the best way forward.

@wetneb
Copy link
Member

wetneb commented Jan 12, 2022

I have no idea what I am talking about but perhaps the new Java module system introduced in Java 9 could help with this sort of problem.

@thadguidry
Copy link
Member

  • Modules are essentially a "package of packages" that can declare requires transitive as necessary or requires static and give strong encapsulation.
  • One thing to note is that by default, all packages are module private but you can export...to to expose to only certain packages as necessary.
  • Another thing is thinking about ButterflyClassloader and would it itself be an Application Module or perhaps even a Service by signifying with module my.module { uses ButterflyClassloader; } or even considering it a Service Provider that provides ?

I think only you can answer that having some knowledge of what Butterfly currently gives to OpenRefine extensions.
Good Reference: https://www.baeldung.com/java-9-modularity

@tfmorris
Copy link
Member Author

tfmorris commented Mar 8, 2024

Revisiting this in the context of the OpenRefine extension ecosystem discussion, I think it would be useful to restore the module isolation which was disabled.

While I think exposing Jackson in the extension interface is dangerous for the same reason that having org.json objects in the extension interface was dangerous, as I previously argued

If an extension gets broken by the JSON serialization change, I'd argue that that indicates that it's not well enough isolated and the solution isn't to make the abstraction even more leaky, but, rather, to force the extensions to provide their own JSON serialization, unless it's a facility provided by OpenRefine through an opaque API.

a bandaid fix might be to declare Jackson databind to be a sort of bootstrap dependency that gets loaded first by Butterfly or to otherwise provide special treatment of the necessary classes (effectively make Jackson a core Butterfly facility).

@tfmorris
Copy link
Member Author

I took a closer look as I was preparing to implement the bootstrapping scheme described above and discovered that it's exactly as simple as @wetneb described:

adding jackson as a provided dependency to the extensions could ensure that the ButterflyClassloader fails to find them in the extension and falls back on the main classloader.

There are actually two sources of conflict: slf4j and Jackson. The Jackson conflict seems to be with ObjectMapper declarations rather the annotations, but I pulled in both just to be safe.

@tfmorris
Copy link
Member Author

Fixing all the bundled extensions works, but it only takes a single rogue extension to mess things up again for all. For example, an older version of the rdf-extension bundles jena-shaded-arq-3.8.0.jar which is a shaded "uber jar" that includes its own jackson-core (among other bundled dependencies).

Contrary to my assumption above, the Butterfly classloader doesn't actually provide intermodule classloading isolation, but merges all the modules classpaths together into a single classloader,

File classes = new File(path,"MOD-INF/classes");
if (classes.exists()) {
_classLoader.addRepository(classes);
}
File libs = new File(path,"MOD-INF/lib");
if (libs.exists()) {
_classLoader.addRepository(libs);

so the database module can have its Jackson loading broken by the mere presence of the rdf-extension.

Also, because the order of module loading is effectively random (hashset iteration order), the priority of modules on the classpath relative to each other is random.

@wetneb
Copy link
Member

wetneb commented Apr 9, 2024

This discussion really encourages me to go for an established plugin system. I mean, honestly, I am really not knowledgeable on Java class loading details (as can be seen with the classloading change I made), and I don't think it's a good use of our time to become experts on that and implement something solid, when there are existing solutions we could adopt (OSGi, PF4J, maybe others). I have written more about this on the forum.

@tfmorris
Copy link
Member Author

Fixing the existing bug (calling it bug because I'm pretty sure the implementation doesn't match the authors' implied intentions) is straightforward and other implementation details aren't really as complicated as deciding what the requirements should be. That part needs to be done even if it's just to define requirements for selecting a new and different implementation.

@wetneb
Copy link
Member

wetneb commented Apr 24, 2024

Sure! If we do decide to adopt an existing plugin system, it's not something that's going to happen overnight, so I wouldn't make that block up any fixes on the current system.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants