Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] New version of jnosq-arango is not backward compatible #570

Closed
m4ttek opened this issue Jan 2, 2025 · 22 comments · Fixed by eclipse-jnosql/jnosql-databases#306
Closed
Labels

Comments

@m4ttek
Copy link

m4ttek commented Jan 2, 2025

Which JNoSQL project the issue refers to?

JNoSQL Databases

Bug description

As far as I investigated it, the problem is related to changing serialization from Jackson to JsonB with latest version.

Even if I tried to overwrite this new default by using builder.serde(JacksonSerde.create(objectMapper)); the behaviour completely changed and I get errors everywhere.

I don't understand why this changed was applied as a patch version. What should I do now as I'd rather to keep jackson which is used also with JAXRS in my projects.

JNoSQL Version

1.1.4

Steps To Reproduce

Just try to use jnosql-arangodb on existing project

Expected Results

There should be some layer of compatibility applied.

Code example, screenshot, or link to a repository

No response

@m4ttek m4ttek added the bug label Jan 2, 2025
@rashtao
Copy link

rashtao commented Jan 6, 2025

Hi @m4ttek ,

the change is from eclipse-jnosql/jnosql-databases#292.
This was supposed to affect the internal behavior only, i.e. serializing and deserializing to/from CommunicationEntity.

Can you provide code example to reproduce your exceptions?

@otaviojava
Copy link
Contributor

I need to do further tests, but I do believe that impact was changing to JsonObjectonce it is an interface instead of a defined type as we had on BaseDocument. It might impact the serialization.

I will create a test scenario this evening.

@m4ttek
Copy link
Author

m4ttek commented Jan 6, 2025

Hi @rashtao,

I've been always using additional @Column for _key as it was required for proper handling of updates in collections, i.e.:

@AggregateRoot
@Builder
@Entity(COLLECTION_NAME)
@Data
@NoArgsConstructor
@AllArgsConstructor
public class AppraisementPeriod {

    public static final String COLLECTION_NAME = "appraisement_periods";

    @InternalId
    @Id
    private String id;

    @InternalKey
    @Column("_key")
    private String key;

    @Column
    @NotBlank
    private String name;

    @Column
    @NotNull
    private LocalDate startDate;

    @Column
    @NotNull
    private LocalDate endDate;
}

Apart from that, I additionally configured Jackson for handling nested objects and values:

    private void programmaticallyConfigureArangoSerde(ArangoDB arangoDB) {
        if (arangoDB.getSerde().getUserSerde() instanceof JacksonSerde jacksonSerde) {
            // it doesn't change the behaviour of JNosql as it's using its own seralization system
            jacksonSerde.configure(objectMapper -> {
                objectMapper.findAndRegisterModules();
                objectMapper.configure(DeserializationFeature.FAIL_ON_IGNORED_PROPERTIES, true)
                        .configure(SerializationFeature.FAIL_ON_EMPTY_BEANS, false)
                        .configure(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS, false);
            });
        }
    }

But this shouldn't have such impact on everything.

Maybe it'd be better to allow to choose your own serialization system per project? Jackson should be able to handle any java type (apart from polimorfism which require additional annotation based hints).

What's more, I'm usign Spark Arango Datasource in my project which internally uses Jackson, not JsonB - if there are any differences in serde handling I'll probably have even more bugs ;)

@otaviojava
Copy link
Contributor

I need to do further tests, but I do believe that impact was changing to JsonObjectonce it is an interface instead of a defined type as we had on BaseDocument. It might impact the serialization.

I will create a test scenario this evening.

On my side, I could not reproduce the issue. I tried using the Sample on Java-SE with this entity here:

@Entity
public record Hero(

        @Id
        String id,

        @Column
        String name,
        @Column
        String descriptor,

        @Column
        String power) {

So, I tried using the previous version and applied to the new one.

I believe that might be a super specific case.

@m4ttek
Copy link
Author

m4ttek commented Jan 6, 2025

@otaviojava I think that the biggest problem is that @Id annotation represents different value now. It used to be "canonical id" of a record in arango collection.

@otaviojava
Copy link
Contributor

@m4ttek, the issue is finding the objects once we have different mapping.

Hey @rashtao what do you think of returning to ID?

Another option would be setting a properties to let the user define which one will be the Id annotation.

@otaviojava
Copy link
Contributor

@m4ttek I'm creating some tests using your sample; the unique broken point is the update method that is looking only for _dev. Could you check this?

I will fix this scenario. Do you find another issue?

Please let me know so I can fix it.

The JSON serialization does not break.

@otaviojava
Copy link
Contributor

otaviojava commented Jan 6, 2025

@rashtao @m4ttek
Could you review this one?

eclipse-jnosql/jnosql-databases#306

I will merge, and I will release a SNAPSHOT version; please, check if I miss something.

Thanks.

In summary, I've created a scenario and test where the entity has the default value at Id and did the modifications.

@m4ttek
Copy link
Author

m4ttek commented Jan 6, 2025

@otaviojava I've just tested it and it looks better now, but I still cannot share ArangoDatabase instance with other places in code directly where I am dependent on Jackson capabilities and annotations - I would need to be able to change serialization system.

Nevertheless, many thanks!

@otaviojava
Copy link
Contributor

@m4ttek we need to think more about how to archive it

@rashtao
Copy link

rashtao commented Jan 7, 2025

@m4ttek
the mapping of fields annotated with @Id has changed in eclipse-jnosql/jnosql-databases#293

Previous versions of JNOSQL mapped @Id fields to _id document attribute, but this was not correct. In fact in ArangoDB, _id field is read-only (the database ignores the values set by the client) and auto-generated by the database (concatenating <collection-name>/<_key>).

With these changes in place, you can map _id and _key fields by annotating them in this way:

    @Id("_key") 
    private String key;
    
    @Column("_id") 
    private String id;

@otaviojava , as next step I would suggest allowing database implementations specify the default value for @Id annotation. This value, if set, should take precedence over jakarta.nosql.Id#value(), i.e. in:

/**
* Gets the id name, so it reads the {@link Id#value()} otherwise {@link Field#getName()}
*
* @param field the field
* @return the column name
* @throws NullPointerException when the field is null
*/
String getIdName(Field field) {
requireNonNull(field, "field is required");
return Optional.ofNullable(field.getAnnotation(Id.class))
.map(Id::value)
.filter(StringUtils::isNotBlank)
.orElse(field.getName());
}

This would allow for:

@Entity
public class Book {

    @Id
    private String id;

    // ...
} 

to be mapped automatically to field _key when using ArangoDB, without the need of specifying @Id("_key").

@rashtao
Copy link

rashtao commented Jan 7, 2025

@otaviojava
[#306 ](eclipse-jnosql/jnosql-databases#306) LGTM

@rashtao
Copy link

rashtao commented Jan 7, 2025

@m4ttek
w.r.t. customizing serialization behavior, org.eclipse.jnosql.databases.arangodb.communication.ArangoDBConfiguration allows setting a custom serializer via ArangoDBConfiguration#setSerde(ArangoSerde), see:

https://github.com/eclipse-jnosql/jnosql-databases/blob/768a69cf4cbad7f039d5d51472f243c7b518fb98/jnosql-arangodb/src/main/java/org/eclipse/jnosql/databases/arangodb/communication/ArangoDBConfiguration.java#L95-L108

This allows for configuring the underlying JsonbSerde with a custom Jsonb instance, e.g. by creating a new serde with JsonbSerde#JsonbSerde(Jsonb), or even providing a custom implementation of com.arangodb.serde.ArangoSerde. In such case, please note that the serializer should support JsonP types.

For example, you can achieve it this way:

@Alternative
@ApplicationScoped
@Priority(Interceptor.Priority.APPLICATION)
class ArangoDocumentManagerSupplier implements Supplier<ArangoDBDocumentManager> {

    private static final Logger LOGGER = Logger.getLogger(ArangoDocumentManagerSupplier.class.getName());

    @Produces
    public ArangoDBDocumentManager get() {
        Settings settings = MicroProfileSettings.INSTANCE;
        ArangoDBDocumentConfiguration configuration = new ArangoDBDocumentConfiguration();

        //  -- HERE YOU CAN CONFIGURE THE SERDE
        configuration.setSerde(createSerde());
        // --

        ArangoDBDocumentManagerFactory factory = configuration.apply(settings);
        Optional<String> database = settings.get(DOCUMENT_DATABASE, String.class);
        String db = database.orElseThrow(() -> new MappingException("Please, inform the database filling up the property "
                + DOCUMENT_DATABASE.get()));
        ArangoDBDocumentManager manager = factory.apply(db);
        LOGGER.log(Level.FINEST, "Starting  a ArangoDBDocumentManager instance using Eclipse MicroProfile Config," +
                " database name: " + db);
        return manager;
    }

    private ArangoSerde createSerde() {
        return new JsonbSerde(JsonbBuilder.create(new JsonbConfig()
                .withDateFormat("yy-MMM-dd'T'HH:mm:ss.SSS", Locale.GERMANY)
        ));
    }

    public void close(@Disposes ArangoDBDocumentManager manager) {
        LOGGER.log(Level.FINEST, "Closing ArangoDBDocumentManager resource, database name: " + manager.name());
        manager.close();
    }
}

@m4ttek
Copy link
Author

m4ttek commented Jan 7, 2025

@rashtao thanks, however, I think we need to have changes in JsonObject type usage as it won't work properly with Jackson. Probably I can also use Jackson Mixin capability to register proper handling of these types.

I need to do further tests, but I do believe that impact was changing to JsonObjectonce it is an interface instead of a defined type as we had on BaseDocument. It might impact the serialization.

I will create a test scenario this evening.

@rashtao
Copy link

rashtao commented Jan 7, 2025

@m4ttek
you can register your own serde based on Jackson, as long as it supports serializing and deserializing to and from JsonP types. For example, this can be achieved using Jackson JSONP module (com.fasterxml.jackson.datatype:jackson-datatype-jakarta-jsonp:<version>):

class JacksonJsonbSerde implements ArangoSerde {

    private final ObjectMapper mapper;

    public JacksonJsonbSerde() {
        mapper = new ObjectMapper().registerModule(new JSONPModule());
        // ...
        // YOU CAN FURTHER CUSTOMIZE JACKSON MAPPER HERE
    }

    @Override
    public byte[] serialize(Object value) {
        try {
            return mapper.writeValueAsBytes(value);
        } catch (JsonProcessingException e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    public <T> T deserialize(byte[] content, Class<T> type) {
        try {
            return mapper.readValue(content, type);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

}

This will make handle jakarta.json types from Jackson.

@rashtao
Copy link

rashtao commented Jan 7, 2025

@m4ttek
@otaviojava

From an architectural standpoint, I think that customizing the underlying serialization behavior is quite a specific use case. In general, framework level mapping configuration should be enough for the most of the use cases.

However, for these niche cases, it is still possible for the user to customize things as I wrote above.

Now w.r.t. Jackson vs JsonB, IMHO it is more coherent using JsonB here, since:

  • JNoSQL already has already dependency on it
  • JsonP types are a good target for converting to/from CommunicationEntity
  • Jackson can be compatible with JsonP types using jackson-datatype-jakarta-jsonp (while JsonB is not compatible with Jackson types)

WDYT?

@m4ttek
Copy link
Author

m4ttek commented Jan 7, 2025

@rashtao I agree with you, however, I've been using jnosql for more than a year and I had to sometimes use plain arango driver capabilities in my projects. For example, now I'm working on creating Change Data Capture capability for using it as a Apache Flink Connector and it'll be still based on Jackson.

I'm worried about differences in serialization systems, because JSON does not standardize handling various Java types like dates, decimals, etc. From my experience, we had many problems with changing arango driver from 6.x to 7.x because of some differences in values handling. And the same goes with Spark Arango Datasource, which internally uses Jackson, we had to adjust typing in our spark sql because of some specific changes to serde.

@otaviojava
Copy link
Contributor

otaviojava commented Jan 7, 2025

@otaviojava , as next step I would suggest allowing database implementations specify the default value for @Id annotation. This value, if set, should take precedence over jakarta.nosql.Id#value(), i.e. in:

/**
* Gets the id name, so it reads the {@link Id#value()} otherwise {@link Field#getName()}
*
* @param field the field
* @return the column name
* @throws NullPointerException when the field is null
*/
String getIdName(Field field) {
requireNonNull(field, "field is required");
return Optional.ofNullable(field.getAnnotation(Id.class))
.map(Id::value)
.filter(StringUtils::isNotBlank)
.orElse(field.getName());
}

What we can do is allow alias or nicknames for those annotations, for example, allows a new Id annotation, but for ArangoDB where the default is _key

@rashtao what do you think?

@rashtao
Copy link

rashtao commented Jan 7, 2025

@m4ttek
I understand your concerns, this is quite a common problem which could happen with any untyped database, since mapping conventions (e.g. dates formats, numbers, ...) can be different across different frameworks and libraries. I would recommend explicitly testing for each data type you use.

At mapping level, I think that we cannot have identical behavior across different frameworks or connectors, simply because they adopt different conventions. I.e. in JNoSQL, we support Jakarta NoSQL basic types (https://jakarta.ee/specifications/nosql/1.0/nosql-1.0.0-m1#basic_types), while Spark SQL types are different (https://spark.apache.org/docs/latest/sql-ref-datatypes.html).

Additional differences could arise from JSON serialization libraries and their configuration.

The architectural decision taken in ArangoDB driver 7.x is to separate the responsibilities of serializing user data from serializing ArangoDB own data structures and metadata. In this way we avoid introducing an additional layer of conversion conventions and leave this responsibility to the user. However, to facilitate the usability, by default the driver comes with a serde based on Jackson (see https://docs.arangodb.com/stable/develop/drivers/java/reference-version-7/serialization/#jacksonserde-default-user-data-serde).

In JNoSQL we have mapping conventions that can be customized using AttributeConverter, ValueReader and ValueWriter. These are applied when converting between user entities and CommunicationEntity.

In JNoSQL ArangoDB, we have serialization conventions (JsonB defaults), which can be configured as written above. These are applied when (de-)serializing JsonP types (from)to json.

@rashtao
Copy link

rashtao commented Jan 7, 2025

@otaviojava

I would like avoiding having a different Id annotation to represent the same mapping concept, since it has the risk to be highly error prone from the user.

I think we can achieve this by propagating some metadata information to CommunicationEntity elements, i.e. adding an isId() boolean flag to org.eclipse.jnosql.communication.semistructured.Element.

WDYT?

@rashtao
Copy link

rashtao commented Jan 13, 2025

@m4ttek
note that the latest version of ArangoDB Java driver supports JSON-P types. This removes the need for JsonB serde.
I will create a PR to remove it.

@rashtao
Copy link

rashtao commented Jan 13, 2025

@otaviojava
IMHO we can close this and further discuss the ideas to customize mapping annotations in another GH issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants