Java driver doesn't process strings with emojis correctly #699

Open
farost opened this issue Oct 21, 2024 · 0 comments
Description

The Java driver doesn't process strings containing emojis (e.g. 😎) correctly. Any call that passes such a string panics on the Rust side while the received bytes are being parsed as UTF-8.

Environment

  1. TypeDB distribution: Core
  2. TypeDB version: 3.0.0-alpha-6 and earlier
  3. Environment: Mac

To reproduce, use a BDD test (available in our BDD repo as cannot create database with an emoji in connection/database):

  Background:
    Given typedb starts
    Given connection opens with default authentication
    Given connection is open: true
    Given connection has 0 databases

  Scenario: cannot create database with an incorrect name
    Then connection create database: 😎; fails

Parsing the database name fails with the following panic:

thread '<unnamed>' panicked at c/src/memory.rs:109:13:
called `Result::unwrap()` on an `Err` value: Utf8Error { valid_up_to: 0, error_len: Some(1) }

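Printing the received bytes on the Rust side shows eda0bdedb88e, whereas the UTF-8 representation of this emoji is f09f988e. The received bytes are the UTF-16 surrogate pair D83D DE0E with each surrogate encoded separately as three bytes (eda0bd and edb88e), which is not valid UTF-8. This matches the Utf8Error above, which rejects the input at its very first byte.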
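The two byte sequences can be reproduced in plain Java, without the driver. The sketch below is only an illustration: the class and method names are made up for this example, and it uses DataOutputStream.writeUTF because that method is documented to write Java's "modified UTF-8", which encodes each surrogate separately. Whether the SWIG-generated bindings go through the same encoding is an assumption here, although the bytes match.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative only: not part of the driver.
class EmojiEncodingDemo {
    // Standard UTF-8, which is what the Rust side expects.
    static String standardUtf8Hex(String s) {
        return toHex(s.getBytes(StandardCharsets.UTF_8));
    }

    // Java's "modified UTF-8": each UTF-16 surrogate is encoded on its own.
    static String modifiedUtf8Hex(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buffer.toByteArray();
        // writeUTF prefixes the payload with a 2-byte length; skip it.
        return toHex(Arrays.copyOfRange(all, 2, all.length));
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String emoji = "\uD83D\uDE0E"; // 😎 (U+1F60E) as a surrogate pair
        System.out.println(standardUtf8Hex(emoji)); // f09f988e
        System.out.println(modifiedUtf8Hex(emoji)); // eda0bdedb88e
    }
}
```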

Characters inside the Basic Multilingual Plane (BMP), such as Chinese characters, are processed correctly. The issue seems to be exclusive to characters outside the BMP, which Java strings store as surrogate pairs.
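A quick way to tell whether a given string falls into the affected category is to check for code points above U+FFFF. This is a minimal sketch with a hypothetical helper name, not something the driver provides:

```java
// Illustrative only: not part of the driver.
class BmpCheck {
    // True if the string contains at least one code point outside the BMP
    // (above U+FFFF), i.e. one that UTF-16 stores as a surrogate pair.
    static boolean hasSupplementaryCodePoint(String s) {
        return s.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        System.out.println(hasSupplementaryCodePoint("\u4E2D\u6587"));  // Chinese characters, inside the BMP
        System.out.println(hasSupplementaryCodePoint("\uD83D\uDE0E")); // 😎, outside the BMP
    }
}
```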

We'll need to modify string processing in SWIG for Java, and possibly for other languages as well: Python works correctly, and the others will be tested later.
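One possible direction, sketched here purely as an assumption and not as the actual fix, is to produce standard UTF-8 on the Java side before the string crosses into native code. The helper name below is hypothetical, and whether the bindings can accept a byte array (with or without a trailing NUL) depends on how the SWIG typemaps end up being written.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative only: a hypothetical helper, not the driver's API.
class NativeStrings {
    // Encodes the string as standard UTF-8 and appends a NUL terminator,
    // the shape a C-style string parameter would typically expect.
    static byte[] toNulTerminatedUtf8(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        // Arrays.copyOf pads the extra slot with 0, which acts as the terminator.
        return Arrays.copyOf(utf8, utf8.length + 1);
    }

    public static void main(String[] args) {
        byte[] bytes = toNulTerminatedUtf8("\uD83D\uDE0E"); // 😎
        System.out.println(bytes.length); // 4 UTF-8 bytes plus the terminator
    }
}
```

Note that a string containing U+0000 would be cut short by a consumer that stops at the first NUL, so such input would need separate handling.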
