Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MySQL -> DuckDB export causes segfault #90

Closed
2 tasks done
benjamingr opened this issue Oct 6, 2024 · 1 comment · Fixed by #98
Closed
2 tasks done

MySQL -> DuckDB export causes segfault #90

benjamingr opened this issue Oct 6, 2024 · 1 comment · Fixed by #98

Comments

@benjamingr
Copy link

benjamingr commented Oct 6, 2024

What happens?

When running MySQL -> DuckDB import on a simple setup (testcontainers) a segfault occurs.

To Reproduce

When setting up test containers with duckdb to evaluate MySQL export as such (full fixture setup code):

def populate_mysql(mysql: MySqlContainer):
    connection = pymysql.connect(
        host=mysql.get_container_host_ip(),
        port=int(mysql.get_exposed_port(3306)),
        user=mysql.username,
        password=mysql.password,
        database=f"test",
        autocommit=True
    )

    with connection.cursor() as cursor:
        for table_idx in range(TABLE_COUNT):
            # Create users table with columns "id", "email", "password_hash", "credit_card_number" with fake PII
            cursor.execute(
                f"CREATE TABLE users{table_idx} (id INT AUTO_INCREMENT PRIMARY KEY, email TEXT, password_hash TEXT, credit_card_number TEXT);")
            # Insert some fake data
            for i in range(100):
                cursor.execute(
                    f"INSERT INTO users{table_idx} (email, password_hash, credit_card_number) VALUES ('user{i}@example.com', 'hash{i}', '1111-2222-3333-4444');")
            # Create orders table with columns "id", "user_id", "order_date", "order_total", "address" with fake PII
            cursor.execute(
                f"CREATE TABLE orders{table_idx} (id INT AUTO_INCREMENT PRIMARY KEY, user_id INT, order_date TIMESTAMP, order_total DECIMAL(10,2), address TEXT);")
            # Insert some fake transactions
            for i in range(100):
                cursor.execute(
                    f"INSERT INTO orders{table_idx} (user_id, order_date, order_total, address) VALUES ({(i + 10) % 10}, '2022-01-01 00:00:00', 100.00, '1234 Main St, Springfield, IL 62701');")
    connection.close()

And then connecting to the testcontainer and performing "export from" on it (similar PostgreSQL code works):

INSTALL mysql;
LOAD mysql;

ATTACH 'database=test user=test password=test host=localhost port=60057' AS d (TYPE MYSQL, READ_ONLY);

--- Executing this line causes a segfault, crashes DBeaver (or Python when running Python code)
COPY FROM DATABASE d TO memory;

OS:

macOS 14.3 (23D56)

DuckDB Version:

latest ( "1.1.1" )

DuckDB Client:

Python (also with DBeaver)

Hardware:

M3 max

Full Name:

Benjamin Gruenbaum

Affiliation:

Eon.io

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

Yes

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

  • Yes, I have
@benjamingr benjamingr changed the title DB export from mysql causes segfault MySQL -> DuckDB export causes segfault Oct 6, 2024
@benjamingr
Copy link
Author

(Just for reference, running regular queries against the MySQL code works, e.g. SELECT * from d.users0 works and returns the correct results)

@szarnyasg szarnyasg transferred this issue from duckdb/duckdb Oct 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant