Error ("assertion failed") with very large select clause in query #423

Open
ccdavis opened this issue Jan 28, 2025 · 1 comment

ccdavis commented Jan 28, 2025

We are generating queries to read very wide parquet files as part of a Census data extraction service. Through experimentation I've determined that the limit on named columns in a select clause is 16256. When this is exceeded we get:

cli_extractor: /home/ccd/nhgis-extract-engine/rust/target/release/build/libduckdb-sys-14f8e1593a01c721/out/duckdb/src/common/types/row/row_data_collection.cpp:82: duckdb::vector<duckdb::BufferHandle> duckdb::RowDataCollection::Build(duckdb::idx_t, duckdb::data_t**, duckdb::idx_t*, const duckdb::SelectionVector*): Assertion `new_block.count > 0' failed.
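
The post does not say how the 16256 figure was found. One way such a limit could be located is a binary search over the column count, sketched below. It assumes that failure is monotonic (every width above the limit fails, every width at or below it works). The `works` callback is hypothetical: in practice it would have to run each probe in a separate process, since an assertion failure aborts the whole process. A stand-in predicate is used here in place of a real DuckDB call.

```python
from typing import Callable


def find_max_width(works: Callable[[int], bool], lo: int, hi: int) -> int:
    """Return the largest n in [lo, hi] for which works(n) is True.

    Assumes works(lo) is True and that works is monotonic:
    once it returns False for some n, it returns False for all larger n.
    """
    if not works(lo):
        raise ValueError("works(lo) must be True")
    while lo < hi:
        # Round up so the loop always makes progress when lo + 1 == hi.
        mid = (lo + hi + 1) // 2
        if works(mid):
            lo = mid       # mid is fine, the limit is at mid or above
        else:
            hi = mid - 1   # mid fails, the limit is below mid
    return lo


# Stand-in for a real probe (hypothetical): pretend queries succeed
# up to the width reported in this issue.
def stand_in_probe(n_columns: int) -> bool:
    return n_columns <= 16256


max_width = find_max_width(stand_in_probe, 1, 100_000)
```

With a real probe, each call to `works` would generate a query of the given width and report whether the child process exited cleanly.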

We realize this is a very large number of columns, and 99.5% of our workload is well under 16256 and runs very well. It would be nice if we could get a good error message, or have a way to increase the maximum width of a result with a setting.

It looks like there's a problem with the memory allocation code, and the physical memory of our machines isn't anywhere near used up when I monitor the process. For example, when the column count is 14000 the memory use is pretty low.

I can reproduce the issue on version 1.1.1 (the one bundled with the Rust package). With the CLI I can get a similar problem on v1.1.1 and v1.1.3: it tries to dump to a temp file and then just gets stuck. The temp file in .tmp is only 256 KB.

When I test on the nightly build downloaded today I get a different error:

ccd@gp2000:~/nhgis-extract-engine/rust$ ~/duckdb/duckdb < test_query.sql
Floating point exception (core dumped)
ccd@gp2000:~/nhgis-extract-engine/rust$

Due to the nature of the problem, the query and data file needed to reproduce the issue are both extremely large, but I'm willing to share them.
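
For anyone who wants to try this without the full files, here is a minimal sketch of how a query of a given width might be generated. The column names (`c0`, `c1`, ...) and the file name `wide.parquet` are placeholders, not the real Census schema, and it is untested whether a synthetic file with these columns triggers the same failure.

```python
def build_wide_select(n_columns: int,
                      source: str = "read_parquet('wide.parquet')") -> str:
    """Return a SELECT statement that names n_columns columns explicitly.

    Column names and the source file are placeholders; a real query
    would use the names from the parquet schema.
    """
    if n_columns < 1:
        raise ValueError("n_columns must be at least 1")
    # Quote each identifier so the names are passed through as written.
    columns = ", ".join(f'"c{i}"' for i in range(n_columns))
    return f"SELECT {columns} FROM {source};"


# One query at the largest width that worked for us, one just past it.
ok_query = build_wide_select(16256)
failing_query = build_wide_select(16257)
```

The resulting strings can be written to a .sql file and piped into the CLI the same way as test_query.sql above.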

test_query.txt

ccdavis commented Jan 28, 2025

After looking at the other duckdb-rs issues, I realized this may belong in the general issues list. The main difference with Rust is that we get a hard stop on the assertion failure, which we wouldn't see when using the CLI application.
