TestDefaultCodecParallelizesIO.testTermsSeekExact fails #14108

Open
benwtrent opened this issue Jan 7, 2025 · 4 comments

Comments

@benwtrent
Member

Description

On 10x, this seed fails reproducibly with the following trace:

TestDefaultCodecParallelizesIO > testTermsSeekExact FAILED
    java.lang.AssertionError
        at __randomizedtesting.SeedInfo.seed([188E78701153935C:1CA56CB7693DFC05]:0)
        at org.junit.Assert.fail(Assert.java:87)
        at org.junit.Assert.assertTrue(Assert.java:42)
        at org.junit.Assert.assertTrue(Assert.java:53)
        at org.apache.lucene.index.TestDefaultCodecParallelizesIO.testTermsSeekExact(TestDefaultCodecParallelizesIO.java:89)
        at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
        at java.base/java.lang.reflect.Method.invoke(Method.java:580)
        at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
        at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
        at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
        at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
        at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
        at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)

I am going to run with many iterations on main to see if another seed reproduces there.

Gradle command to reproduce

./gradlew :lucene:core:test --tests "org.apache.lucene.index.TestDefaultCodecParallelizesIO.testTermsSeekExact" -Ptests.jvms=5 "-Ptests.jvmargs=-XX:TieredStopAtLevel=1 -XX:+UseParallelGC -XX:ActiveProcessorCount=1" -Ptests.seed=188E78701153935C -Ptests.locale=en-KN -Ptests.timezone=Asia/Hebron -Ptests.gui=false -Ptests.file.encoding=UTF-8 -Ptests.vectorsize=512 -Ptests.forceintegervectors=true
@benwtrent
Member Author

I ran this test 100k+ times on main and it never failed.

I then tried many thousands of other seeds on 10x, and none of them failed.

Seems like an exceptionally rare failure.

@benwtrent
Member Author

git bisect says: b6512a4

Which I suppose makes sense as that was the last fix attempted for this test case :)

@jpountz
Contributor

jpountz commented Jan 7, 2025

I think I understand the failure. Since the terms dictionary doesn't know the length of its blocks, it always prefetches a length of 1 at the start of the block. But if you are unlucky and your terms dictionary block spans two pages, the second page is not covered by the prefetch and you get a page miss when reading it.

This failure occurs because the 5 terms that are looked up all belong to a block that spans two pages.
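
To make the page arithmetic concrete, here is a minimal sketch of the situation described above. It is not Lucene code: the class name, the method names, and the 4096-byte page size are assumptions chosen for illustration. It only checks whether a prefetch of a given length, issued at the start of a block, touches every page the block occupies.

```java
/** Hypothetical helper for illustration only; not part of Lucene. */
final class PrefetchPageSketch {
  private PrefetchPageSketch() {}

  /** Index of the page that contains the given byte offset. */
  static long pageOf(long offset, int pageSize) {
    return offset / pageSize;
  }

  /**
   * Returns true if prefetching prefetchLength bytes at blockStart touches
   * every page occupied by the block [blockStart, blockStart + blockLength).
   */
  static boolean prefetchCoversBlock(
      long blockStart, long blockLength, long prefetchLength, int pageSize) {
    long lastPrefetchedPage = pageOf(blockStart + prefetchLength - 1, pageSize);
    long lastBlockPage = pageOf(blockStart + blockLength - 1, pageSize);
    return lastPrefetchedPage >= lastBlockPage;
  }

  public static void main(String[] args) {
    int pageSize = 4096; // assumed page size

    // A 200-byte block that sits entirely inside page 0:
    // a 1-byte prefetch touches the only page that matters.
    System.out.println(prefetchCoversBlock(100, 200, 1, pageSize)); // true

    // A 200-byte block that starts near the end of page 0 and spills
    // into page 1: a 1-byte prefetch only touches page 0.
    System.out.println(prefetchCoversBlock(4000, 200, 1, pageSize)); // false
  }
}
```

In the second case every term that lives in the part of the block on page 1 would hit a page that was never prefetched, which matches the explanation that all 5 looked-up terms belong to one such block.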

@jpountz
Contributor

jpountz commented Jan 7, 2025

The correct fix would probably be to improve the terms index to record the length of blocks (there was a related discussion about whether we already have this info at #13359 (comment) but the answer is no). I have other things in progress but maybe I can look into it afterwards.
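
As a rough illustration of why recording the block length would help, the sketch below compares the pages touched by a length-1 prefetch with those touched by a prefetch that uses a recorded length. Everything here is hypothetical: `BlockRef` is an invented stand-in for a terms index entry that also stores the block length, and the 4096-byte page size is an assumption. It does not reflect how the terms index is actually encoded.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch for illustration only; not part of Lucene. */
final class BlockLengthPrefetchSketch {
  private BlockLengthPrefetchSketch() {}

  /** Invented terms index entry: block file pointer plus recorded block length. */
  static final class BlockRef {
    final long filePointer;
    final long length;

    BlockRef(long filePointer, long length) {
      this.filePointer = filePointer;
      this.length = length;
    }
  }

  /** Lists the page indexes touched by a prefetch of length bytes at offset. */
  static List<Long> pagesTouched(long offset, long length, int pageSize) {
    List<Long> pages = new ArrayList<>();
    long first = offset / pageSize;
    // Treat a non-positive length as a single byte so at least one page is listed.
    long last = (offset + Math.max(length, 1) - 1) / pageSize;
    for (long page = first; page <= last; page++) {
      pages.add(page);
    }
    return pages;
  }

  public static void main(String[] args) {
    int pageSize = 4096; // assumed page size
    BlockRef block = new BlockRef(4000, 200); // block spilling from page 0 into page 1

    // Current behavior as described in this thread: prefetch a length of 1.
    System.out.println(pagesTouched(block.filePointer, 1, pageSize)); // [0]

    // With a recorded length, the prefetch can span the whole block.
    System.out.println(pagesTouched(block.filePointer, block.length, pageSize)); // [0, 1]
  }
}
```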
