Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AMORO-3365]The table is always in committing state #3366

Merged
merged 6 commits into from
Jan 23, 2025

Conversation

7hong
Copy link
Contributor

@7hong 7hong commented Dec 16, 2024

Why are the changes needed?

Close #3365.

@github-actions github-actions bot added the module:ams-server Ams server module label Dec 16, 2024
@7hong 7hong force-pushed the master branch 2 times, most recently from 1c966fc to bf7d517 Compare December 16, 2024 03:23
@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 37.50000% with 5 lines in your changes missing coverage. Please review.

Project coverage is 27.48%. Comparing base (243d289) to head (3fcc6ac).
Report is 15 commits behind head on master.

Files with missing lines Patch % Lines
...pache/amoro/server/optimizing/OptimizingQueue.java 37.50% 4 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #3366      +/-   ##
============================================
+ Coverage     21.59%   27.48%   +5.89%     
- Complexity     2309     3511    +1202     
============================================
  Files           426      593     +167     
  Lines         39719    48310    +8591     
  Branches       5624     6234     +610     
============================================
+ Hits           8577    13280    +4703     
- Misses        30414    34096    +3682     
- Partials        728      934     +206     
Flag Coverage Δ
core 27.48% <37.50%> (?)
trino ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@zhoujinsong
Copy link
Contributor

Thanks for the contribution!
Here are my thoughts:

  • The commits on the Iceberg table and the commits in the database are two separate processes. This issue is somewhat similar to a distributed transaction problem.
  • If the commit on the Iceberg table is successful but the database commit fails, we should attempt to retry until the database write is successful.
  • We should avoid making duplicate commits on the Iceberg table, as some repeated commits may lead to data duplication issues.

So the question arises:

  • when re-entering the commit process, how do we determine whether the commit on Iceberg has already been completed?

I am afraid the current implementation will fail when the AMS restart.

Copy link

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.

@github-actions github-actions bot added the stale label Jan 20, 2025
@zhoujinsong
Copy link
Contributor

Hi,
A possible resolution is to attempt to close processes that are in a committing state during AMS restart process, in order to avoid duplicate commit on the table. This modification requires minimal changes to the code, although it may waste already completed files. However, the triggering probability is low, and I think it is a viable solution.

@github-actions github-actions bot removed the stale label Jan 22, 2025
7hong added 2 commits January 22, 2025 14:50
# Conflicts:
#	amoro-ams/src/main/java/org/apache/amoro/server/optimizing/OptimizingQueue.java
@github-actions github-actions bot added module:mixed-flink Flink moduel for Mixed Format module:mixed-spark Spark module for Mixed Format module:mixed-hive Hive moduel for Mixed Format module:ams-optimizer AMS optimizer module type:infra type:build module:ams-dashboard Ams dashboard module module:common labels Jan 22, 2025
@github-actions github-actions bot removed module:mixed-flink Flink moduel for Mixed Format module:mixed-spark Spark module for Mixed Format module:mixed-hive Hive moduel for Mixed Format module:ams-optimizer AMS optimizer module type:infra type:build module:ams-dashboard Ams dashboard module labels Jan 22, 2025
@zhoujinsong
Copy link
Contributor

Hi, some unit test cases should be adjusted due to this change, like TestDefaultOptimizingService#testReloadCompletedTask would be changed to :

  @Test
  public void testReloadCompletedTask() {
    // THREAD_ID.poll task
    OptimizingTask task = optimizingService().pollTask(token, THREAD_ID);
    Assertions.assertNotNull(task);
    optimizingService().ackTask(token, THREAD_ID, task.getTaskId());
    optimizingService().completeTask(token, buildOptimizingTaskResult(task.getTaskId()));

    reload();
    // Committing process will be closed when reloading
    Assertions.assertNull(
        tableService()
            .getRuntime(serverTableIdentifier().getId())
            .getOptimizingProcess());
    Assertions.assertEquals(
        OptimizingStatus.IDLE,
        tableService().getRuntime(serverTableIdentifier().getId()).getOptimizingStatus());
    Assertions.assertNull(optimizingService().pollTask(token, THREAD_ID));
  }

@7hong
Copy link
Contributor Author

7hong commented Jan 22, 2025

OK, thank you very much, I will modify it later.

Fix unit tests

Fix unit tests

Fix unit tests
Copy link
Contributor

@zhoujinsong zhoujinsong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Thanks for the contribution!

@zhoujinsong zhoujinsong merged commit 49989ba into apache:master Jan 23, 2025
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module:ams-server Ams server module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: The table is always in commit state
4 participants