Add data chunk and task result enums and dtos #2442

Open
wants to merge 8 commits into
base: master
from

Conversation

inv-jishnu
Contributor

@inv-jishnu inv-jishnu commented Dec 24, 2024

Description

This PR adds the enums and DTOs used in data import for data chunk support. Data chunks are smaller partitions of the import data; the additions cover data chunk details plus the status and state enums. It also adds the DTOs and enums for import tasks and import results.
The PR also contains enums and DTOs for transaction batches (to store transaction batch details, status, and result).

Related issues and/or PRs

NA

Changes made

I have added data chunk and task result enums and dtos for data loader import.

Checklist

The following is a best-effort checklist. If any items in this checklist are not applicable to this PR or are dependent on other, unmerged PRs, please still mark the checkboxes after you have read and understood each item.

  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes.
  • Any remaining open issues linked to this PR are documented and up-to-date (Jira, GitHub, etc.).
  • Tests (unit, integration, etc.) have been added for the changes.
  • My changes generate no new warnings.
  • Any dependent changes in other PRs have been merged and published.

Additional notes (optional)

Road map to merge remaining data loader core files. Current status

Release notes

NA

@inv-jishnu inv-jishnu added the enhancement New feature or request label Dec 24, 2024
@inv-jishnu inv-jishnu marked this pull request as draft December 24, 2024 10:56
@inv-jishnu inv-jishnu mentioned this pull request Jan 6, 2025
6 tasks
@inv-jishnu inv-jishnu self-assigned this Jan 7, 2025
@ypeckstadt ypeckstadt marked this pull request as ready for review January 8, 2025 23:45
@Builder
@Value
@JsonDeserialize(builder = ImportTaskResult.ImportTaskResultBuilder.class)
public class ImportTaskResult {
Contributor

Honestly speaking, it's hard for me to understand the relationship between these classes and how to use them from this PR alone. It would be great to have a simple diagram or something similar that shows the relationship.

Contributor

Sorry, my previous comment might have been unclear. I just read the design document again. The definitions of the following terms are still unclear to me, so adding comments about the definitions of xxxxx would be enough:

  • batch (I guess ImportTransactionBatch is related)
  • task (I guess ImportTaskAction and ImportTaskResult are related)

Contributor Author

@komamitsu san,
Sorry for the late update. Here is the general idea behind batch and task.

The import file is first split into smaller data chunks for import. When the import runs in transaction mode, each data chunk is further split into transaction batches; this is what "batch" refers to.
In each transaction batch, each individual row of data is imported by a transaction. The process of importing an individual row via a transaction is called a task.
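
To make the hierarchy described above concrete, here is a minimal Java sketch of the splitting order: file rows into data chunks, data chunks into transaction batches, and one task per row. It is only an illustration under assumptions. The class `ImportSplitSketch`, the methods `partition` and `plan`, and the chunk and batch sizes are hypothetical names and values that do not come from this PR, and the real data loader may split and process the data differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the chunk -> batch -> task hierarchy; not code from this PR. */
public class ImportSplitSketch {

  /** Splits a list into consecutive sublists of at most {@code size} elements. */
  static <T> List<List<T>> partition(List<T> items, int size) {
    if (size <= 0) {
      throw new IllegalArgumentException("size must be positive");
    }
    List<List<T>> parts = new ArrayList<>();
    for (int start = 0; start < items.size(); start += size) {
      int end = Math.min(start + size, items.size());
      parts.add(new ArrayList<>(items.subList(start, end)));
    }
    return parts;
  }

  /**
   * Returns data chunks, each holding transaction batches, each holding rows.
   * Every innermost row corresponds to one import task.
   */
  static List<List<List<String>>> plan(List<String> rows, int chunkSize, int batchSize) {
    List<List<List<String>>> chunks = new ArrayList<>();
    for (List<String> chunk : partition(rows, chunkSize)) {
      chunks.add(partition(chunk, batchSize));
    }
    return chunks;
  }

  public static void main(String[] args) {
    List<String> rows = new ArrayList<>();
    for (int i = 1; i <= 7; i++) {
      rows.add("row-" + i);
    }
    // Assumed sizes: 4 rows per data chunk, 2 rows per transaction batch.
    List<List<List<String>>> chunks = plan(rows, 4, 2);
    for (int c = 0; c < chunks.size(); c++) {
      for (int b = 0; b < chunks.get(c).size(); b++) {
        for (String row : chunks.get(c).get(b)) {
          System.out.println("chunk " + c + ", batch " + b + ", task for " + row);
        }
      }
    }
  }
}
```

With seven rows and these assumed sizes, the sketch yields two data chunks of two transaction batches each, where the last batch holds a single row. In non-transaction mode the batch level would presumably be skipped, but that is not modeled here.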

Contributor

@inv-jishnu Thanks for the explanation! I think the information will be very helpful for future code reviewers and maintainers. Could you add the description to the design doc or as source code comments?

Contributor Author

@inv-jishnu inv-jishnu Jan 27, 2025

@komamitsu san,
I will check with @ypeckstadt and add these in the design doc.
Thank you.

@inv-jishnu inv-jishnu requested a review from komamitsu January 23, 2025 11:22
Collaborator

@brfrn169 brfrn169 left a comment

LGTM! Thank you!

Contributor

@komamitsu komamitsu left a comment

LGTM! 👍

@inv-jishnu Let me know when you address #2442 (comment)

Contributor

@Torch3333 Torch3333 left a comment

LGTM, thank you!

@@ -0,0 +1,8 @@
package com.scalar.db.dataloader.core.dataimport.datachunk;

/** * Status of the import data chunk which during the import process */
Contributor

Suggested change
/** * Status of the import data chunk which during the import process */
/** * Status of the import data chunk during the import process */
