Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#5200]Update the Column data type for python client #5354

Merged
merged 15 commits into from
Nov 6, 2024

Conversation

SophieTech88
Copy link
Contributor

@SophieTech88 SophieTech88 commented Oct 30, 2024

What changes were proposed in this pull request?

  1. Implement type.java as type.py in Python client
  2. Implement types.java as types.py in Python client
  3. Create a test_types.py in unit tests.

Why are the changes needed?

We need to support the column data type in python client

Fix: #5200

Does this PR introduce any user-facing change?

No

How was this patch tested?

Need to add the unit tests

@SophieTech88 SophieTech88 marked this pull request as draft October 30, 2024 03:13
@SophieTech88 SophieTech88 marked this pull request as ready for review October 31, 2024 06:15
@jerryshao
Copy link
Contributor

@mchades can you please help to review?

@jerryshao
Copy link
Contributor

@noidname01 , would you please also help to review this PR? Thanks a lot.

@SophieTech88
Copy link
Contributor Author

SophieTech88 commented Nov 4, 2024

When I double check the types folder (gravitino/tree/main/api/src/main/java/org/apache/gravitino/rel/types/) I find it has another file Decimal.java, so just update a new commit to convert decimal.java to decimal.py with unit test in python client folder.

Comment on lines 34 to 38
@classmethod
def get(cls):
if cls._instance is None:
cls._instance = cls()
return cls._instance
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if these codes can guarantee the singleton of _instance under race conditions in Python, can you help confirm it? @xunliu @noidname01

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for catching that! You’re absolutely correct—this implementation does not currently handle race conditions, and under high concurrency, there’s a chance multiple threads could attempt to create the singleton instance simultaneously, breaking the singleton pattern.

Solution: Adding Thread Safety with a Lock
To handle this, we can use threading.Lock, which ensures that only one thread can enter the critical section where the instance is created.

Here is the example:

class template:
    _instance = None
    _lock = threading.Lock()  # A lock to ensure thread safety

    @classmethod
    def get(cls):
        with cls._lock:  # Ensures only one thread can access this block at a time
            if cls._instance is None:
                cls._instance = cls()
        return cls._instance

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lock makes things more complex, is there another simple common method?

Copy link
Contributor Author

@SophieTech88 SophieTech88 Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option is to use __new__():

class Types:

    class NullType(Type):
        _instance = None

        @classmethod
        def get(cls):
            if cls._instance is None:
                cls._instance = cls.__new__(cls)
            return cls._instance

logic here:

  • get() checks if _instance is None.
  • If _instance is None, cls.new(cls) is called to create a new instance of NullType without directly calling the constructor init.
  • get() then returns the instance. Subsequent calls to get() will return the existing _instance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Python support singleton?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, Python does support the Singleton pattern, which ensures that only one instance of a class is created. While Python does not enforce singletons natively, you can implement it in several ways.

Copy link
Member

@xunliu xunliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @SophieTech88
I help you improve this PR, please help me review my change parts.

  1. We can use __new__() implement singleton.
  2. Added some return type and function param type.
  3. Added some class and function comments.

@SophieTech88
Copy link
Contributor Author

hi @SophieTech88 I help you improve this PR, please help me review my change parts.

  1. We can use __new__() implement singleton.
  2. Added some return type and function param type.
  3. Added some class and function comments.

@xunliu Thank you so much. It is a great help.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are the codes for?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mchades The clients/client-python/tests/unittests/test_gvfs_with_local.py report waring C0302: Too many lines in module (1102/1100) (too-many-lines), So I fixed it.

mchades
mchades previously approved these changes Nov 6, 2024
Copy link
Contributor

@mchades mchades left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type definition logic LGTM

@noidname01
Copy link
Collaborator

For Singleton Implementation in Python, as I used in implementing Error Handlers before, there's another approach we can consider, we can initialize the instance in separate module, and import wherever needed. This ensure the instance is Singleton.

@mchades mchades dismissed their stale review November 6, 2024 13:04

docs need to be updated

Copy link
Member

@xunliu xunliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SophieTech88 Thank you for your contributions.
LGTM

@mchades mchades merged commit fabdbc9 into apache:main Nov 6, 2024
21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Subtask] Support Column data type
6 participants