Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix fetch_access_token_for_cluster in EKS hook #45469

Merged
merged 2 commits into from
Jan 8, 2025

Conversation

vincbeck
Copy link
Contributor

@vincbeck vincbeck commented Jan 7, 2025

Resolves #45368.

Some users in China reported an issue when integrating EKS with Airflow. The customer could not use the workflow under global account after migrating it to China. Their dag script to connect to EKS cluster using EksPodOperator returned 401 .

Error:

[2024-12-31, 08:47:19 UTC] {taskinstance.py:3310} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/models/taskinstance.py", line 767, in _execute_task
    result = _execute_callable(context=context, **execute_callable_kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/models/taskinstance.py", line 733, in _execute_callable
    return ExecutionCallableRunner(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/utils/operator_helpers.py", line 252, in run
    return self.func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/models/baseoperator.py", line 406, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/providers/amazon/aws/operators/eks.py", line 1103, in execute
    return super().execute(context)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/models/baseoperator.py", line 406, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/pod.py", line 593, in execute
    return self.execute_sync(context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/pod.py", line 603, in execute_sync
    self.pod = self.get_or_create_pod(  # must set `self.pod` for `on_kill`
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/pod.py", line 561, in get_or_create_pod
    pod = self.find_pod(self.namespace or pod_request_obj.metadata.namespace, context=context)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/pod.py", line 534, in find_pod
    pod_list = self.client.list_namespaced_pod(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/kubernetes/client/api/core_v1_api.py", line 15823, in list_namespaced_pod
    return self.list_namespaced_pod_with_http_info(namespace, **kwargs)  # noqa: E501
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/kubernetes/client/api/core_v1_api.py", line 15942, in list_namespaced_pod_with_http_info
    return self.api_client.call_api(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 348, in call_api
    return self.__call_api(resource_path, method,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 180, in __call_api
    response_data = self.request(
                    ^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 373, in request
    return self.rest_client.GET(url,
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/kubernetes/client/rest.py", line 244, in GET
    return self.request("GET", url,
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/kubernetes/client/rest.py", line 238, in request
    raise ApiException(http_resp=r)
kubernetes.client.exceptions.ApiException: (401)
Reason: Unauthorized

The root cause is the address to access STS here points to the global address, not the China STS service address. So the eks token obtained cannot be used in China. The sts_url that should be used in China is https://sts/.{session.region_name}.[amazonaws.com.cn/?Action=GetCallerIdentity&Version=2011-06-15


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@boring-cyborg boring-cyborg bot added area:providers provider:amazon AWS/Amazon - related issues labels Jan 7, 2025
Copy link
Contributor

@o-nikolas o-nikolas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any way to unit test this edge case of switching regions/accounts like that?

@o-nikolas
Copy link
Contributor

Looks like some of the existing tests need updating first as well

@eladkal
Copy link
Contributor

eladkal commented Jan 7, 2025

Does this fix solve #45368 ?

@vincbeck
Copy link
Contributor Author

vincbeck commented Jan 7, 2025

Does this fix solve #45368 ?

Oh yeah! I did not know there was a ticket for it. I updated the description to mention it

@vincbeck vincbeck merged commit dc3111a into apache:main Jan 8, 2025
61 checks passed
@vincbeck vincbeck deleted the vincbeck/eks branch January 8, 2025 15:31
dstandish pushed a commit to astronomer/airflow that referenced this pull request Jan 8, 2025
vincbeck added a commit to aws-mwaa/upstream-to-airflow that referenced this pull request Jan 9, 2025
potiuk pushed a commit that referenced this pull request Jan 9, 2025
* Revert "Fix the way to get STS endpoint in EKS hook (#45520)"

This reverts commit 103df61.

* Revert "Fix `fetch_access_token_for_cluster` in EKS hook (#45469)"

This reverts commit dc3111a.
agupta01 pushed a commit to agupta01/airflow that referenced this pull request Jan 13, 2025
agupta01 pushed a commit to agupta01/airflow that referenced this pull request Jan 13, 2025
* Revert "Fix the way to get STS endpoint in EKS hook (apache#45520)"

This reverts commit 103df61.

* Revert "Fix `fetch_access_token_for_cluster` in EKS hook (apache#45469)"

This reverts commit dc3111a.
karenbraganz pushed a commit to karenbraganz/airflow that referenced this pull request Jan 13, 2025
* Revert "Fix the way to get STS endpoint in EKS hook (apache#45520)"

This reverts commit 103df61.

* Revert "Fix `fetch_access_token_for_cluster` in EKS hook (apache#45469)"

This reverts commit dc3111a.
HariGS-DB pushed a commit to HariGS-DB/airflow that referenced this pull request Jan 16, 2025
HariGS-DB pushed a commit to HariGS-DB/airflow that referenced this pull request Jan 16, 2025
* Revert "Fix the way to get STS endpoint in EKS hook (apache#45520)"

This reverts commit 103df61.

* Revert "Fix `fetch_access_token_for_cluster` in EKS hook (apache#45469)"

This reverts commit dc3111a.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:providers provider:amazon AWS/Amazon - related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

There is a bug in connecting to EKS using the airflow.providers.amazon.aws.operators.eks library in China.
3 participants