Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider using the 'identifier' parameter in TaskLogHandler #57

Open
rafidka opened this issue May 24, 2024 · 0 comments
Open

Consider using the 'identifier' parameter in TaskLogHandler #57

rafidka opened this issue May 24, 2024 · 0 comments

Comments

@rafidka
Copy link
Contributor

rafidka commented May 24, 2024

Overview

Currently, we are not making use of the identifier parameter in our TaskLogHandler:

def set_context(self, ti: TaskInstance, *, identifier: str | None = None) -> None:
    """
    Provide context to the logger.

    This method is called by Airflow to provide the necessary context to configure
    the handler. In this case, Airflow is passing us the task instance the logs
    are for.

    :param ti: The task instance generating the logs.
    :param identifier: Airflow uses this when relaying exceptional messages to task
    logs from a context other than task itself. We ignore this parameter in this
    handler, i.e. those exceptional messages will go to the same log stream.
    """

    logs_client: CloudWatchLogsClient = boto3.client("logs")  # type: ignore

    if self.enabled:
        # identical to open-source implementation, except create_log_group set to False
        self.handler = watchtower.CloudWatchLogHandler(
            log_group_name=self.log_group_name,
            log_stream_name=self._render_filename(ti, ti.try_number),  # type: ignore
            boto3_client=logs_client,
            use_queues=True,
            create_log_group=False,
        )

        if self.formatter:
            self.handler.setFormatter(self.formatter)
    else:
        self.handler = None

We can potentially use it to log those exceptional messages into a separate task log, making them easier to discover. Searching Airflow code base, I identified the following logs:

It is probably useful to have those messages in a separate log stream so they can be easily detected, but I would like us to a do some investigation on this before making a decision.

Acceptance Criteria

  • Determine whether it is beneficial to use the 'identifier' parameter.
  • If yes, make the necessary code change. Otherwise, document the finding and resolve this issue.

Additional Info

N/A

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant