Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Duplicate entries in app insights #1214

Open
DSilvadev opened this issue Jul 31, 2023 · 9 comments
Open

Duplicate entries in app insights #1214

DSilvadev opened this issue Jul 31, 2023 · 9 comments
Labels
azure Microsoft Azure bug

Comments

@DSilvadev
Copy link

Hi everyone, I'm getting duplicate logs with the exact timestamp in app insights, as the screenshot below shows:

image

The code I'm using:

import logging
from opencensus.ext.azure.log_exporter import AzureLogHandler
import sys


_logger = None
      
_logger = logging.getLogger('test')

if not _logger.handlers:

    formatter = logging.Formatter('%(message)s')

    azure_handler = AzureLogHandler(
        connection_string="<my_key>")

    azure_handler.setFormatter(formatter)

    _logger.addHandler(azure_handler)

    std_out_handler = logging.StreamHandler(sys.stdout)
    std_out_handler.setLevel(logging.DEBUG)

    _logger.addHandler(std_out_handler)

_logger.setLevel(logging.INFO)

_logger.propagate = False
_logger.root.propagate = False


properties = {'custom_dimensions': {'group': 'app_info'}}
_logger.info('test message', extra=properties)

I only get one message with the handler std_out_handler and when I sent message to the _logger the message is duplicated.

Any ideas how to solve / reason for this ?

Python version: 3.10
Opencensus version: 1.1.9

Thanks !

@DSilvadev DSilvadev added the bug label Jul 31, 2023
@lzchen
Copy link
Contributor

lzchen commented Sep 15, 2023

Are you perhaps using a databricks notebook to run this? This error usually occurs when you are adding multiple instances of the azure log handler to your logger.

@lzchen lzchen added the azure Microsoft Azure label Sep 18, 2023
@mosopeogundipe
Copy link

I also have the exact same issue as @DSilvadev , see image of duplicate entries below which have the same timestamp. This error happens whether I use opencensus library in locally or databricks environment. And I am adding the azure log handler to my logger just once.

Screenshot 2023-10-31 at 3 40 55 PM

@lzchen
Copy link
Contributor

lzchen commented Nov 1, 2023

@mosopeogundipe

Could you paste a code snippet of how you are using the azure log handler?

@juniinacio
Copy link

juniinacio commented Nov 7, 2023

Hi, I also experience the same issue when exporting logs to Azure in a Python Azure Function, also duplicate messages are being logged in app insights:

import os
import json
import logging
from typing import List

import azure.functions as func

from opencensus.ext.azure.log_exporter import AzureLogHandler

from opencensus.ext.azure.trace_exporter import AzureExporter
from opencensus.trace import config_integration
from opencensus.trace.samplers import ProbabilitySampler
from opencensus.trace.tracer import Tracer
from opencensus.trace import execution_context
from opencensus.trace.propagation.trace_context_http_header_format import TraceContextPropagator

config_integration.trace_integrations(['requests'])
config_integration.trace_integrations(["logging"])

def callback_add_role_name(envelope):
    envelope.tags["ai.cloud.role"] = os.environ.get("WEBSITE_SITE_NAME")
    return True

exporter = AzureExporter()
exporter.add_telemetry_processor(callback_add_role_name)

handler = AzureLogHandler()
handler.add_telemetry_processor(callback_add_role_name)

logger = logging.getLogger(__name__)

logger.addHandler(handler)

from . additional_functions import *

def main(events: List[func.EventHubEvent], PSBStore: func.Out[bytes], ADXStore: func.Out[str], context: func.Context) -> None:
    
    args = get_args()

    span_context = TraceContextPropagator().from_headers({"traceparent": context.trace_context.Traceparent, "tracestate": context.trace_context.Tracestate})
    tracer = Tracer(span_context=span_context, exporter=exporter, sampler=ProbabilitySampler(1.0))
    execution_context.set_opencensus_tracer(tracer)
      
    predictions = []
    for trajectory in args:
        logger.info(f"this message will be duplicated in trace logs {trajectory}...")
        prediction = inference(events, trajectory, args[trajectory], logger, tracer)
        
        extra = {'custom_dimensions': prediction}
        logger.info(
            f"this message will also be duplicated in trace logs {trajectory}: {json.dumps(prediction)}",
            extra=extra
        )
        
        predictions.append(prediction)

Also operation id is not being set on the trace message with custom dimensions:

image

Message without custom dimension:

image

I'm new at writing python code so my apologizes with that.

@lzchen
Copy link
Contributor

lzchen commented Nov 7, 2023

Try adding:

root_logger = logging.getLogger()
for handler in root_logger.handlers[:]:
    root_logger.removeHandler(handler)

before

logger = logging.getLogger(__name__)

in your code to remove duplicate logs.

@juniinacio
Copy link

@lzchen will try it and let you know the results, thanks.

@juniinacio
Copy link

Just did some testing, and I can confirm that removing the handlers from the root logger corrected the issue with the duplicate messages for me, also correlation id etc.. works also. Only one downside I have noticed is that in this way, the custom dimensions are overridden, meaning you will lose the default custom properties set by Azure Functions as soon as you specify the custom dimensions. Once again thanks for your quick reply, really helpful.

@mosopeogundipe
Copy link

@mosopeogundipe

Could you paste a code snippet of how you are using the azure log handler?

@lzchen here's my code snippet below. The first code block contains classes I defined which I call in the 2nd code block below, so that it can send logs from my entire app to app insights.

Logger classes:

import logging
import os
import uuid
from logging import LogRecord
from typing import Any, Dict

from opencensus.ext.azure.common.protocol import Envelope
from opencensus.ext.azure.log_exporter import AzureLogHandler


class AzureAppInsightsLogger:
    logger = None

    def __init__(
        self,
        app_name: str,
        operation_group_id: str = None,
        additional_base_properties: Dict[str, Any] = None,
        log_level: int = logging.INFO,
        logger: logging.Logger = None,
        instrumentation_key: str = None,
        **options: Dict[str, Any],
    ):
        self.add_handler(app_name, operation_group_id, additional_base_properties, log_level, logger, instrumentation_key, **options)

    @classmethod
    def get_logger(cls) -> logging.Logger:
        """
        :type name: str
        :param name: name of the logger
        :return: a logger with the given name
        """
        return cls.logger

    @classmethod
    def add_handler(
        cls,
        app_name: str,
        operation_group_id: str = None,
        additional_base_properties: Dict[str, Any] = None,
        log_level: int = logging.INFO,
        logger: logging.Logger = None,
        instrumentation_key: str = None,
        **options: Dict[str, Any],
    ):
        appinsight_handler = _AzureLogHandlerWithCtx(
            app_name=app_name,
            operation_group_id=operation_group_id,
            additional_base_properties=additional_base_properties,
            instrumentation_key=instrumentation_key,
            **options,
        )
        appinsight_handler.setLevel(log_level)
        cls.logger = logger if logger else logging.getLogger('')
        cls.logger.setLevel(log_level)
        if _AzureLogHandlerWithCtx.is_live_mode():
            cls.logger.addHandler(appinsight_handler)


class _AzureLogHandlerWithCtx(AzureLogHandler):
    def __init__(
        self,
        app_name: str,
        operation_group_id: str = None,
        additional_base_properties: Dict[str, Any] = None,
        instrumentation_key: str = None,
        **options: Dict[str, Any],
    ):
        self.app_name = app_name
        self.operation_group_id = operation_group_id or uuid.uuid4().urn
        self.additional_base_properties = additional_base_properties or {}
        self._options = options or {}
        self.add_instrumentation_key(self._options, instrumentation_key)
        super().__init__(**self._options)

    @staticmethod
    def add_instrumentation_key(
        options: Dict[str, Any],
        instrumentation_key: str = None,
    ):
        if 'connection_string' not in options or options['connection_string'] is None:
            instr_key = instrumentation_key or os.getenv(LOGGER_INSTRUMENTATION_KEY)
            options['connection_string'] = f'InstrumentationKey={instr_key}'

    @staticmethod
    def is_live_mode():
        return os.getenv(LOGGER_LIVE_MODE, 'true') == 'true'

    def log_record_to_envelope(self, record: LogRecord) -> Envelope:
        """
        :type record: logging.LogRecord
        :param record: LogRecord object required by the base class 'AzureLogHandler'
        :return: opencensus.ext.azure.common.protocol.Envelope
        """
        envelope = super().log_record_to_envelope(record)

        # set this filterable value
        envelope.tags['ai.cloud.role'] = self.app_name
        if not envelope.data.baseData.properties:
            envelope.data.baseData.properties = {}
        # Fill custom properties
        for k, v in self.additional_base_properties.items():
            envelope.data.baseData.properties[k] = v

        # This lets us combine all the logs of a particular run
        if self.operation_group_id:
            envelope.tags['ai.operation.id'] = self.operation_group_id

        return envelope

Adding AzureAppInsightsLogger to root handler is done here in my main.py. This enables logs from my app to send to app insights:

AzureAppInsightsLogger.add_handler(
    app_name='my-app',
    logger=logging.getLogger(''),
    instrumentation_key='xxxxxx',
    log_level=logging.DEBUG,
)

@jorgeczaiden
Copy link

Actually, this is not a bug.

Running code in a workspace that shares the same globals can lead to this issue. For example, in an Azure Function App, two different functions may share the same logger and handler.

To avoid this, you could try adding the following before initializing the logger:

Try adding:

root_logger = logging.getLogger()
for handler in root_logger.handlers[:]:
    root_logger.removeHandler(handler)

before

logger = logging.getLogger(__name__)

in your code to remove duplicate logs.

This will help remove duplicate logs. However, it might cause issues with certain logs.

Personally, I prefer adding a condition that prevents creating another handler:

logger = logging.getLogger(__name__)
        if not any(isinstance(handler, AzureEventHandler) for handler in logger.handlers):
            logger.addHandler(
                AzureEventHandler(
                    connection_string=conn_str
                )
            )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
azure Microsoft Azure bug
Projects
None yet
Development

No branches or pull requests

5 participants