Understanding the OpenTelemetry Collector configuration of openobserve-collector #117

Open
jennydaman opened this issue Jan 21, 2025 · 0 comments

jennydaman commented Jan 21, 2025

tl;dr: the values.yaml of openobserve-collector is over-complicated. A simpler setup can be achieved with the upstream OpenTelemetry Collector chart.

I am reviewing the code of the openobserve-collector chart and would like to ask some questions about how it works.

Currently I'm running a Kubernetes cluster with OpenObserve deployed in the monitoring namespace. Instead of using the openobserve-collector chart, I am using the upstream OpenTelemetry Collector chart with presets enabled. The setup can be expressed as a relatively concise helmfile:

repositories:
  - name: open-telemetry
    url: https://open-telemetry.github.io/opentelemetry-helm-charts

releases:
  - name: collector-agent
    namespace: monitoring
    chart: open-telemetry/opentelemetry-collector
    version: 0.111.2
    values:
      - image:
          repository: ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector-k8s
        mode: daemonset
        presets:
          logsCollection:
            enabled: true
          hostMetrics:
            enabled: true
          kubernetesAttributes:
            enabled: true
            extractAllPodLabels: true
            extractAllPodAnnotations: false
          kubeletMetrics:
            enabled: true
        config: &CONFIG
          receivers:
            kubeletstats:
              insecure_skip_verify: true
          exporters:
            otlp/openobserve:
              endpoint: http://openobserve.monitoring.svc:5081
              headers:
                Authorization: {{
                  printf "%s:%s"
                    (fetchSecretValue "ref+k8s://v1/Secret/monitoring/openobserve-root-user/ZO_ROOT_USER_EMAIL")
                    (fetchSecretValue "ref+k8s://v1/Secret/monitoring/openobserve-root-user/ZO_ROOT_USER_PASSWORD")
                  | b64enc | print "Basic " | quote
                }}
                organization: default
                stream-name: default
              tls:
                insecure: true
          service:
            pipelines:
              logs:
                exporters:
                  - otlp/openobserve
              metrics:
                exporters:
                  - otlp/openobserve
              traces:
                exporters:
                  - otlp/openobserve
        resources: {} # -- snip --

  - name: collector-cluster
    namespace: monitoring
    chart: open-telemetry/opentelemetry-collector
    version: 0.111.2
    values:
      - image:
          repository: ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector-k8s
        mode: deployment
        replicaCount: 1
        presets:
          clusterMetrics:
            enabled: true
          kubernetesEvents:
            enabled: true
        config: *CONFIG
        resources: {} # -- snip --
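
Assuming the file above is saved as helmfile.yaml (helmfile's default file name), both releases can be installed or updated in one step; the Authorization template renders at deploy time to a standard HTTP basic-auth header, i.e. Basic followed by the base64 of email:password, read from the openobserve-root-user secret:

helmfile apply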

The helmfile.yaml defines two releases: collector-agent runs as a daemonset and handles log collection (as well as host and kubelet metrics), while collector-cluster runs as a single-replica deployment for cluster-level metrics and Kubernetes events. The collector configuration generated for collector-agent can be obtained with the command:

kubectl get -n monitoring configmap collector-agent-opentelemetry-collector-agent -o jsonpath='{.data.relay}'

Upstream OpenTelemetry Collector generated configuration:
exporters:
  debug: {}
  otlp/openobserve:
    endpoint: http://openobserve.monitoring.svc:5081
    headers:
      Authorization: Basic ZGV2QGJhYnltcmkub3JnOmNocmlzMTIzNA==
      organization: default
      stream-name: otel-chart
    tls:
      insecure: true
extensions:
  health_check:
    endpoint: ${env:MY_POD_IP}:13133
processors:
  batch: {}
  k8sattributes:
    extract:
      labels:
      - from: pod
        key_regex: (.*)
        tag_name: $$1
      metadata:
      - k8s.namespace.name
      - k8s.deployment.name
      - k8s.statefulset.name
      - k8s.daemonset.name
      - k8s.cronjob.name
      - k8s.job.name
      - k8s.node.name
      - k8s.pod.name
      - k8s.pod.uid
      - k8s.pod.start_time
    filter:
      node_from_env_var: K8S_NODE_NAME
    passthrough: false
    pod_association:
    - sources:
      - from: resource_attribute
        name: k8s.pod.ip
    - sources:
      - from: resource_attribute
        name: k8s.pod.uid
    - sources:
      - from: connection
  memory_limiter:
    check_interval: 5s
    limit_percentage: 80
    spike_limit_percentage: 25
receivers:
  filelog:
    exclude:
    - /var/log/pods/monitoring_collector-agent-opentelemetry-collector*_*/opentelemetry-collector/*.log
    include:
    - /var/log/pods/*/*/*.log
    include_file_name: false
    include_file_path: true
    operators:
    - id: container-parser
      max_log_size: 102400
      type: container
    retry_on_failure:
      enabled: true
    start_at: end
  hostmetrics:
    collection_interval: 10s
    root_path: /hostfs
    scrapers:
      cpu: null
      disk: null
      filesystem:
        exclude_fs_types:
          fs_types:
          - autofs
          - binfmt_misc
          - bpf
          - cgroup2
          - configfs
          - debugfs
          - devpts
          - devtmpfs
          - fusectl
          - hugetlbfs
          - iso9660
          - mqueue
          - nsfs
          - overlay
          - proc
          - procfs
          - pstore
          - rpc_pipefs
          - securityfs
          - selinuxfs
          - squashfs
          - sysfs
          - tracefs
          match_type: strict
        exclude_mount_points:
          match_type: regexp
          mount_points:
          - /dev/*
          - /proc/*
          - /sys/*
          - /run/k3s/containerd/*
          - /var/lib/docker/*
          - /var/lib/kubelet/*
          - /snap/*
      load: null
      memory: null
      network: null
  jaeger:
    protocols:
      grpc:
        endpoint: ${env:MY_POD_IP}:14250
      thrift_compact:
        endpoint: ${env:MY_POD_IP}:6831
      thrift_http:
        endpoint: ${env:MY_POD_IP}:14268
  kubeletstats:
    auth_type: serviceAccount
    collection_interval: 20s
    endpoint: ${env:K8S_NODE_IP}:10250
    insecure_skip_verify: true
  otlp:
    protocols:
      grpc:
        endpoint: ${env:MY_POD_IP}:4317
      http:
        endpoint: ${env:MY_POD_IP}:4318
  prometheus:
    config:
      scrape_configs:
      - job_name: opentelemetry-collector
        scrape_interval: 10s
        static_configs:
        - targets:
          - ${env:MY_POD_IP}:8888
  zipkin:
    endpoint: ${env:MY_POD_IP}:9411
service:
  extensions:
  - health_check
  pipelines:
    logs:
      exporters:
      - otlp/openobserve
      processors:
      - k8sattributes
      - memory_limiter
      - batch
      receivers:
      - otlp
      - filelog
    metrics:
      exporters:
      - otlp/openobserve
      processors:
      - k8sattributes
      - memory_limiter
      - batch
      receivers:
      - otlp
      - prometheus
      - hostmetrics
      - kubeletstats
    traces:
      exporters:
      - otlp/openobserve
      processors:
      - k8sattributes
      - memory_limiter
      - batch
      receivers:
      - otlp
      - jaeger
      - zipkin
  telemetry:
    metrics:
      address: ${env:MY_POD_IP}:8888

Here is an example log entry shown in OpenObserve, ingested using the above upstream OpenTelemetry Collector configuration:

{
  "_timestamp": 1737473264323746,
  "app": "openobserve",
  "apps_kubernetes_io_pod_index": "0",
  "body": "2025-01-21T15:27:44.323513488+00:00 INFO actix_web::middleware::logger: 172.18.0.4 \"GET /api/default/otel_chart/_values?fields=k8s_container_name&size=10&start_time=1737472364215000&end_time=1737473264215000&sql=U0VMRUNUICogRlJPTSAib3RlbF9jaGFydCIg&type=logs HTTP/1.1\" 200 250 \"-\" \"http://localhost:32020/web/logs?stream_type=logs&stream=otel_chart&period=15m&refresh=0&sql_mode=false&query=YXBwX2t1YmVybmV0ZXNfaW9fbmFtZSA9ICdjaHJpcy13b3JrZXItbWFpbnMn&type=stream_explorer&defined_schemas=user_defined_schema&org_identifier=default&quick_mode=false&show_histogram=true\" \"Mozilla/5.0 (X11; Linux x86_64; rv:134.0) Gecko/20100101 Firefox/134.0\" 0.099962",
  "controller_revision_hash": "openobserve-69f6d688f6",
  "dropped_attributes_count": 0,
  "k8s_container_name": "openobserve",
  "k8s_container_restart_count": "1",
  "k8s_namespace_name": "monitoring",
  "k8s_node_name": "khris-worker",
  "k8s_pod_name": "openobserve-0",
  "k8s_pod_start_time": "2025-01-20T22:14:56Z",
  "k8s_pod_uid": "1c857c0a-066e-40ba-8676-6c874631f1ca",
  "k8s_statefulset_name": "openobserve",
  "log_file_path": "/var/log/pods/monitoring_openobserve-0_1c857c0a-066e-40ba-8676-6c874631f1ca/openobserve/1.log",
  "log_iostream": "stdout",
  "logtag": "F",
  "name": "openobserve",
  "severity": 0,
  "statefulset_kubernetes_io_pod_name": "openobserve-0"
}

Meanwhile, openobserve-collector's default values.yaml specifies a format router and regex parsers with named capture groups to parse container log lines and extract metadata from log file paths:

# Find out which format is used by kubernetes
- type: router
  id: get-format
  routes:
    - output: parser-docker
      expr: 'body matches "^\\{"'
    - output: parser-crio
      expr: 'body matches "^[^ Z]+ "'
    - output: parser-containerd
      expr: 'body matches "^[^ Z]+Z"'
# Parse CRI-O format
- type: regex_parser
  id: parser-crio
  regex: "^(?P<time>[^ Z]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$"
  output: extract_metadata_from_filepath
  timestamp:
    parse_from: attributes.time
    layout_type: gotime
    layout: "2006-01-02T15:04:05.999999999Z07:00"
# Parse CRI-Containerd format
- type: regex_parser
  id: parser-containerd
  regex: "^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$"
  output: extract_metadata_from_filepath
  timestamp:
    parse_from: attributes.time
    layout: "%Y-%m-%dT%H:%M:%S.%LZ"
# Parse Docker format
- type: json_parser
  id: parser-docker
  output: extract_metadata_from_filepath
  timestamp:
    parse_from: attributes.time
    layout: "%Y-%m-%dT%H:%M:%S.%LZ"
# Extract metadata from file path
- type: regex_parser
  id: extract_metadata_from_filepath
  regex: '^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\-]{36})\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$'
  parse_from: attributes["log.file.path"]
  cache:
    size: 128 # default maximum amount of Pods per Node is 110

Seeing that the upstream chart's generated config already produces logs with the metadata k8s_pod_name, k8s_namespace_name, etc. (via the k8sattributes processor and the filelog receiver's container operator) from a much simpler values.yaml, why does openobserve-collector's values.yaml need these regexes?
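
For comparison, here is a minimal sketch (not taken from either chart, just illustrating the idea) of how the same metadata can be obtained without the router/regex chain, using the filelog receiver's container operator together with the k8sattributes processor, as the upstream preset does in the generated config above:

receivers:
  filelog:
    include:
      - /var/log/pods/*/*/*.log
    include_file_path: true
    operators:
      # the container operator auto-detects the docker-json, containerd and
      # CRI-O log formats and extracts k8s.pod.name, k8s.namespace.name,
      # k8s.container.name, etc. from the log file path, replacing the
      # router and regex_parser operators shown above
      - type: container
        id: container-parser
processors:
  k8sattributes:
    # enriches records with pod labels and workload metadata queried from
    # the Kubernetes API rather than parsed out of file names
    extract:
      metadata:
        - k8s.namespace.name
        - k8s.pod.name
        - k8s.pod.uid
        - k8s.node.name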
