allow setting otel sysprops via spark-submit
barend-xebia committed Dec 2, 2024
1 parent df950a4 commit 9b5bb64
Showing 3 changed files with 44 additions and 2 deletions.
23 changes: 22 additions & 1 deletion README.md
@@ -61,12 +61,33 @@ If you're using Spark on top of Kubernetes, you should install and configure the

The automatic configuration is controlled by a set of environment variables or JVM system properties. These are documented here: [configuration][otel-config].

#### As Environment Variables

Use any mechanism of choice, such as shell exports:

```bash
export OTEL_TRACES_EXPORTER=zipkin
export OTEL_EXPORTER_ZIPKIN_ENDPOINT=http://localhost:9411/api/v2/spans
```

Note: if you use the Kubernetes Operator, these environment variables are controlled there.

#### As JVM System Properties

In addition to the standard JVM mechanisms, system properties can also be passed to Spot as `--conf` arguments to the spark-submit command:

```diff
SPARK_VERSION=3.5
SCALA_VERSION=2.12
spark-submit \
--jars com.xebia.data.spot.spot-complete-${SPARK_VERSION}_${SCALA_VERSION}-x.y.z.jar \
--conf spark.extraListeners=com.xebia.data.spot.TelemetrySparkListener \
+ --conf spark.otel.traces.exporter=zipkin \
+ --conf spark.otel.exporter.zipkin.endpoint=http://localhost:9411/api/v2/spans \
com.example.MySparkJob
```

All options whose names start with `spark.otel` are exposed this way, with the `spark.` prefix stripped. Note: any existing JVM system property of the same name is overwritten.
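As a rough sketch of the mapping (the property names and values below are only examples), each qualifying key loses its `spark.` prefix before the OpenTelemetry autoconfiguration reads it as a system property:

```scala
// Illustration of the key mapping only, not the actual listener implementation:
// the "spark." prefix is dropped and the remainder becomes the system property name.
object SparkOtelKeyMapping extends App {
  val examples = Seq(
    "spark.otel.traces.exporter"          -> "zipkin",
    "spark.otel.exporter.zipkin.endpoint" -> "http://localhost:9411/api/v2/spans"
  )
  examples.foreach { case (confKey, value) =>
    val sysPropKey = confKey.stripPrefix("spark.") // e.g. "otel.traces.exporter"
    println(s"--conf $confKey=$value  ==>  -D$sysPropKey=$value")
  }
}
```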

### Configuring OpenTelemetry SDK Manually

@@ -11,6 +11,10 @@ import java.util.concurrent.TimeUnit
/**
* Uses OpenTelemetry Autoconfigure to build an OpenTelemetry SDK.
*
* Any SparkConf properties that start with `spark.otel` (such as `spark.otel.service.name`) are exposed as JVM system
* properties (sans `spark.` prefix). This allows otel configuration (see link below) to be included as `--conf` args
* to spark-submit.
*
* To configure the autoconf SDK, see [[https://opentelemetry.io/docs/languages/java/configuration/]]. If you're on
* Kubernetes, have a look at the OpenTelemetry Operator.
*/
@@ -19,6 +23,18 @@ class SdkProvider extends OpenTelemetrySdkProvider {

override def get(config: Map[String, String]): OpenTelemetrySdk = {
logger.info("Using AutoConfigured OpenTelemetry SDK.")
config.foreach {
case (k,v) if k.startsWith("spark.otel") =>
val otelProperty = k.substring(6)
sys.props.get(otelProperty) match {
case Some(old) =>
logger.info(s"Replacing '$otelProperty' in JVM system properties, changing it from '$old' to '$v'.")
case None =>
logger.info(s"Adding '$otelProperty' to JVM system properties as '$v'.")
}
sys.props.put(otelProperty, v)
case _ =>
}
val sdk = AutoConfiguredOpenTelemetrySdk.initialize().getOpenTelemetrySdk

sys.addShutdownHook {
@@ -11,9 +11,14 @@ class AutoconfiguredOpenTelemetrySdkProviderTest extends AnyFlatSpec with should
val uh = new TestOpenTelemetrySupport()
// TODO improve verification;
uh.openTelemetry should not be (null)
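// The value passed via "spark.otel.service.name" should surface as the SDK's service.name resource attribute.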
uh.openTelemetry.toString should matchPattern {
case s: String if s.contains("attributes={service.name=\"this is a test\"") =>
}
}
}

private[this] class TestOpenTelemetrySupport extends OpenTelemetrySupport {
override def spotConfig: Map[String, String] = Map(
"spark.otel.service.name" -> "this is a test"
)
}
