Memory usage optimization: discard compute closure after evaluation in Lazy (#214)

This PR implements a memory usage optimization related to `Lazy`: once
we have evaluated a Lazy value, we no longer need to retain the closure
that performed the evaluation. Discarding closures (and, transitively,
their dependencies) after lazy evaluation saves memory.

## Motivation and discovery of this issue

We have a single jsonnet file which has very high peak memory
requirements during evaluation. I captured a heap dump and noticed
significant memory usage due to `sjsonnet.Lazy[]`, with over half the
shallow size consumed by `Lazy[]` arrays:


![image](https://github.com/user-attachments/assets/b09f5011-a81a-40fe-af66-cf364f7456de)

From the merged paths view, we see that most of these are retained from
anonymous classes:


![image](https://github.com/user-attachments/assets/5d8dd12a-e0d1-4472-8cf7-0117f8baf030)

For example, here `sjsonnet.Evaluator$$anonfun$visitAsLazy$2`
corresponds to the `() => visitExpr(e)` in
https://github.com/databricks/sjsonnet/blob/759cea713bc8e39f45b844a75358fd780f75d480/sjsonnet/src/sjsonnet/Evaluator.scala#L81-L84

That is defining an anonymous implementation of
[sjsonnet.Lazy](https://github.com/databricks/sjsonnet/blob/759cea713bc8e39f45b844a75358fd780f75d480/sjsonnet/src/sjsonnet/Val.scala#L19-L26)
using the [single abstract method
type](https://bargsten.org/scala/anonymous-class-and-sam/) syntax.

Here,
[visitExpr](https://github.com/databricks/sjsonnet/blob/759cea713bc8e39f45b844a75358fd780f75d480/sjsonnet/src/sjsonnet/Val.scala#L551-L552)
takes an implicit
[ValScope](https://github.com/databricks/sjsonnet/blob/759cea713bc8e39f45b844a75358fd780f75d480/sjsonnet/src/sjsonnet/ValScope.scala#L5-L20)
parameter. ValScope is a [value
class](https://docs.scala-lang.org/overviews/core/value-classes.html),
wrapping a `bindings: Array[Lazy]`.

Thus, our `sjsonnet.Evaluator$$anonfun$visitAsLazy$2` anonymous class
captures the values needed to evaluate the `visitExpr(e)` closure,
including the `bindings` array, and thereby contributes to the high
count of retained `Array[Lazy]`. We can also see this from the object
inspection in the heap dump:


![image](https://github.com/user-attachments/assets/7469bfd1-4c08-4a21-ab20-661614abf7dd)
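
The capture described above can be sketched in isolation. This is a minimal illustration, not sjsonnet's code: `Lazy`, `ValScope`, `visitExpr` and `visitAsLazy` are simplified stand-ins that reuse the real names, `CaptureDemo` is a hypothetical wrapper object, and values are plain strings.

```scala
// Hypothetical, simplified stand-ins for sjsonnet's types. The names mirror
// the PR description, but the bodies are illustrative only.
abstract class Lazy {
  def compute(): String
}

// The real ValScope is a value class wrapping `bindings`; a plain class keeps
// this sketch short.
final class ValScope(val bindings: Array[Lazy])

object CaptureDemo {
  // Needs the implicit scope, like the evaluator's visitExpr.
  def visitExpr(e: String)(implicit scope: ValScope): String =
    s"$e evaluated with ${scope.bindings.length} bindings in scope"

  // SAM syntax: the function literal becomes an anonymous subclass of Lazy.
  // Its body uses `e` and the implicit `scope`, so the instance holds
  // references to both for as long as the instance itself is reachable.
  def visitAsLazy(e: String)(implicit scope: ValScope): Lazy =
    () => visitExpr(e)
}
```

Every `Lazy` created this way keeps its scope's `bindings` array alive, even after the value has been computed, which is the retention pattern visible in the heap dump.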

## Insight behind the optimization: we don't need the closure after evaluation

In the heap dump that I looked at, most of these anonymous Lazy
instances had non-null `cached` fields, meaning that their lazy values
had been computed. At this point the value will not be recomputed, so we
can discard the closure and, transitively, its heavyweight bindings,
which in turn reference more closures and bindings, and so on.

I also draw inspiration (but not implementation) from a neat behavior of
Scala `lazy val`s where class constructor parameters that are used
exclusively in `lazy val` are discarded after successful lazy
evaluation. For instance, the code

```scala
class Lazy(f: () => String) {
  lazy val value = f()
}
```

decompiles (via [cfr](https://www.benf.org/other/cfr/)) into:

```java
public class Lazy {
    private String value;
    private Function0<String> f;
    private volatile boolean bitmap$0;

    private String value$lzycompute() {
        Lazy lazy = this;
        synchronized (lazy) {
            if (!this.bitmap$0) {
                this.value = (String)this.f.apply();
                this.bitmap$0 = true;
            }
        }
        this.f = null;
        return this.value;
    }

    public String value() {
        if (!this.bitmap$0) {
            return this.value$lzycompute();
        }
        return this.value;
    }

    public Lazy(Function0<String> f) {
        this.f = f;
    }
}
```

demonstrating how the closure is discarded after lazy evaluation.

## Implementation

This PR implements a similar optimization. I introduce a new class
`LazyWithComputeFunc(@volatile private[this] var computeFunc: () => Val) extends Lazy`
which can be used in place of the anonymous classes and which discards
the closure after evaluation.

The underlying implementation is slightly tricky because of a few
performance considerations:

- It's really important that we optimize performance for the case where
a value is computed once and read many times. Commit d26e9db previously
took pains to ensure that `force` was monomorphic, so I couldn't change
that.
- Similarly, I don't want to introduce any locks or synchronization, nor
any volatile accesses on hot paths.
- Finally, I must ensure that the code is thread safe: it's acceptable
to redundantly compute a value if we have concurrent initialization, but
we can't NPE or otherwise crash.

Here, I have chosen to make `computeFunc` a volatile field and check it
inside of `compute()`. In ordinary cases, `compute()` will only be
called once because `force` checks whether `cached` has already been
computed. In the rare case of concurrent calls to `compute()`, we first
check whether `computeFunc` has been nulled. If it is null, then some
other thread has already computed the value and assigned `cached` from
within `compute()` itself. That write is guaranteed to be visible to the
thread that lost the race: `cached` is written before the volatile write
that nulls `computeFunc`, so the volatile read of `computeFunc`
provides piggybacked visibility of the winner's earlier writes (see
https://stackoverflow.com/a/8769692).
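
The race described above can be exercised with a small self-contained sketch. Several parts are assumptions made for illustration: `Val` is a stand-in case class, the body of `force` is inferred from the description (the diff elides it), and `closureReleased` and `RaceDemo` are hypothetical helpers that do not exist in sjsonnet.

```scala
import java.util.concurrent.{Callable, CountDownLatch, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for sjsonnet's Val, so the sketch is self-contained.
final case class Val(repr: String)

// Assumed shape of Lazy: `force` only calls `compute()` while `cached` is null.
abstract class Lazy {
  protected[this] var cached: Val = null
  def compute(): Val
  final def force: Val = {
    if (cached == null) cached = compute()
    cached
  }
}

final class LazyWithComputeFunc(@volatile private[this] var computeFunc: () => Val) extends Lazy {
  def compute(): Val = {
    val f = computeFunc      // volatile read
    if (f != null) {         // closure still present: run it
      val result = f()
      cached = result        // plain write, published by the volatile write below
      computeFunc = null     // volatile write: drops the closure
    }
    cached                   // non-null whether we computed or observed the null
  }

  // Hypothetical helper, added only so the sketch can observe the release.
  def closureReleased: Boolean = computeFunc == null
}

object RaceDemo {
  /** Forces one shared Lazy from `threads` threads at once and returns
    * (values seen, number of times the closure ran, whether it was released). */
  def run(threads: Int): (Seq[Val], Int, Boolean) = {
    val computeRuns = new AtomicInteger(0)
    val shared = new LazyWithComputeFunc(() => { computeRuns.incrementAndGet(); Val("ok") })
    val start = new CountDownLatch(1)
    val pool = Executors.newFixedThreadPool(threads)
    try {
      val futures = (1 to threads).map { _ =>
        pool.submit(new Callable[Val] {
          def call(): Val = { start.await(); shared.force }
        })
      }
      start.countDown()      // release all workers together to provoke the race
      val results = futures.map(_.get(10, TimeUnit.SECONDS))
      (results, computeRuns.get(), shared.closureReleased)
    } finally pool.shutdown()
  }
}
```

Calling `RaceDemo.run(8)` should return eight equal, non-null values and report the closure as released. The closure may run more than once under contention, which matches the stated requirement that redundant computation is acceptable but a crash is not.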

## Testing

This passes existing unit tests. I did some manual heap dump tests to
validate that this cuts memory usage on small toy examples. I have not
yet run this end-to-end on the real workload which generated the
original heap dumps.
JoshRosen authored Dec 3, 2024
1 parent d5326c6 commit 44abc72
Showing 3 changed files with 26 additions and 8 deletions.
10 changes: 5 additions & 5 deletions sjsonnet/src/sjsonnet/Evaluator.scala
@@ -80,7 +80,7 @@ class Evaluator(resolver: CachedResolver,

def visitAsLazy(e: Expr)(implicit scope: ValScope): Lazy = e match {
case v: Val => v
- case e => () => visitExpr(e)
+ case e => new LazyWithComputeFunc(() => visitExpr(e))
}

def visitValidId(e: ValidId)(implicit scope: ValScope): Val = {
@@ -104,7 +104,7 @@ class Evaluator(resolver: CachedResolver,
val b = bindings(i)
newScope.bindings(base+i) = b.args match {
case null => visitAsLazy(b.rhs)(newScope)
- case argSpec => () => visitMethod(b.rhs, argSpec, b.pos)(newScope)
+ case argSpec => new LazyWithComputeFunc(() => visitMethod(b.rhs, argSpec, b.pos)(newScope))
}
i += 1
}
@@ -490,9 +490,9 @@ class Evaluator(resolver: CachedResolver,
val b = bindings(i)
arrF(i) = b.args match {
case null =>
- (self: Val.Obj, sup: Val.Obj) => () => visitExpr(b.rhs)(scope(self, sup))
+ (self: Val.Obj, sup: Val.Obj) => new LazyWithComputeFunc(() => visitExpr(b.rhs)(scope(self, sup)))
case argSpec =>
- (self: Val.Obj, sup: Val.Obj) => () => visitMethod(b.rhs, argSpec, b.pos)(scope(self, sup))
+ (self: Val.Obj, sup: Val.Obj) => new LazyWithComputeFunc(() => visitMethod(b.rhs, argSpec, b.pos)(scope(self, sup)))
}
i += 1
}
@@ -564,7 +564,7 @@ class Evaluator(resolver: CachedResolver,
case null =>
visitAsLazy(b.rhs)(newScope)
case argSpec =>
- () => visitMethod(b.rhs, argSpec, b.pos)(newScope)
+ new LazyWithComputeFunc(() => visitMethod(b.rhs, argSpec, b.pos)(newScope))
}
i += 1
j += 1
2 changes: 1 addition & 1 deletion sjsonnet/src/sjsonnet/Materializer.scala
@@ -73,7 +73,7 @@ abstract class Materializer {
case ujson.Null => Val.Null(pos)
case ujson.Num(n) => Val.Num(pos, n)
case ujson.Str(s) => Val.Str(pos, s)
- case ujson.Arr(xs) => new Val.Arr(pos, xs.map(x => (() => reverse(pos, x)): Lazy).toArray[Lazy])
+ case ujson.Arr(xs) => new Val.Arr(pos, xs.map(x => new LazyWithComputeFunc(() => reverse(pos, x))).toArray[Lazy])
case ujson.Obj(xs) =>
val builder = new java.util.LinkedHashMap[String, Val.Obj.Member]
for(x <- xs) {
22 changes: 20 additions & 2 deletions sjsonnet/src/sjsonnet/Val.scala
@@ -16,7 +16,7 @@ import scala.reflect.ClassTag
* evaluated dictionary values, array contents, or function parameters
* are all wrapped in [[Lazy]] and only truly evaluated on-demand
*/
- abstract class Lazy {
+ protected abstract class Lazy {
protected[this] var cached: Val = null
def compute(): Val
final def force: Val = {
@@ -25,6 +25,24 @@ abstract class Lazy {
}
}

+ /**
+ * Thread-safe implementation that discards the compute function after initialization.
+ */
+ final class LazyWithComputeFunc(@volatile private[this] var computeFunc: () => Val) extends Lazy {
+ def compute(): Val = {
+ val f = computeFunc
+ if (f != null) { // we won the race to initialize
+ val result = f()
+ cached = result
+ computeFunc = null // allow closure to be GC'd
+ }
+ // else: we lost the race to compute, but `cached` is already set and
+ // is visible in this thread due to the volatile read and writes via
+ // piggybacking; see https://stackoverflow.com/a/8769692 for background
+ cached
+ }
+ }
+
/**
* [[Val]]s represented Jsonnet values that are the result of evaluating
* a Jsonnet program. The [[Val]] data structure is essentially a JSON tree,
@@ -395,7 +413,7 @@ object Val{
if(argVals(j) == null) {
val default = params.defaultExprs(i)
if(default != null) {
- argVals(j) = () => evalDefault(default, newScope, ev)
+ argVals(j) = new LazyWithComputeFunc(() => evalDefault(default, newScope, ev))
} else {
if(missing == null) missing = new ArrayBuffer
missing.+=(params.names(i))
