Skip to content

Commit

Permalink
Fix performance regression in split by avoiding allocating substring …
Browse files Browse the repository at this point in the history
…per char (#237)

This PR fixes a performance regression from #227 /
4c85bde which I overlooked in review:

When generalizing the optimized non-Pattern-based split code, that
commit introduced a `.substring()` on each character, producing tons of
garbage.

Instead, I think we can do a `.startsWith(splitPattern, i)`: this should
be much faster because it will avoid unnecessary garbage string creation
(plus I'm pretty sure that `startsWith` is optimized in modern JDKs).

I also removed the use of `breakable` and replaced it with an update to
the `while` condition.
  • Loading branch information
JoshRosen authored Dec 12, 2024
1 parent f5bf391 commit 680b1a8
Showing 1 changed file with 9 additions and 12 deletions.
21 changes: 9 additions & 12 deletions sjsonnet/src/sjsonnet/Std.scala
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,6 @@ import sjsonnet.Expr.Member.Visibility

import scala.collection.Searching._
import scala.collection.mutable
import scala.util.control.Breaks.{break, breakable}
import scala.util.matching.Regex

/**
Expand Down Expand Up @@ -642,17 +641,15 @@ class Std {
var sz = 0
var i = 0
var start = 0
breakable {
while (i < str.length) {
if (maxSplits >= 0 && sz >= maxSplits) {
break
}
if (str.slice(i, i + cStr.length) == cStr) {
val finalStr = Val.Str(pos, str.slice(start, i))
b.+=(finalStr)
start = i + cStr.length
sz += 1
}

while (i <= str.length - cStr.length && (maxSplits < 0 || sz < maxSplits)) {
if (str.startsWith(cStr, i)) {
val finalStr = Val.Str(pos, str.substring(start, i))
b.+=(finalStr)
start = i + cStr.length
sz += 1
i += cStr.length
} else {
i += 1
}
}
Expand Down

0 comments on commit 680b1a8

Please sign in to comment.