Merge pull request #55 from subnova-etsy/clearfrequencysec
Add support for specifying the ClearFrequencySec attribute
tredman authored Apr 10, 2020
2 parents 8bf9e75 + 8cb6360 commit 44700e2
Showing 4 changed files with 24 additions and 2 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -65,7 +65,7 @@ The redis host should be a hostname and a port, for example `redis.mydomain.com:

## How sampling decisions are made

-In the configuration file, there is a place to choose a sampling method and some options for each. The `DynamicSampler` is the most interesting and most commonly used, so that's the one that gets described here. It uses the `AvgSampleRate` algorithm from the [`dynsampler-go`](https://github.com/honeycombio/dynsampler-go) package. Briefly described, you configure samproxy to examine the trace for a set of fields (for example, `request.status_code` and `request.method`). It collects all the values found in those fields anywhere in the trace (eg "200" and "GET") together in to a key it hands to the dynsampler. The dynsampler code will look at the frequency that key appears during the previous 30 seconds and use that to hand back a desired sample rate. More frequent keys are sampled more heavily, so that an even distribution of traffic across the keyspace is represented in Honeycomb.
+In the configuration file, there is a place to choose a sampling method and some options for each. The `DynamicSampler` is the most interesting and most commonly used, so that's the one that gets described here. It uses the `AvgSampleRate` algorithm from the [`dynsampler-go`](https://github.com/honeycombio/dynsampler-go) package. Briefly described, you configure samproxy to examine the trace for a set of fields (for example, `request.status_code` and `request.method`). It collects all the values found in those fields anywhere in the trace (eg "200" and "GET") together in to a key it hands to the dynsampler. The dynsampler code will look at the frequency that key appears during the previous 30 seconds (or other value set by the `ClearFrequencySec` setting) and use that to hand back a desired sample rate. More frequent keys are sampled more heavily, so that an even distribution of traffic across the keyspace is represented in Honeycomb.

By selecting fields well, you can drop significant amounts of traffic while still retaining good visibility into the areas of traffic that interest you. For example, if you want to make sure you have a complete list of all URL handlers invoked, you would add the URL (or a normalized form) as one of the fields to include. Be careful in your selection though, because if the combination of fields creates a unique key each time, you won't sample out any traffic. Because of this it is not effective to use fields that have unique values (like a UUID) as one of the sampling fields. Each field included should ideally have values that appear many times within any given 30-second window in order to effectively turn into a sample rate.
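
To make the key-building step described above concrete, here is a minimal Go sketch. It is not samproxy's code: `buildKey` is a hypothetical helper, and the choices to de-duplicate values, sort them, and join them with a comma are assumptions made purely for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// buildKey is a hypothetical helper (not part of samproxy). It gathers the
// values of the configured fields from every span in a trace and joins the
// distinct values into a single key.
func buildKey(fieldList []string, spans []map[string]string) string {
	seen := map[string]bool{}
	var values []string
	for _, span := range spans {
		for _, field := range fieldList {
			if v, ok := span[field]; ok && !seen[v] {
				seen[v] = true
				values = append(values, v)
			}
		}
	}
	// Assumption: sort so that span order does not change the key.
	sort.Strings(values)
	return strings.Join(values, ",")
}

func main() {
	fields := []string{"request.status_code", "request.method"}
	trace := []map[string]string{
		{"request.status_code": "200", "request.method": "GET"},
		{"request.method": "GET", "trace.id": "abc-123"},
	}
	fmt.Println(buildKey(fields, trace)) // prints "200,GET"
}
```

If a per-request identifier such as `trace.id` were added to `fields`, every trace would yield a different key, which is the situation the paragraph above warns against.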

8 changes: 7 additions & 1 deletion config.toml
@@ -169,7 +169,7 @@ CacheCapacity = 1000
# implementation. This sampler collects the values of a number of fields from a
# trace and uses them to form a key. This key is handed to the standard dynamic
# sampler algorithm which generates a sample rate based on the frequency with
-# which that key has appeared in the previous 30 seconds. See
+# which that key has appeared in the previous ClearFrequencySec seconds. See
# https://github.com/honeycombio/dynsampler-go for more detail on the mechanics
# of the dynamic sampler. This sampler uses the AvgSampleRate algorithm from
# that package.
@@ -219,6 +219,12 @@ CacheCapacity = 1000
# AddSampleRateKeyToTrace is true.
AddSampleRateKeyToTraceField = "meta.samproxy.dynsampler_key"

+# ClearFrequencySec is the length, in seconds, of the window over which the
+# sampler counts key frequency to calculate the sample rate. This setting
+# defaults to 30.
+# Eligible for live reload.
+ClearFrequencySec = 60

[SamplerConfig.dataset2]

Sampler = "DeterministicSampler"
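
The window that `ClearFrequencySec` controls can be pictured with a short Go sketch. This is an illustration only, not the dynsampler-go implementation: `windowCounter`, `observe`, and `rotate` are hypothetical names, and the sketch assumes that counts are simply reset at the end of each window.

```go
package main

import "fmt"

// windowCounter is a hypothetical type (not part of dynsampler-go) that
// tallies how often each key is seen within one window.
type windowCounter struct {
	current  map[string]int // counts accumulating in the current window
	previous map[string]int // counts from the last completed window
}

func newWindowCounter() *windowCounter {
	return &windowCounter{
		current:  map[string]int{},
		previous: map[string]int{},
	}
}

// observe records one occurrence of key in the current window.
func (w *windowCounter) observe(key string) {
	w.current[key]++
}

// rotate closes the current window. In this sketch it is assumed to be
// called once every ClearFrequencySec seconds.
func (w *windowCounter) rotate() {
	w.previous = w.current
	w.current = map[string]int{}
}

func main() {
	w := newWindowCounter()
	for i := 0; i < 5; i++ {
		w.observe("200,GET")
	}
	w.observe("500,POST")
	w.rotate()
	fmt.Println(w.previous["200,GET"], w.previous["500,POST"]) // prints "5 1"
	fmt.Println(len(w.current))                                // prints 0
}
```

In a long-running process, `rotate` could be driven by a `time.Ticker`. A longer window, such as the 60 seconds set above, gives rarely seen keys more time to accumulate counts before the tallies are cleared.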
1 change: 1 addition & 0 deletions config/config_test.go
@@ -29,6 +29,7 @@ func TestGetSamplerTypes(t *testing.T) {
UseTraceLength = true
AddSampleRateKeyToTrace = true
AddSampleRateKeyToTraceField = "meta.samproxy.dynsampler_key"
+ClearFrequencySec = 60
[SamplerConfig.dataset2]
15 changes: 15 additions & 0 deletions sample/dynamic.go
@@ -20,6 +20,7 @@ type DynamicSampler struct {
	Metrics metrics.Metrics

	sampleRate int64
+	clearFrequencySec int64
	fieldList []string
	useTraceLength bool
	addDynsampleKey bool
@@ -31,6 +32,7 @@ type DynamicSampler struct {

type DynSamplerConfig struct {
	SampleRate int64
+	ClearFrequencySec int64
	FieldList []string
	UseTraceLength bool
	AddSampleRateKeyToTrace bool
@@ -51,6 +53,10 @@ func (d *DynamicSampler) Start() error {
		dsConfig.SampleRate = 1
	}
	d.sampleRate = dsConfig.SampleRate
+	if dsConfig.ClearFrequencySec == 0 {
+		dsConfig.ClearFrequencySec = 30
+	}
+	d.clearFrequencySec = dsConfig.ClearFrequencySec

	// get list of fields to use when constructing the dynsampler key
	fieldList := dsConfig.FieldList
@@ -67,6 +73,7 @@ func (d *DynamicSampler) Start() error {
	// spin up the actual dynamic sampler
	d.dynsampler = &dynsampler.AvgSampleRate{
		GoalSampleRate:    int(d.sampleRate),
+		ClearFrequencySec: int(d.clearFrequencySec),
	}
	d.dynsampler.Start()

@@ -99,6 +106,13 @@ func (d *DynamicSampler) reloadConfigs() {
		configChanged = true
		d.sampleRate = dsConfig.SampleRate
	}
+	if dsConfig.ClearFrequencySec == 0 {
+		dsConfig.ClearFrequencySec = 30
+	}
+	if d.clearFrequencySec != dsConfig.ClearFrequencySec {
+		configChanged = true
+		d.clearFrequencySec = dsConfig.ClearFrequencySec
+	}

	// get list of fields to use when constructing the dynsampler key
	fieldList := dsConfig.FieldList
@@ -135,6 +149,7 @@ func (d *DynamicSampler) reloadConfigs() {
	if configChanged {
		newSampler := &dynsampler.AvgSampleRate{
			GoalSampleRate:    int(d.sampleRate),
+			ClearFrequencySec: int(d.clearFrequencySec),
		}
		newSampler.Start()
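
The defaulting and change-detection logic added in this file can be exercised in isolation. The sketch below is a simplified stand-in, not the samproxy code: `settings` and `reload` are hypothetical names, and only the `ClearFrequencySec` handling is modeled.

```go
package main

import "fmt"

// defaultClearFrequencySec matches the default of 30 used in the diff.
const defaultClearFrequencySec = 30

// settings is a hypothetical stand-in for the sampler's cached configuration.
type settings struct {
	clearFrequencySec int64
}

// reload applies a newly read value and reports whether it differs from the
// cached one, which is when the sampler would need to be rebuilt.
// An unset (zero) value falls back to the default.
func (s *settings) reload(configured int64) bool {
	if configured == 0 {
		configured = defaultClearFrequencySec
	}
	if s.clearFrequencySec == configured {
		return false
	}
	s.clearFrequencySec = configured
	return true
}

func main() {
	s := &settings{clearFrequencySec: defaultClearFrequencySec}
	fmt.Println(s.reload(0))  // prints false: unset resolves to 30, no change
	fmt.Println(s.reload(60)) // prints true: window changed to 60
	fmt.Println(s.reload(60)) // prints false: already 60
}
```

As in the diff, only a zero value is replaced by the default; this sketch likewise does not validate negative values.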
