[exporter][batching] configuration and config validation for bytes based batching #12154

Open: sfc-gh-sili wants to merge 9 commits into main

Contributor

@sfc-gh-sili sfc-gh-sili commented Jan 22, 2025

Description

This PR adds the config API that will be used for serialized-bytes-based batching.

We will deprecate MinSizeConfig and MaxSizeConfig in favor of:

type SizeConfig struct {
	Sizer string `mapstructure:"sizer"`
	MinSize int `mapstructure:"min_size"`
	MaxSize int `mapstructure:"max_size"`
}

Link to tracking issue

#3262
#12303

Testing

Documentation

@sfc-gh-sili sfc-gh-sili requested a review from a team as a code owner January 22, 2025 03:23
@sfc-gh-sili sfc-gh-sili requested a review from mx-psi January 22, 2025 03:23

codecov bot commented Jan 22, 2025

Codecov Report

Attention: Patch coverage is 68.42105% with 12 lines in your changes missing coverage. Please review.

Project coverage is 91.35%. Comparing base (83d93cd) to head (81d8227).

Files with missing lines Patch % Lines
exporter/exporterbatcher/sizer_type.go 66.66% 7 Missing and 1 partial ⚠️
exporter/exporterbatcher/config.go 71.42% 3 Missing and 1 partial ⚠️

❌ Your patch check has failed because the patch coverage (68.42%) is below the target coverage (95.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #12154      +/-   ##
==========================================
- Coverage   91.39%   91.35%   -0.04%     
==========================================
  Files         468      469       +1     
  Lines       25598    25636      +38     
==========================================
+ Hits        23395    23421      +26     
- Misses       1787     1797      +10     
- Partials      416      418       +2     


Comment on lines 33 to 34
MinSizeItems int `mapstructure:"min_size_items"`
MinSizeBytes int `mapstructure:"min_size_bytes"`
Member

What if we have a "Sizer"("SizerType") as enum with 3 values (request, items, bytes) and the "size" value instead?

Contributor Author

@sfc-gh-sili sfc-gh-sili Jan 22, 2025

Good idea. How about something like this:

type BatcherConfig struct {
	...
	minSize SizeConfig `mapstructure:"min_size"`
	maxSize SizeConfig `mapstructure:"max_size"`
}
type SizeConfig struct {
	sizer string `mapstructure:"sizer"`
	size int `mapstructure:"size"`
}
func (c SizeConfig) validate() {
	...
}

Does the option request mean that the request will implement its own sizing method?

Member

Not sure if we need "sizer" in both min/max, since I don't think we need to accept different sizer for min and max.

Contributor Author

@sfc-gh-sili sfc-gh-sili Jan 22, 2025

I don't think we need to accept different sizer for min and max.

Good point. How about

type BatcherConfig struct {
	...
	sizer SizerType `mapstructure:",squash"`
	minSize int `mapstructure:"min_size"`
	maxSize int `mapstructure:"max_size"`
}
type SizerType struct {
	sizer string `mapstructure:"sizer"`
}
func (c SizerType) validate() {
}

Contributor

Why is "Sizer" a configuration string? What does it name, and how does the user configure it?

Contributor Author

@sfc-gh-sili sfc-gh-sili Jan 31, 2025

I thought of something that encapsulates better. How about

type SizeConfig struct {
	Sizer string `mapstructure:"sizer"`
	MinSize int `mapstructure:"min_size"`
	MaxSize int `mapstructure:"max_size"`
}

That way we can validate SizeConfig and pass it around as a whole.
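
As a rough illustration of what validating SizeConfig as a whole could look like, here is a minimal sketch. The specific rules (non-negative sizes, and a non-zero MaxSize that is at least MinSize) are assumptions made for the example, not the checks implemented in this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// SizeConfig mirrors the struct proposed in this thread.
type SizeConfig struct {
	Sizer   string `mapstructure:"sizer"`
	MinSize int    `mapstructure:"min_size"`
	MaxSize int    `mapstructure:"max_size"`
}

// Validate applies illustrative rules only (assumptions, not the PR's checks):
// sizes must be non-negative, and a non-zero MaxSize must be >= MinSize.
func (c SizeConfig) Validate() error {
	if c.MinSize < 0 || c.MaxSize < 0 {
		return errors.New("min_size and max_size must be non-negative")
	}
	if c.MaxSize != 0 && c.MaxSize < c.MinSize {
		return errors.New("max_size must be greater than or equal to min_size")
	}
	return nil
}

func main() {
	ok := SizeConfig{Sizer: "items", MinSize: 100, MaxSize: 1000}
	bad := SizeConfig{Sizer: "bytes", MinSize: 2048, MaxSize: 1024}
	fmt.Println(ok.Validate())  // no error expected
	fmt.Println(bad.Validate()) // error expected: max below min
}
```

Because the three fields travel together, a single Validate method can check relationships between them, which is harder when min and max live in separate structs.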

Member

@bogdandrutu bogdandrutu Jan 31, 2025

@jmacd we are considering these user-facing options for Sizer: "requests" | "items" | "bytes".

Member

@bogdandrutu bogdandrutu Jan 31, 2025

@sfc-gh-sili we still need an enum of the possible values for Sizer, because users of SizeConfig need to do different things depending on the sizer type.

PS: I like the SizeConfig, but would still keep the SizerType :)
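
To illustrate why an enum helps here, the sketch below switches on the sizer type to decide how much a request counts toward a batch. The `request` type and the `sizeOf` helper are hypothetical names invented for this example, and the three constants simply follow the values mentioned in this thread.

```go
package main

import "fmt"

// SizerType enumerates the three values discussed in this thread.
type SizerType string

const (
	SizerTypeRequests SizerType = "requests"
	SizerTypeItems    SizerType = "items"
	SizerTypeBytes    SizerType = "bytes"
)

// request is a hypothetical stand-in for an exporter request.
type request struct {
	items   int
	payload []byte
}

// sizeOf (hypothetical helper) returns how much one request counts toward
// the batch under the given sizer type.
func sizeOf(t SizerType, r request) (int, error) {
	switch t {
	case SizerTypeRequests:
		return 1, nil
	case SizerTypeItems:
		return r.items, nil
	case SizerTypeBytes:
		return len(r.payload), nil
	default:
		return 0, fmt.Errorf("unknown sizer type %q", t)
	}
}

func main() {
	r := request{items: 3, payload: []byte("hello")}
	for _, t := range []SizerType{SizerTypeRequests, SizerTypeItems, SizerTypeBytes} {
		n, err := sizeOf(t, r)
		fmt.Println(t, n, err)
	}
}
```

With a plain string field, every call site would have to repeat the string comparisons and handle typos on its own; a dedicated type keeps the set of cases in one place.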

Contributor Author

@bogdandrutu Sounds good! I updated the config to have BatchConfig -> SizeConfig -> SizerType. Would you mind taking another look?

Comment on lines 80 to 83
err := c.SizeConfig.Validate()
if err != nil {
	return err
}
Member

You don't need to call Validate, it is called automatically.
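
For readers unfamiliar with this behavior, the following stdlib-only sketch shows the general idea of a recursive validator that finds nested Validate methods on its own. It is not the collector's implementation; `validateConfig`, `validateAll`, and the simplified config types are hypothetical and exist only for this example.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

// validator is the hook the walker looks for on every value it visits.
type validator interface {
	Validate() error
}

// SizeConfig and BatchConfig are simplified stand-ins for the types in this PR.
type SizeConfig struct {
	MinSize int
	MaxSize int
}

// Validate uses an illustrative rule: a non-zero MaxSize must be >= MinSize.
func (c SizeConfig) Validate() error {
	if c.MaxSize != 0 && c.MaxSize < c.MinSize {
		return errors.New("max_size must be greater than or equal to min_size")
	}
	return nil
}

// BatchConfig deliberately has no Validate method of its own.
type BatchConfig struct {
	Enabled    bool
	SizeConfig SizeConfig
}

// validateAll (hypothetical) validates exported struct fields first, then
// the value itself if it implements validator.
func validateAll(v reflect.Value) error {
	if v.Kind() == reflect.Struct {
		for i := 0; i < v.NumField(); i++ {
			if !v.Type().Field(i).IsExported() {
				continue
			}
			if err := validateAll(v.Field(i)); err != nil {
				return err
			}
		}
	}
	if v.CanInterface() {
		if val, ok := v.Interface().(validator); ok {
			return val.Validate()
		}
	}
	return nil
}

// validateConfig (hypothetical) is the entry point a framework would call.
func validateConfig(cfg any) error {
	return validateAll(reflect.ValueOf(cfg))
}

func main() {
	cfg := BatchConfig{Enabled: true, SizeConfig: SizeConfig{MinSize: 2048, MaxSize: 1024}}
	// The nested SizeConfig error surfaces without BatchConfig calling Validate.
	fmt.Println(validateConfig(cfg))
}
```

Under this kind of scheme, an explicit `c.SizeConfig.Validate()` call in the parent would run the same check twice.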

Contributor Author

Makes sense. Thanks!

MinSize: 100,
},
}
require.EqualError(t, cfg.Validate(), "sizer should either be bytes or items")
Member

To test that it is called automatically, use component.ValidateConfig()

Contributor Author

Thanks!

exporter/exporterbatcher/config.go (outdated)

type SizerType struct {
	// Sizer should either be bytes or items.
	Sizer string `mapstructure:"sizer"`
Member

Make this private and add an UnmarshalText method (implementing the interface at https://pkg.go.dev/encoding#TextUnmarshaler). See https://github.com/open-telemetry/opentelemetry-collector/blob/main/pipeline/internal/globalsignal/signal.go

That way, we can restrict it to a specific set of allowed values.
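
A minimal sketch of this suggestion, assuming a private `val` field and two predefined values, might look like the following. The field name, variable names, and error text are illustrative choices, not the code that ended up in the PR.

```go
package main

import (
	"encoding"
	"fmt"
)

// SizerType keeps its value private so that only UnmarshalText can set it.
type SizerType struct {
	val string
}

// Predefined values; the names are assumptions made for this sketch.
var (
	SizerTypeItems = SizerType{val: "items"}
	SizerTypeBytes = SizerType{val: "bytes"}
)

// Compile-time check that *SizerType satisfies encoding.TextUnmarshaler.
var _ encoding.TextUnmarshaler = (*SizerType)(nil)

// UnmarshalText accepts only the known sizer names and rejects anything else,
// leaving the receiver unchanged on error.
func (s *SizerType) UnmarshalText(text []byte) error {
	switch str := string(text); str {
	case SizerTypeItems.val, SizerTypeBytes.val:
		s.val = str
		return nil
	default:
		return fmt.Errorf("invalid sizer: %q", str)
	}
}

// String returns the configured sizer name.
func (s SizerType) String() string { return s.val }

func main() {
	var s SizerType
	fmt.Println(s.UnmarshalText([]byte("bytes")), s)
	fmt.Println(s.UnmarshalText([]byte("tokens")))
}
```

Since invalid names are rejected while the config is being decoded, a separate Validate check for the sizer name becomes redundant.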

Comment on lines 42 to 43
const SizerTypeItems = "items"
const SizerTypeBytes = "bytes"
Member

They should be of type SizerType.

Contributor Author

Makes sense.

Comment on lines 99 to 101
if c.Sizer != SizerTypeItems && c.Sizer != SizerTypeBytes {
	return errors.New("sizer should either be bytes or items")
}
Member

Don't need this if you do what I suggest.

Contributor Author

Thanks! I replaced Validate() with UnmarshalText and updated the tests accordingly. Let me know if there are any further issues.

@sfc-gh-sili
Contributor Author

I noticed an existing issue, #9462, that discusses the config API for the queue, and I think we should align the batch config API and the queue config API as much as possible.

Opened an issue #12303 for further discussion.

@sfc-gh-sili sfc-gh-sili force-pushed the sili-config-for-serialized-bytes-based-batching branch from 357d13c to 1fe0fe6 Compare February 8, 2025 01:12

3 participants