Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restructure kafka client into backup #87

Merged
merged 3 commits into from
Jan 25, 2022

Conversation

mdedetrich
Copy link
Contributor

@mdedetrich mdedetrich commented Jan 20, 2022

About this change - What it does

This PR restructures some of the components in Guardian so that the dependencies are minimized. This is mainly a result of working on #83, as it turns there are configurations/components which are mandatory that shouldn't be, or are in the wrong module.

Regarding the cli feature, without this PR we run into the problem where we are forced to write configurations which dont make sense (i.e. compaction-bucket for backup) or in other cases we there is no configuration for options we do need to provide (i.e. group-id)

Why this way

The PR is split into 3 commits, each complaining the reasoning behind the change

  • Move KafkaClient into core-backup: KafkaClient is a KafkaConsumer that listens to a Kafka cluster, strictly speaking this functionality is only needed for backing up of data hence why it was moved to core-backup instead of just core. One of the reasons behind this change was the fact that certain Kafka configurations such as bootstrap-servers are global for every component but others such as group-id (aka consumer group) only exist for Kafka consumers and not producers (i.e. restore).
  • Add kafka-group-id config into Backup config: This adds group-id configuration as a core config into the backup-core Backup configuration. Interestingly Kafka configuration with alpakka is all over the place, some configurations such as bootstrap-servers have a default config in reference.conf that can be overridden, others such as group-id have no Typesafe configuration at all. Since there is no non-programmatic way to configure group-id we added it into our own config.
  • Remove compaction-bucket config from core-s3: This configuration was placed into the wrong project, it should rather belong in compaction-s3/restore-s3 since thats the only place where this bucket will be used. Note that only the S3 case class that is serialized from the typesafe config had the configuration option removed, the path for the compaction-bucket will still be contained within the s3-config path in reference.conf its just that the config will be placed in the reference.conf of other projects (and this will work fine since typesafe config naturally supports merging of configs).

@mdedetrich mdedetrich requested review from jlprat and reta January 20, 2022 06:54
@mdedetrich
Copy link
Contributor Author

Note that in the future when compaction is done, a further restructure will be needed since compaction actually doesn't require a kafka at all (it just moves around data in the configured storage using postgres to do the compaction).

However this is not necessary right now so we will burn that bridge when we get there.

@mdedetrich mdedetrich merged commit bd9b3b4 into main Jan 25, 2022
@mdedetrich mdedetrich deleted the restructure-kafka-client-into-backup branch January 25, 2022 13:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants