Add State & Event Versioning #43
Comments
Thinking about this briefly - I'm initially inclined to manage the complexity of multiple versions in-code rather than in-infrastructure. While CDK/CloudFormation makes infrastructure easier, it's still not precisely easy. Having multiple, similar copies of our AWS Resources (EventBridge Rules, Lambda Functions, etc.) brings up issues like resource naming collisions, longer deployment times, more opportunities to hit account limits, and more opportunities for transient AWS issues to break the deployment of a given resource. I feel more confident about our ability to have the Lambda code handle this gracefully than doing it at the AWS-Resource level.
So does this mean having a version in the message instead of different event names? Sorry if I'm misunderstanding.
My thinking lines up more with having the version in the message rather than different event names, as embedding it in the message means that (I suspect) we'll have fewer versioned AWS Resources to deal with.
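For illustration only, here is a minimal sketch (in Python) of what handling the version in-code might look like: a single Lambda/rule per event type dispatches on a version field embedded in the event detail. The event fields and handler names below are hypothetical, not the project's actual shapes.

```python
# Hypothetical event: the schema version rides inside the detail payload
# rather than being encoded into the detail-type or the EventBridge rule name.
EXAMPLE_EVENT = {
    "detail-type": "CaptureSetupCompleted",
    "detail": {
        "version": 2,
        "vpc_id": "vpc-0123456789abcdef0",
        "cluster_name": "my-cluster",
    },
}


def _handle_v1(detail: dict) -> None:
    print(f"handling v1 event: {detail}")


def _handle_v2(detail: dict) -> None:
    print(f"handling v2 event: {detail}")


# One set of AWS resources; the code carries the multi-version complexity.
_HANDLERS = {1: _handle_v1, 2: _handle_v2}


def lambda_handler(event: dict, context) -> None:
    detail = event["detail"]
    version = detail.get("version", 1)  # treat unversioned events as v1
    handler = _HANDLERS.get(version)
    if handler is None:
        raise ValueError(f"Unsupported event version: {version}")
    handler(detail)
```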
Thinking about this a bit more, this is about more than just event shapes changing. In fact, I think that's probably the easier part of the problem - if we have automated scans to bring the mirror infrastructure up to date (see: #36), then it's fine if we lose events during a transition because they'll be backfilled in a minute or two. If we have the automated scans, we might even say it's fine to lose some events during a transition and not bother versioning the events/event handlers themselves. In my mind, the bigger issue is the state we're storing in Parameter Store and its link to the CloudFormation stack templates. For that, it seems like we'll probably use some combination of versioning in our code and "transitional commits" that users can "pass through" by running a (hopefully) idempotent update of their existing resources. Updated the issue description to encapsulate the larger problem.
Actually, it seems like if we're willing to do at least one "transitional commit" at some point in the future, we can solve this problem when it actually becomes a problem, rather than tackling it preemptively. I'm not currently against a preemptive approach, just pointing out an additional option.
Thinking ahead a bit - I think this task (#65) to encapsulate our CDK context in compound objects is effectively a pre-req for this, as that encapsulation will allow us to more easily version individual bundles of state/context.
Thinking ahead even more - we know we have a scaling bottleneck with how we store our state in AWS Systems Manager Parameter Store. If we move to a more "serious" storage solution, how does that affect our approach to state versioning? The free tier (…). Given our use-case of just having these items sit around most of the time without being used, and the small amount of data involved, there are much cheaper options.

If we want to keep the same data format of loosely-structured JSON, then we could do something like AWS DocumentDB for an order of magnitude less, but we'd be paying for instances just sitting around most of the time [3]. Another option would be DynamoDB, just using it as a simple K/V store (e.g. dumping our JSON as a string into a single column). It seems extremely unlikely we'd ever exceed the 400 kB size limit for a single item [4], and we're already using a K/V store for our state, so the transition seems easy. While complex, it *seems* the on-demand pricing [5] will give us the flexibility we need for our use-case (occasional bursts of large numbers of writes/reads, nothing most of the time, relatively small amount of data overall).

[1] https://docs.aws.amazon.com/general/latest/gr/ssm.html
Actually, DDB seems like a clear winner here. We can keep the JSON format, keep the K/V paradigm, and the pricing seems VERY reasonable for our use-case [1]. In us-east-2: (…)
Most of our cost would come from serving the continuous scans of the User VPCs for changes in infrastructure, but there are ways to optimize that.
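As a rough sketch of the "DDB as a simple K/V store" idea, assuming boto3 and a hypothetical table name and key schema (a single `pk` partition key, with the JSON dumped into one attribute):

```python
import json

import boto3

# Hypothetical on-demand table with a single string partition key named "pk".
ddb = boto3.resource("dynamodb")
table = ddb.Table("ArkimeStateStore")


def put_state(key: str, state: dict) -> None:
    # Keep the loosely-structured JSON format: dump it into a single attribute.
    table.put_item(Item={"pk": key, "value": json.dumps(state)})


def get_state(key: str) -> dict:
    item = table.get_item(Key={"pk": key}).get("Item")
    return json.loads(item["value"]) if item else {}


# Example usage (values are illustrative only):
put_state("my-cluster/os-domain", {"dns_name": "vpc-example.us-east-2.es.amazonaws.com"})
print(get_state("my-cluster/os-domain"))
```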
Would it make sense to separate the configuration vs state items we have in Parameter Store, and maybe either leave the configuration items in there or look at something like AWS App Config?
Good question. When I talk about state, I'm referring to bits of data that are required for orchestration/enabling the parts of the solution to communicate with each other across time. Some parts of that state are also what I'd consider configuration, which I guess would be things used specifically at runtime of the capture/viewer processes, etc. An example would be the DNS Name of the OpenSearch Domain. We need to store it in a location that the orchestration bits of the code can access during different control plane operations, but eventually it's turned into a bit of configuration embedded in the Capture/Viewer Docker containers to enable them to do their thing.

We'll probably want to "master" all our state in a real storage solution like DDB, and then create projections of it consumed by something like AWS App Config or stuck in config files placed in S3. I'm not aware of an argument to retain Parameter Store as a part of our solution other than that it requires some work to move off of. If we're going to move off of it, and I think we definitely will want to do so, then it seems better to move sooner rather than later in order to create less user pain in a migration.
So I think we have similar definitions then. Configuration = stuff required by capture/viewer/OS setup/etc. that the user can change or needs to directly influence; state = everything else
Ah App Config can't be the source of truth?
agree
I think there's a difference between "can" and "should" in this instance. I'd say that all state should be mastered in a real storage solution. If there is configuration that is NOT state, then it's fine for it to be mastered in AWS AppConfig. An example would be: we have items A and B in our state and use them to compute item C, which is configuration. It's fine to me if the only place C lives is in AWS AppConfig. In other words - keep a single source of base truth, but projections of it can live elsewhere as needed. Given the nature of our application, I'm not sure it's possible for some configuration item D to exist that isn't either also state in DDB or derived from some state in DDB. If we find such a case, I'm OK with having a discussion at that point.
I guess it depends on where you want to keep things that are Arkime-only config, like, for example, Arkime Rules or the OIDC configuration. App Config seemed like it has already done the work of publishing changes, but maybe I misunderstand what it does. I don't think we should have 2 sources of truth for items, or have to keep them in sync. I'm just worried we are reimplementing parts of App Config, but maybe that is easy to do with DDB. My main concern is Arkime configuration and keeping the viewer/capture processes updated; if that's easy to do with DDB, having everything there is good.
I think we're on the same page, just focusing on different parts of the overall problem. I'm not proposing we create our own publication solution just so we can master things solely in DDB. Maybe a heuristic we can use is: "if something other than the capture/viewer container would ever need to pull the data, then it's state that should live in DDB".

For the specific scenario of Arkime configuration, quite a bit of that is already state (such as the OpenSearch Domain, the ARN of the Secrets Manager Secret storing its password, etc.) that will be in DDB. I'm guessing we'll need to pull the previous OIDC configuration during CLI operations. Do we want the CLI to read from both DDB and AppConfig in order to compute the next iteration of the configuration we store in AppConfig, then write that to AppConfig so it's available for the containers? To my mind, that seems less preferable than just storing everything in DDB, pulling everything from there, computing the new AppConfig version, then writing to AppConfig.

The containers will just be pulling from AppConfig either way, but I think it makes the component responsibilities clearer (AppConfig is always downstream of DDB), and I'm not too worried about syncing since the data flow would always be in one direction. Another benefit would be that a single process would never need to read from multiple places to do its job. I think it also makes it easier to understand where to look for stuff as an operator/maintainer. I guess another way to phrase it is: I don't necessarily see data stored in AppConfig as being a separate "copy" to be "synchronized" so much as a re-projection from DDB.
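A minimal sketch of that one-directional flow (DDB as the single source of truth, AppConfig strictly downstream), using boto3. The table layout, item keys, and AppConfig IDs below are all hypothetical:

```python
import json

import boto3

ddb = boto3.resource("dynamodb")
appconfig = boto3.client("appconfig")


def project_config(table_name: str, app_id: str, profile_id: str) -> None:
    """Read mastered state from DDB, derive the runtime config, push it to AppConfig."""
    table = ddb.Table(table_name)

    # Hypothetical state items; the real key names/shapes would differ.
    domain = json.loads(table.get_item(Key={"pk": "os-domain"})["Item"]["value"])
    oidc = json.loads(table.get_item(Key={"pk": "oidc-config"})["Item"]["value"])

    # The AppConfig payload is a projection derived entirely from DDB state.
    derived = {"opensearch_endpoint": domain["dns_name"], "oidc": oidc}

    appconfig.create_hosted_configuration_version(
        ApplicationId=app_id,
        ConfigurationProfileId=profile_id,
        Content=json.dumps(derived).encode("utf-8"),
        ContentType="application/json",
    )
```

Since data only ever flows DDB -> AppConfig, an interrupted run just means a stale projection in AppConfig; re-running the (idempotent) projection brings it back in line.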
The question I ask myself is: if I hit Ctrl-C at the wrong time (or something else bad happens), will DDB and AppConfig have different values? I'm fine with everything in DDB, as long as it's easy to fetch the values in capture/viewer instances also.
I think things will become clearer once I start looking more closely at AppConfig in the context of solving the runtime/dynamic configuration problem as part of the OIDC work.
Two obvious ways to handle different versions of the same entity in a data store are:

1. Keep each version of the entity as its own, separate entry.
2. Store all versions of the entity together in the same entry.

Parameter Store imposes a 4kB limit for standard parameter values, which limits how many versions could share a single entry. The benefit of storing all versions in the same entry is that you get them all without needing to specifically know to look for them. However, I think that's a benefit primarily in the case of Parameter Store. With DDB, you can include the version as the sort key [1] so that it's easy to get all versioned copies of the same entity even though they're separate entries. This operation can be efficiently performed with the `Query` API.

Otherwise, (1) seems like the better option. With separate entries, you don't have to worry about things like two differently-versioned processes trying to write to the same data entry.

[1] https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.CoreComponents.html#HowItWorks.CoreComponents.PrimaryKey
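A sketch of option (1) with the version as the sort key, assuming boto3 and a hypothetical table keyed on (`entity_id`, `version`):

```python
import json

import boto3
from boto3.dynamodb.conditions import Key

ddb = boto3.resource("dynamodb")
# Hypothetical table: partition key "entity_id" (string), sort key "version" (number).
table = ddb.Table("ArkimeStateStore")


def put_version(entity_id: str, version: int, state: dict) -> None:
    # Each version is its own entry; two versions never write to the same item.
    table.put_item(
        Item={"entity_id": entity_id, "version": version, "value": json.dumps(state)}
    )


def get_all_versions(entity_id: str) -> list[dict]:
    # A single Query on the partition key returns every versioned copy of the entity.
    resp = table.query(KeyConditionExpression=Key("entity_id").eq(entity_id))
    return [json.loads(item["value"]) for item in resp["Items"]]


def get_version(entity_id: str, version: int) -> dict:
    item = table.get_item(Key={"entity_id": entity_id, "version": version})["Item"]
    return json.loads(item["value"])
```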
Description
This task is to decide on a versioning strategy and implement it. Per convo in PR (#42), @awick said:
While this was originally focused on event shapes, it is also applicable to the format of the state currently stored in AWS (both Parameter Store and CloudFormation).
Related Tasks
Acceptance Criteria