-
Notifications
You must be signed in to change notification settings - Fork 41
checktime does not invalidate obsolete VRPs #81
Comments
Hi @lukastribus,
The choice was made to ensure continuity in Origin Validation rather than bouncing routes/VRPs (+ added complexity on RTR serial management in some cases). The attack scenario I took into account while writing the software: it would be to break the link between GoRTR and the validator and then create a hijack on a previously signed resource. In the optimal security model, with distributed GoRTR close to the network device, the RTR sessions are maintained and the previously valid data is still used until the validator comes back up, leaving time for an engineer to figure it out and "freezing the state". The JSON only contains VRPs valid at a point in time. We are assuming that one hour later will likely be the same. This is due to the whole asynchronicity of the framework. The resulting file does not include all the expiration corresponding to the certificates per VRP. An organization/user can control the generation of the JSON and its distribution. If the distribution process not correspond to the generation process (eg: not meant to be refreshed periodically), the timers/checks can be disabled inside GoRTR. A possibility would be to fail-close (GoRTR stops upon an invalid file) and let the network devices figure it out: eg: using their RTR cache. |
Yeah I can see how that design choice makes a lot of sense for Cloudflare (when you have multiple senior *nix people in different timezones available 24/7), but I'm concerned about the people setting up a validator and RTR server without proper monitoring and alerting, and then when a failure occurs updates stop and we have stale VRP's on the routers. This could then remain undiscovered, and lead to big problems on production networks. I like your fail-close proposal, that makes a lot of sense in my mind. I'm not sure I'm able to implement the changes myself, but I will definitely take a look and try. Thanks! |
@lspgn can you in more detail describe the (current default?) 'fail-safe' behavior? It is not clear to me how exactly it is fail safe. fail-close on invalid file makes sense to me, a daemon killing itself when the config file is corrupt is not weird |
@job I based my explanation off the definition of fail-safe. Causing minimal harm to other equipments in my opinion means that it prefers missing a recent update and keep everything running in the same conditions as before until repaired. |
So the current behavior is to keep waiting until a valid file can be fetched again, and until then keep the current state? (I guess we can call that fail-stale? :-) |
making it configurable will be a nice addition! |
rpki-client 7.1 emits a new per VRP attribute: Effort is underway to support this in stayrtr (a gortr fork): |
checktime
only verifies thevalid
timestamp in the json metadata when a json file is actually read.However existing stale VRP's are never invalided.
None of those events invalided the obsolete VRPs that gortr still caches:
valid
of the original file expiredI assume the same issues also applies to json files hosted on a HTTP server.
This is concerning for me because a number of realistic issues will lead to a situation where gortr keeps sending bogus VRP's to the routers.
I think that checking
valid
when reading the json is not enough, but the timestamp needs to be saved and if expired, needs to invalidate the entire cache.The text was updated successfully, but these errors were encountered: