Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add read_failures privilege for authorizing failure store access #119915

Open
wants to merge 37 commits into
base: main
Choose a base branch
from

Conversation

gwbrown
Copy link
Contributor

@gwbrown gwbrown commented Jan 10, 2025

This commit adds the read_failures privilege and the logic supporting that privilege. The read_failures privilege enables read access to failure store indices owned by data streams named in the indices field of an indices privileges group, without implying read access to that data stream's "normal" backing indices. The read_failures privilege is not included in any other index privilege, including all.

This is a bit of a mismatch with the existing privilege model, which authorizes actions and indices orthogonally. As of this change, in order to fully authorize an action, both action name and requested indices must be considered. To these ends, this PR refactors and centralizes a fair amount of the index authorization logic, reducing duplication of code and work at runtime.

Non-read actions to failure store indices, such as management calls, are authorized the same as backing indices; authorization will be granted to manage failure store indices if the user has permission to manage the owning data stream. It is only data visibility that is gated behind the new permission.


Todos before merge:

  • Get remaining tests passing (see CI)
  • ReservedRolesStoreTests randomizes index abstraction types in a really weird way and causes deprecation warnings which might never happen in a real deployment, need to dig into this
  • More tests for failure store auth edge cases
  • Cleanup pass

This commit adds the `read_failures` privilege and the logic supporting
that privilege. The `read_failures` privilege enables read access to
failure store indices owned by data streams named in the `indices` field
of an indices privileges group, without implying `read` access to that
data stream's "normal" backing indices.

This is a bit of a mismatch with the existing privilege model, which
authorizes actions and indices orthogonally. As of this change, in order
to fully authorize an action, *both* action name and requested indices
must be considered.

Non-read actions to failure store indices, such as management calls,
are authorized the same as backing indices; authorization will be
granted to manage failure store indices if the user has permission to
manage the owning data stream. It is only data visibility that is gated
behind the new permission.
Copy link
Contributor

Documentation preview:

@gwbrown
Copy link
Contributor Author

gwbrown commented Jan 17, 2025

@jbaiera and @jakelandis this is still quite rough around the edges but I'd like especially Jake to at least take a look at the core changes; it's possible (though I think unlikely) that this is contrary to larger goals for this code and I'd like to sort that out early. I'm going to get another CI run on this but I think the remaining failure store test failures may indicate a failure in my logic so tomorrow is digging into that.

@gwbrown gwbrown marked this pull request as ready for review January 17, 2025 04:00
@gwbrown gwbrown requested a review from a team as a code owner January 17, 2025 04:00
@gwbrown gwbrown added :Security/Authorization Roles, Privileges, DLS/FLS, RBAC/ABAC v8.18.0 labels Jan 17, 2025
@elasticsearchmachine elasticsearchmachine added the needs:triage Requires assignment of a team area label label Jan 17, 2025
@gwbrown gwbrown removed the needs:triage Requires assignment of a team area label label Jan 17, 2025
@gwbrown gwbrown requested review from jbaiera and jakelandis January 17, 2025 04:00
@elasticsearchmachine elasticsearchmachine added the Team:Security Meta label for security team label Jan 17, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-security (Team:Security)

@gwbrown gwbrown changed the title Add read_failures privilege for authorizing failure store Add read_failures privilege for authorizing failure store access Jan 17, 2025
@elasticsearchmachine
Copy link
Collaborator

Hi @gwbrown, I've created a changelog YAML for you.

@gwbrown gwbrown requested a review from a team as a code owner January 17, 2025 04:05
@jakelandis
Copy link
Contributor

As-is the code does not fully work to apply the new privilege, so it is pretty hard to review the code as-is. The existing code is already complex and tricky code and this adds that complexity and this change refactors signfigant portions of the authorization logic (which makes me nervous). There are still parts and reasons for which I don't fully understand however, I can comment on a few areas...

  • IIUC READ_FAILURES_AUTOMATON = patterns("internal:special/read_failures"); is intended to manipulate the action checks in buildIndexMatcherPredicateForAction such that the read_failures don't match the indices: actions ? This feel like it is creating a separate model for expressing access to indices and is kinda hacky. These types of one-off have a way of causing problems in the long term. I understand it is indeed different behavior, but this seems to run counter to much of the other existing design (internal/cluster/indices prefixes have useful sematics) and would prefer if we could have a model that held true to the model, but differs in implementation as needed.
  • Group (actually GroupChecker) is now more of a business class than a data object and not sure if this best place to house the logic relative the failure store. There is alot of new logic here and hard to know if this is all necessary I can't shake the thought that we could make minimal changes to the existing code such that we kept the existing code ignorant of the failure store (need to introduce code to keep it ignorant) and special case any failure store access concerns to new failure authorization step. It would like have a minor performance hit in that we would need to authz twice if failure store is requested and present.
  • I am not sure that the IndexAccessControl needs to be failure store aware. Ideally it is, but in practice it has limited usage and can get by without needing to use it, so if that is driving some decisions then maybe we should re-evaluate (I think we can get by without it, but would require a bit more thought)
  • We should also be explicit to prevent DLS/FLS for failure store access. Likely via FieldAndDocumentLevelSecurityRequestInterceptor
  • I think we need to consider RCS 2.0 to ensure that requests will not blow up if older cluster's see the new permission (likely just validation you can't add the new permission to the remote_indices permissions in the role until we are ready to support it).

Functionally:

"all" does not mean "all" anymore, instead you need to add "all + read_failures". This takes away from the simplicity of "all". Since you have to explicitly enable the failure store and change your query to include the values in your results, I think "read_failures" should be included in the all permission, perhaps with a note in the documentation to take this into consideration when enabling this. all is really an admin permission anyway (full CRUD permission for the index/data stream) and i assume that most normal users would get the read permission which does not by default include the read_failures (which is I agree with).

Do we need any special auditing ?

Finally, I wonder if we could short-circuit much of this complexity if we changed the way we express the permission. Perhaps a new boolean much like allow_restricted_indices, allow_failure_store and just mirror the implementation of restricted indices. We would need to resolving the concrete indices of the data stream and determine if concrete indices are from a failure store, but IIUC the strategy of marking those up to never match without the boolean could work without a fundamental change (I think)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement :Security/Authorization Roles, Privileges, DLS/FLS, RBAC/ABAC Team:Security Meta label for security team v8.18.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants