Add `read_failures` privilege for authorizing failure store access #119915

gwbrown · 2025-01-10T04:49:29Z

This commit adds the read_failures privilege and the logic supporting that privilege. The read_failures privilege enables read access to failure store indices owned by data streams named in the indices field of an indices privileges group, without implying read access to that data stream's "normal" backing indices. The read_failures privilege is not included in any other index privilege, including all.

This is a bit of a mismatch with the existing privilege model, which authorizes actions and indices orthogonally. As of this change, in order to fully authorize an action, both action name and requested indices must be considered. To these ends, this PR refactors and centralizes a fair amount of the index authorization logic, reducing duplication of code and work at runtime.

Non-read actions to failure store indices, such as management calls, are authorized the same as backing indices; authorization will be granted to manage failure store indices if the user has permission to manage the owning data stream. It is only data visibility that is gated behind the new permission.

Todos before merge:

Get remaining tests passing (see CI)
ReservedRolesStoreTests randomizes index abstraction types in a really weird way and causes deprecation warnings which might never happen in a real deployment, need to dig into this
More tests for failure store auth edge cases
Cleanup pass

This commit adds the `read_failures` privilege and the logic supporting that privilege. The `read_failures` privilege enables read access to failure store indices owned by data streams named in the `indices` field of an indices privileges group, without implying `read` access to that data stream's "normal" backing indices. This is a bit of a mismatch with the existing privilege model, which authorizes actions and indices orthogonally. As of this change, in order to fully authorize an action, *both* action name and requested indices must be considered. Non-read actions to failure store indices, such as management calls, are authorized the same as backing indices; authorization will be granted to manage failure store indices if the user has permission to manage the owning data stream. It is only data visibility that is gated behind the new permission.

github-actions · 2025-01-10T04:49:40Z

Documentation preview:

✨ Changed pages

…nto failure-store-authentication

…ication

…his test

…ication

gwbrown · 2025-01-17T04:00:13Z

@jbaiera and @jakelandis this is still quite rough around the edges but I'd like especially Jake to at least take a look at the core changes; it's possible (though I think unlikely) that this is contrary to larger goals for this code and I'd like to sort that out early. I'm going to get another CI run on this but I think the remaining failure store test failures may indicate a failure in my logic so tomorrow is digging into that.

elasticsearchmachine · 2025-01-17T04:01:07Z

Pinging @elastic/es-security (Team:Security)

elasticsearchmachine · 2025-01-17T04:01:59Z

Hi @gwbrown, I've created a changelog YAML for you.

…nto failure-store-authentication

jakelandis · 2025-01-18T01:05:51Z

As-is the code does not fully work to apply the new privilege, so it is pretty hard to review the code as-is. The existing code is already complex and tricky code and this adds that complexity and this change refactors signfigant portions of the authorization logic (which makes me nervous). There are still parts and reasons for which I don't fully understand however, I can comment on a few areas...

IIUC READ_FAILURES_AUTOMATON = patterns("internal:special/read_failures"); is intended to manipulate the action checks in buildIndexMatcherPredicateForAction such that the read_failures don't match the indices: actions ? This feel like it is creating a separate model for expressing access to indices and is kinda hacky. These types of one-off have a way of causing problems in the long term. I understand it is indeed different behavior, but this seems to run counter to much of the other existing design (internal/cluster/indices prefixes have useful sematics) and would prefer if we could have a model that held true to the model, but differs in implementation as needed.
Group (actually GroupChecker) is now more of a business class than a data object and not sure if this best place to house the logic relative the failure store. There is alot of new logic here and hard to know if this is all necessary I can't shake the thought that we could make minimal changes to the existing code such that we kept the existing code ignorant of the failure store (need to introduce code to keep it ignorant) and special case any failure store access concerns to new failure authorization step. It would like have a minor performance hit in that we would need to authz twice if failure store is requested and present.
I am not sure that the IndexAccessControl needs to be failure store aware. Ideally it is, but in practice it has limited usage and can get by without needing to use it, so if that is driving some decisions then maybe we should re-evaluate (I think we can get by without it, but would require a bit more thought)
We should also be explicit to prevent DLS/FLS for failure store access. Likely via FieldAndDocumentLevelSecurityRequestInterceptor
I think we need to consider RCS 2.0 to ensure that requests will not blow up if older cluster's see the new permission (likely just validation you can't add the new permission to the remote_indices permissions in the role until we are ready to support it).

Functionally:

"all" does not mean "all" anymore, instead you need to add "all + read_failures". This takes away from the simplicity of "all". Since you have to explicitly enable the failure store and change your query to include the values in your results, I think "read_failures" should be included in the all permission, perhaps with a note in the documentation to take this into consideration when enabling this. all is really an admin permission anyway (full CRUD permission for the index/data stream) and i assume that most normal users would get the read permission which does not by default include the read_failures (which is I agree with).

Do we need any special auditing ?

Finally, I wonder if we could short-circuit much of this complexity if we changed the way we express the permission. Perhaps a new boolean much like allow_restricted_indices, allow_failure_store and just mirror the implementation of restricted indices. We would need to resolving the concrete indices of the data stream and determine if concrete indices are from a failure store, but IIUC the strategy of marking those up to never match without the boolean could work without a fundamental change (I think)?

…ication

elasticsearchmachine added the v9.0.0 label Jan 10, 2025

elasticsearchmachine and others added 6 commits January 10, 2025 04:55

[CI] Auto commit changes from spotless

3ac44e9

Finish lining up the two kinds of predicate

47f150e

Fixup the logging a bit + test fixes

1982bf1

Merge remote-tracking branch 'gwbrown/failure-store-authentication' i…

0afd915

…nto failure-store-authentication

Merge remote-tracking branch 'origin/main' into failure-store-authent…

1ae86a2

…ication

more fixes + test fixes

a280d50

gwbrown mentioned this pull request Jan 11, 2025

Add read_failures privilege to authorize reading the failure store #116860

Closed

gwbrown added 18 commits January 10, 2025 21:33

Move toward test fixes

47c6e74

Don't create unnecessary predicates

a637934

Extract IsResourceAuthorizedPredicate to own file

eb9add0

Extract Group to independent class

4f2c06e

Rework IsResourceAuthorizedPredicate to do the logic directly

c7ab634

Selectively authorize through data streams

7aaad4f

Tweaks to get things back in line with unit tests

0e60d07

Merge remote-tracking branch 'origin/main' into failure-store-authent…

1410b1a

…ication

Looks like I accidentally fixed some NPEs that were load bearing in t…

5915f46

…his test

Order changed here

c189dbb

Get the test passing, with a todo

73eab0b

A few more tweaks per test findings

0ffd592

Merge remote-tracking branch 'origin/main' into failure-store-authent…

b241864

…ication

Reinstate put mapping alias hack in resolver

e6db465

Clean up some unused methods

88cca45

Merge remote-tracking branch 'origin/main' into failure-store-authent…

35a0d4c

…ication

imports

d59d0f9

ok but do the check right this time

762a84f

gwbrown mentioned this pull request Jan 16, 2025

[Failure store] Add failure index retrieval to IndexAbstraction #119413

Merged

More fixes to issues raised in tests and fixes to tests

da9a00a

gwbrown added 6 commits January 16, 2025 15:44

Merge remote-tracking branch 'origin/main' into failure-store-authent…

cd934b6

…ication

Post-merge fixup

583b0c7

Plaster over warnings in ReservedRolesStoreTests temporarily

a8a50ec

Add read_failures to default REST test role

487d17a

Add read_failures to default REST test role, in the other place

64eb116

Merge remote-tracking branch 'origin/main' into failure-store-authent…

a5a3923

…ication

gwbrown marked this pull request as ready for review January 17, 2025 04:00

gwbrown requested a review from a team as a code owner January 17, 2025 04:00

gwbrown added :Security/Authorization Roles, Privileges, DLS/FLS, RBAC/ABAC v8.18.0 labels Jan 17, 2025

elasticsearchmachine added the needs:triage Requires assignment of a team area label label Jan 17, 2025

gwbrown removed the needs:triage Requires assignment of a team area label label Jan 17, 2025

gwbrown requested review from jbaiera and jakelandis January 17, 2025 04:00

elasticsearchmachine added the Team:Security Meta label for security team label Jan 17, 2025

gwbrown added the >enhancement label Jan 17, 2025

gwbrown changed the title ~~Add read_failures privilege for authorizing failure store~~ Add read_failures privilege for authorizing failure store access Jan 17, 2025

Update docs/changelog/119915.yaml

441c93a

gwbrown added 2 commits January 16, 2025 21:03

spotless

86c115d

Merge remote-tracking branch 'gwbrown/failure-store-authentication' i…

2aa8c14

…nto failure-store-authentication

gwbrown requested a review from a team as a code owner January 17, 2025 04:05

gwbrown added 2 commits January 19, 2025 18:15

Fixes + tests for fixes for core failure store authentication

a9347c9

Merge remote-tracking branch 'origin/main' into failure-store-authent…

f3cdc26

…ication

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add `read_failures` privilege for authorizing failure store access #119915

Add `read_failures` privilege for authorizing failure store access #119915

gwbrown commented Jan 10, 2025 •

edited

Loading

github-actions bot commented Jan 10, 2025

gwbrown commented Jan 17, 2025

elasticsearchmachine commented Jan 17, 2025

elasticsearchmachine commented Jan 17, 2025

jakelandis commented Jan 18, 2025

Add read_failures privilege for authorizing failure store access #119915

Are you sure you want to change the base?

Add read_failures privilege for authorizing failure store access #119915

Conversation

gwbrown commented Jan 10, 2025 • edited Loading

github-actions bot commented Jan 10, 2025

gwbrown commented Jan 17, 2025

elasticsearchmachine commented Jan 17, 2025

elasticsearchmachine commented Jan 17, 2025

jakelandis commented Jan 18, 2025

Add `read_failures` privilege for authorizing failure store access #119915

Add `read_failures` privilege for authorizing failure store access #119915

gwbrown commented Jan 10, 2025 •

edited

Loading