Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Successfully test collection of event_query.yml data #15761

Merged
merged 2 commits into from
Jan 24, 2025

Conversation

AlanCoding
Copy link
Member

@AlanCoding AlanCoding commented Jan 21, 2025

SUMMARY

This accomplishes passing the test:

py.test tests/projects/test_indirect_host_counting.py

The scaffolding of the test was mostly set up in prior merge #15754 and this fills in some of the TODO there. Basically, this ships the data in the collection event_query.yml data from ansible-runner back to AWX. Qualifier: this uses internal ansible-core methods, and this is ultimately not acceptable, so we'll table that.

After this, what is needed:

  • Put this behind the feature flag Establish a feature flag for indirect host counting feature #15759, or maybe only the event_query part. Needs discussion.
  • Actually process the data via a system task
    • Add new table with all the fields prescribed in the proposal
    • Add a solo model to track last run time
    • Add a new periodic system task (every 1 hour) that gets all job since the last-run time, and from that job, loads the installed_collections, and from that, gets all collection event_query.yml data. Then it loops over all job event data and applies JQ logic to get the data to save into the new model, and bulk inserts it.
  • Set up collectors from the new model

I'm still a little concerned about performance on a number of items here.

This is somewhat competitive with @pb82's draft #15750, and it is derived from @chrismeyersfsu still-earlier PoC https://github.com/chrismeyersfsu/indirect-instance-count. But importantly, and strategically, it merges the solution to this problem with the solution to ansible/ansible-runner#1273

This PR also willfully disregards the decision to put this logic in ansible-runner's (stdout) callback plugin. There are 2 reasons for that. Firstly, this is way way easier to fully test right now, as it plugs into the file:// based testing system in the live tests, so we can go full cycle. Secondly, there may be resistance to making this an ansible-runner or ansible-core "thing" and rightfully deserves to live on the AWX software layer, which is an argument I am re-stating from @sivel. It's not a bad argument, and my intent with this PoC is to clearly show an interface where the AWX task system sets up collection, and then receives collected data in the artifacts folder. It's clear what we're collecting, and we can adhere to whatever the updated contract will be.

ISSUE TYPE
  • New or Enhanced Feature
COMPONENT NAME
  • API

@@ -948,6 +945,10 @@ def build_env(self, job, private_data_dir, private_data_files=None):
paths = [os.path.join(CONTAINER_ROOT, folder)] + paths
env[env_key] = os.pathsep.join(paths)

env['ANSIBLE_CALLBACKS_ENABLED'] = 'indirect_instance_count'
if 'callbacks_enabled' in config_values:
env['ANSIBLE_CALLBACKS_ENABLED'] += ':' + config_values['callbacks_enabled']
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jessicamack around here, in particular, we would like to check the feature flag to tell if the indirect host counting is enabled. How do I do that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with open(data_file_path) as f:
collected_data = json.loads(f.read())

instance.installed_collections = collected_data['installed_collections']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could throw a ValueError if the plugin returns invalid or out-of-spec json.


if os.path.exists(collections_info):
with open(collections_info) as ee_json_info:
ee_collections_info = json.loads(ee_json_info.read())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this mechanism already exists? Or was it unused so far?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It existed, but was broken with a particular ansible-runner version.

The design problem is that we can't honestly collect the event_query data without also indirectly collecting the collection list (which is what this did, but is broken). So it would be awfully neglectful to throw in the new loading logic right next to the old, broken, artifact file loading which is the same in form.

We still have a software coordination problem, which is that supposed there is a shared problem which will lead to ansible-runner developing a new solution, and that the collection listing this uses is a non-public API... but categorically, if there is no public API for the collection list, this sub-collection detail of the event_query can't be gathered in a kosher manner.

Let me jump to the conclusion - we finish up this approach and call it the "beta" version. It will be fully functional, but use non-public APIs. Then, we will have a TODO when the public API (new callback methods) are delivered from ansible-runner. We may need to have an import test / fallback to switch between the "beta" behavior and the accepted solution.

def v2_playbook_on_stats(self, stats):
artifact_dir = os.getenv('AWX_ISOLATED_DATA_DIR')
if not artifact_dir:
raise RuntimeError('Only suitable in AWX, did not find private_data_dir')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does Only suitable in AWX mean here? Wouldn't this plugin be only suitable for AAP instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're writing a plugin that is usable generally in an ansible-playbook run. This error should be seen if someone ever does that (enable the plugin, use in ansible-playbook) outside of a AWX/controller job.

@AlanCoding AlanCoding changed the base branch from devel to indirect-host-counting January 23, 2025 16:40
from ansible.cli.galaxy import with_collection_artifacts_manager
from ansible.release import __version__

from ansible.galaxy.collection import find_existing_collections
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: will have to wrap this in a try-except, not currently a public API. Do simple warning of some type if it doesn't work.

@AlanCoding AlanCoding changed the title [PoC][DO NOT MERGE] Successfully test collection of event_query.yml data Successfully test collection of event_query.yml data Jan 23, 2025
@AlanCoding AlanCoding marked this pull request as ready for review January 24, 2025 15:32
Get tests passing

Mild rebranding

Put behind feature flag, flip true in dev

Add noqa flag
@AlanCoding AlanCoding merged commit 6fc3461 into ansible:indirect-host-counting Jan 24, 2025
10 of 20 checks passed
pb82 pushed a commit that referenced this pull request Jan 27, 2025
* Callback plugin method from cmeyers adapted to global collection list

Get tests passing

Mild rebranding

Put behind feature flag, flip true in dev

Add noqa flag

* Add missing wait_for_events
AlanCoding added a commit to AlanCoding/awx that referenced this pull request Jan 29, 2025
* Callback plugin method from cmeyers adapted to global collection list

Get tests passing

Mild rebranding

Put behind feature flag, flip true in dev

Add noqa flag

* Add missing wait_for_events
pb82 pushed a commit that referenced this pull request Jan 29, 2025
* Callback plugin method from cmeyers adapted to global collection list

Get tests passing

Mild rebranding

Put behind feature flag, flip true in dev

Add noqa flag

* Add missing wait_for_events
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants