Remove polling loop for job finishing event processing #15811

AlanCoding · 2025-02-04T13:59:21Z

SUMMARY

From the same conversation that led to #15805

This tries to remove the largest remaining liability I can think of with this feature: a polling loop that waited for events to finish processing. It had a timeout, but still...

I can easily think of a situation where the callback receiver is down, or otherwise completely overwhelmed. That means that almost all jobs go into this polling loop, and that could result in saturating max workers, which leads to generalized failure.
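To make the saturation concern concrete, here is a minimal sketch. It is not the AWX code: the pool size, timeout, and function names are all made up for illustration. Each "job" polls for a condition that never becomes true (as if the callback receiver were down), so every poll holds a worker for the full timeout.

```python
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 2  # hypothetical small worker pool, for illustration only


def poll_until_processed(is_done, timeout=0.2, interval=0.01):
    """Block the calling worker until is_done() is true or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if is_done():
            return True
        time.sleep(interval)
    return False


def demo_saturation(num_jobs=4):
    """Simulate jobs whose events never finish processing."""
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        start = time.monotonic()
        # is_done always returns False, standing in for a dead event consumer.
        futures = [pool.submit(poll_until_processed, lambda: False) for _ in range(num_jobs)]
        results = [f.result() for f in futures]
        elapsed = time.monotonic() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = demo_saturation()
    print(results, round(elapsed, 2))
```

With 4 jobs and 2 workers, the jobs run in two batches and each batch waits out the whole timeout, so no other work can be picked up in the meantime.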

It's somewhat unclear how often the post_run_hook will work effectively to process events, but from the dev environment:

tools_awx_1       | 2025-02-04 13:41:00,291 INFO     [-] awx.analytics.job_lifecycle job-6 finished processing 12 events, running save indirect host entries {"type": "job", "task_id": 6, "state": "finished processing 12 events, running save_indirect_host_entries", "work_unit_id": "awx12G4kJ7L9", "task_name": "test_indirect_host_counting JT: run_task.yml"}

It does seem to go directly into processing events.
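As a rough sketch of the non-polling alternative, assuming the follow-up work can be triggered by whichever side finishes last (the job run or the event saving): the class and method names below are hypothetical stand-ins, not the AWX models or hooks.

```python
import threading


class FakeJob:
    """Hypothetical stand-in for a job record; not the AWX model."""

    def __init__(self, expected_events):
        self.expected_events = expected_events
        self.processed_events = 0
        self.run_finished = False
        self.follow_up_count = 0  # how many times the follow-up work ran
        self._lock = threading.Lock()

    def _maybe_run_follow_up(self):
        # Gate: proceed only when the run is over AND every event is saved,
        # and only once.
        if (
            self.run_finished
            and self.processed_events >= self.expected_events
            and self.follow_up_count == 0
        ):
            self.follow_up_count += 1  # stands in for launching the follow-up task

    def on_run_finished(self):
        """Called by the side that ran the job."""
        with self._lock:
            self.run_finished = True
            self._maybe_run_follow_up()

    def on_event_saved(self):
        """Called by the side that saves events."""
        with self._lock:
            self.processed_events += 1
            self._maybe_run_follow_up()
```

Neither caller waits: each one records its own progress, checks the gate, and returns. The follow-up runs exactly once regardless of which side finishes last.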

ISSUE TYPE

Bug, Docs Fix or other nominal change

COMPONENT NAME

API

codecov · 2025-02-04T14:17:41Z

Codecov Report

Attention: Patch coverage is 62.50000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 75.25%. Comparing base (b2d887b) to head (9fa6579).
Report is 6 commits behind head on feature_indirect-host-counting.

✅ All tests successful. No failed tests found.

❌ Your project check has failed because the head coverage (75.25%) is below the target coverage (100.00%). You can increase the head coverage or adjust the target coverage.

AlanCoding · 2025-02-04T15:24:23Z

The test failed for what seems like a very surprising reason. I'm guessing that it has to do with me adding additional gating before calling the task in events_processed_hook.

Consulting the logs, it did hit our logic inside of that for several jobs (ids 2, 3, and 12). For those jobs, it seems to have worked correctly.

So my theory is that those jobs hit the logic via the job control task, and that our particular test hit it via the callback receiver. If using the callback receiver, it's very possible that you're operating with old model data, so I pushed a commit to hopefully fix that.
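The stale-data theory can be illustrated with a small sketch. This is an assumption about the failure mode, not the actual commit: the dict-backed "table" and the class below are hypothetical. In Django, the equivalent of `refresh()` here would be `Model.refresh_from_db()`.

```python
DB = {}  # hypothetical stand-in for a database table keyed by job id


class JobSnapshot:
    """In-memory copy of a row, like a model instance loaded earlier."""

    def __init__(self, job_id):
        self.id = job_id
        self.refresh()

    def refresh(self):
        # Re-read the current row so the in-memory copy is not stale.
        self.status = DB[self.id]["status"]


def gate_passes(job, refresh_first):
    """Return True if the job looks finished, optionally re-reading it first."""
    if refresh_first:
        job.refresh()
    return job.status == "successful"


def demo():
    DB[1] = {"status": "running"}
    job = JobSnapshot(1)            # loaded while the job was still running
    DB[1]["status"] = "successful"  # another process updates the row
    stale = gate_passes(job, refresh_first=False)
    fresh = gate_passes(job, refresh_first=True)
    return stale, fresh
```

In this sketch the gate fails on the stale copy and passes once the object is refreshed, which matches the behavior described above for a caller holding old model data.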

sonarqubecloud · 2025-02-04T15:27:00Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

AlanCoding · 2025-02-04T19:06:59Z

Now I'm thinking the problem might be related to #15780

Really, we should be able to do anything about what we're doing without that.

Remove polling loop for job finishing event processing

2ff6cb7

github-actions bot added the component:api label Feb 4, 2025

AlanCoding force-pushed the cant_wait branch from 62e4e32 to 2ff6cb7 Compare February 4, 2025 13:59

linter

47af2f4

pb82 approved these changes Feb 4, 2025

Address race condition

5809b9a

make this more efficient

9fa6579

AlanCoding mentioned this pull request Feb 6, 2025

Feature indirect host counting #15802

Open
