Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gh app backfill task extra #413

Merged
merged 17 commits into from
May 2, 2024
Merged

Gh app backfill task extra #413

merged 17 commits into from
May 2, 2024

Conversation

adrian-codecov
Copy link
Contributor

@adrian-codecov adrian-codecov commented Apr 24, 2024

Last time this was ran the tasks ran out of memory. I'm wanting to do 2 things to help with this:

  1. add yield_per -> helps fetch smaller amount of data per iteration
  2. separate the two backfill scenarios into two respective tasks -> more control about what's ran and what isnt

Logic remains the same, mostly moving files and tests arounds, renaming files/tasks, etc

  • Created 2 new tasks, backfill_existing_gh_app_installations and backfill_owners_withou_gh_app. It separates the original task into 2 pieces.
    • Added yield_per() in both tasks
  • Removed the original backfill_gh_apps_task in favor of the two above
  • Reshuffled tests into their respective test files
  • Centralized shared logic into helpers/backfills.py file

Legal Boilerplate

Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. In 2022 this entity acquired Codecov and as result Sentry is going to need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.

new_repo_service_ids=new_repo_service_ids,
),
)
gh_app_installation.repository_service_ids = new_repo_service_ids
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this overwrite what might be already there? Or are you sure at this point that the repository_service_ids is an empty list (when it shouldn't be)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this won't overwrite the things we have, see here for example:

is_selection_all = maybe_set_installation_to_all_repos(
db_session=db_session,
. We first check if the selection is all, else get independent repos for each app. Same happens for the other task,
if not is_selection_all:
# Find and add all repos the gh app has access to

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant is, for example, if some installation G has repository_service_ids={1200}. Then this logic would overwrite the current selection {1200}.

But it doesn't really matter cause you are listing all the repos the installation has access to, which would include 1200.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this was part of the initial decision here, to also overwrite anything existing in case those were stagnant. And as you said, if it's "all", then that would contemplate anything that was already in

helpers/backfills.py Show resolved Hide resolved
helpers/backfills.py Outdated Show resolved Hide resolved
tasks/backfill_existing_gh_app_installations.py Outdated Show resolved Hide resolved
tasks/backfill_existing_gh_app_installations.py Outdated Show resolved Hide resolved
@adrian-codecov
Copy link
Contributor Author

Worth noting there's a chance this could still error out due to memory, in which case I'd further add subtasks to each of these, but wanted to try this first

@codecov-notifications
Copy link

codecov-notifications bot commented Apr 25, 2024

Codecov Report

Attention: Patch coverage is 96.29630% with 8 lines in your changes are missing coverage. Please review.

✅ All tests successful. No failed tests found.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #413      +/-   ##
==========================================
+ Coverage   97.44%   97.48%   +0.03%     
==========================================
  Files         395      398       +3     
  Lines       33422    33379      -43     
==========================================
- Hits        32568    32539      -29     
+ Misses        854      840      -14     
Flag Coverage Δ
integration 97.48% <96.29%> (+0.03%) ⬆️
latest-uploader-overall 97.48% <96.29%> (+0.03%) ⬆️
unit 97.48% <96.29%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
NonTestCode 94.84% <93.22%> (+0.09%) ⬆️
OutsideTasks 97.59% <100.00%> (+0.03%) ⬆️
Files Coverage Δ
celery_config.py 68.49% <100.00%> (+0.43%) ⬆️
helpers/backfills.py 100.00% <100.00%> (ø)
tasks/__init__.py 100.00% <100.00%> (ø)
...nit/test_backfill_existing_gh_app_installations.py 100.00% <100.00%> (ø)
...st_backfill_owners_without_gh_app_installations.py 100.00% <100.00%> (ø)
tasks/backfill_existing_gh_app_installations.py 90.69% <90.69%> (ø)
...ks/backfill_owners_without_gh_app_installations.py 90.69% <90.69%> (ø)

... and 13 files with indirect coverage changes

@codecov-qa
Copy link

codecov-qa bot commented Apr 25, 2024

Codecov Report

Attention: Patch coverage is 96.26168% with 8 lines in your changes are missing coverage. Please review.

Project coverage is 97.44%. Comparing base (4b32438) to head (c8b0b4f).

✅ All tests successful. No failed tests found.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #413   +/-   ##
=======================================
  Coverage   97.44%   97.44%           
=======================================
  Files         395      398    +3     
  Lines       33430    33431    +1     
=======================================
+ Hits        32577    32578    +1     
  Misses        853      853           
Flag Coverage Δ
integration 97.44% <96.26%> (+<0.01%) ⬆️
latest-uploader-overall 97.44% <96.26%> (+<0.01%) ⬆️
unit 97.44% <96.26%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
NonTestCode 94.76% <93.10%> (+<0.01%) ⬆️
OutsideTasks 97.56% <100.00%> (+<0.01%) ⬆️
Files Coverage Δ
celery_config.py 68.49% <100.00%> (+0.43%) ⬆️
helpers/backfills.py 100.00% <100.00%> (ø)
tasks/__init__.py 100.00% <100.00%> (ø)
...nit/test_backfill_existing_gh_app_installations.py 100.00% <100.00%> (ø)
...st_backfill_owners_without_gh_app_installations.py 100.00% <100.00%> (ø)
tasks/backfill_existing_gh_app_installations.py 90.47% <90.47%> (ø)
...ks/backfill_owners_without_gh_app_installations.py 90.47% <90.47%> (ø)

Copy link

codecov-public-qa bot commented Apr 25, 2024

Codecov Report

Attention: Patch coverage is 96.26168% with 8 lines in your changes are missing coverage. Please review.

Project coverage is 97.44%. Comparing base (4b32438) to head (c8b0b4f).

✅ All tests successful. No failed tests found ☺️

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #413   +/-   ##
=======================================
  Coverage   97.44%   97.44%           
=======================================
  Files         395      398    +3     
  Lines       33430    33431    +1     
=======================================
+ Hits        32577    32578    +1     
  Misses        853      853           
Flag Coverage Δ
integration 97.44% <96.26%> (+<0.01%) ⬆️
latest-uploader-overall 97.44% <96.26%> (+<0.01%) ⬆️
unit 97.44% <96.26%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
NonTestCode 94.76% <93.10%> (+<0.01%) ⬆️
OutsideTasks 97.56% <100.00%> (+<0.01%) ⬆️
Files Coverage Δ
celery_config.py 68.49% <100.00%> (+0.43%) ⬆️
helpers/backfills.py 100.00% <100.00%> (ø)
tasks/__init__.py 100.00% <100.00%> (ø)
...nit/test_backfill_existing_gh_app_installations.py 100.00% <100.00%> (ø)
...st_backfill_owners_without_gh_app_installations.py 100.00% <100.00%> (ø)
tasks/backfill_existing_gh_app_installations.py 90.47% <90.47%> (ø)
...ks/backfill_owners_without_gh_app_installations.py 90.47% <90.47%> (ø)

Copy link

codecov bot commented Apr 25, 2024

Codecov Report

Attention: Patch coverage is 96.26168% with 8 lines in your changes are missing coverage. Please review.

Project coverage is 97.47%. Comparing base (4b32438) to head (c8b0b4f).

✅ All tests successful. No failed tests found.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #413   +/-   ##
=======================================
  Coverage   97.47%   97.47%           
=======================================
  Files         426      429    +3     
  Lines       34121    34122    +1     
=======================================
+ Hits        33259    33260    +1     
  Misses        862      862           
Flag Coverage Δ
integration 97.44% <96.26%> (+<0.01%) ⬆️
latest-uploader-overall 97.44% <96.26%> (+<0.01%) ⬆️
unit 97.44% <96.26%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
NonTestCode 94.78% <93.10%> (+<0.01%) ⬆️
OutsideTasks 97.56% <100.00%> (+<0.01%) ⬆️
Files Coverage Δ
celery_config.py 68.49% <100.00%> (+0.43%) ⬆️
helpers/backfills.py 100.00% <100.00%> (ø)
tasks/__init__.py 100.00% <100.00%> (ø)
...nit/test_backfill_existing_gh_app_installations.py 100.00% <100.00%> (ø)
...st_backfill_owners_without_gh_app_installations.py 100.00% <100.00%> (ø)
tasks/backfill_existing_gh_app_installations.py 90.47% <90.47%> (ø)
...ks/backfill_owners_without_gh_app_installations.py 90.47% <90.47%> (ø)

This change has been scanned for critical changes. Learn more

Copy link
Contributor

@giovanni-guidini giovanni-guidini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, left a comment though

name=GITHUB_APP_INSTALLATION_DEFAULT_NAME,
)
db_session.add(gh_app_installation)
db_session.commit()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should wait to commit this change only after the repository selection is complete. Otherwise if there are issues with the repo selection we'd have a github installation with bad repo selection

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. If I get rid of this line, both subsequent methods commit there so that takes care of this

@adrian-codecov adrian-codecov disabled auto-merge May 2, 2024 04:48
@adrian-codecov adrian-codecov enabled auto-merge May 2, 2024 04:48
@adrian-codecov adrian-codecov added this pull request to the merge queue May 2, 2024
Merged via the queue into main with commit d4496ed May 2, 2024
22 of 25 checks passed
@adrian-codecov adrian-codecov deleted the gh-app-backfill-task-extra branch May 2, 2024 08:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants