Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disallow failure of the coverage build in Travis? #101

Closed
zware opened this issue May 28, 2017 · 9 comments
Closed

Disallow failure of the coverage build in Travis? #101

zware opened this issue May 28, 2017 · 9 comments

Comments

@zware
Copy link
Member

zware commented May 28, 2017

Since python/cpython#1775 was merged, the coverage build in Travis actually passes all tests. Should we remove the allow_failures key to make sure that it continues to do so? That would mean an extra ~15 minutes between commit and approval from Travis, but as it's already ~20 minutes, I don't think it's that big of a deal. In my experience, I'm not going to sit around waiting for 20 minutes anyway; I'll have moved on to something else long before then and probably won't check back until long after the build has finished.

@brettcannon
Copy link
Member

So you identified one worry of the extended build time. But I have another worry of coverage.py accidentally breaking us like Sphinx did and thus having spontaneous PR failures.

Now we could pin coverage.py to a known good version. If someone adds a feature to Python that completely breaks coverage.py we could then consider having that PR include turning back on failure allowance.

So I guess in the face of that my only worry is the time. Maybe if we can somehow speed up the coverage run? I haven't tried running it under -j to speed things up (and to possibly allow turning back on the skipped tests that take a while). Coverage has subprocess support so it should be doable to get coverage.py to work with test -j0 without actually editing regrtest itself.

@gpshead
Copy link
Member

gpshead commented May 28, 2017

I don't feel like a coverage run should be a change blocker. It is an advisory check.

Putting it in the critical path would slow down changes being made that would be unlikely to ever have an impact on coverage in the first place (documentation, C code, internals, minor bug fixes). People who have moved on to something else and check back (as is common) will still have the coverage report available if they believe it makes sense to look at.

@vstinner
Copy link
Member

vstinner commented Sep 5, 2017

I just proposed to drop the macOS job: https://bugs.python.org/issue31355

About the "coverage job", it tests different things:

  • It compiles CPython using GCC instead of Clang and so gets differerent warnings, this part is already useful
  • It runs the tests sequentially, which catchs bugs which are not seen when running tests in parallel

So I consider that it's useful.

The main issue is that failures on optional jobs (allowed failures) are not reported to GitHub PR, so most people (including me) will not notice them.

For example, the coverage job was broken since May 2016 on Python 3.6 and nobody noticed it...

@brettcannon
Copy link
Member

The real problem with the coverage job is simply the amount of time that it takes. While it would be great to surface when it breaks, I currently don't know how to do that without making it mandatory (which will delay PRs turning green) or turning back on codecov comments (which people hated because the coverage report currently isn't stable; tracking that indirectly in #38 ).

@vstinner
Copy link
Member

The real problem with the coverage job is simply the amount of time that it takes.

CI timing is not stable at all. But in my experience, it's not uncommon that Travis CI ends before AppVeyor, especially since @zooba added the VS 2017 build on AppVeyor.

@brettcannon
Copy link
Member

You're right that it's not stable, but the coverage run is always slower than the standard test run on Travis just from the simple fact that the former is serially run while the latter is parallelized. I tried to make the coverage run work with -j but I was unsuccessful.

@vstinner
Copy link
Member

A single sample of CI timings:

  • AppVeyor, VS 2015: "Total duration: 4 min 22 sec"
  • Travis CI, clang job: "Total duration: 11 min 38 sec"
  • Travis CI, GCC/Coverage job: "Total duration: 14 min 33 sec"

I don't understand how, but the GCC/Coverage (which runs tests sequentially) is not so much slower than the clang job (which uses -j4 and so runs 4 test runners in parallel). FYI os.cpu_count() returns 32 on Travis CI (containers).

@brettcannon
Copy link
Member

Interesting! Looking at https://travis-ci.org/python/cpython/builds it looks like the total time is still just shy of 40 minutes when doing a full run. It used to be the case that was because the coverage build started later than the other builds, but maybe that's no longer true?

As for the -j 4 bit, that's because Travis claims there are only 2 cores (so 4 hyperthreaded cores) each: https://docs.travis-ci.com/user/reference/overview/ . Maybe their container setup has shifted? Could always try removing the restriction in a PR and see what happens in terms of time.

@nightlark
Copy link

Is this issue still relevant? As of python/cpython#25679 it looks like Travis no longer runs coverage builds.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants