-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate to UV #2282
Migrate to UV #2282
Conversation
refs 3877e6b
some of these are packages we don't use any more some of them were pins we put in requirements.txt to force a package we directly depend on to use a specific version of a transitive dependency (workaround for no lockfile) none of them are needed any more
mock has been part of the standard library for many years we no longer need this compatibility shim
By default the curl installer installs into $HOME/.cargo/bin this makes it a bit awakward to install as one user and then call it as another. You can work around this, but given uv has zero python dependencies, installing it into the global env with pip won't conflict with anything and saves some faffing about with paths and permissions. Installing it as root into the system env makes it available if we log in with either the ubuntu or every_election user account
.circleci/config.yml
Outdated
. .venv/bin/activate | ||
pip install -r requirements/cdk.txt | ||
uv sync --extra cdk --no-dev |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed, uv sync --extra cdk --no-dev
here will install all the main packages as well as the cdk extras.
Nothing bad will happen. We're just installing more stuff than we really need to.
Upstream feature request: astral-sh/uv#8292
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: We should now be able to upgrade to UV 0.4.27 which adds the --only-group
flag and fix this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so there's a lot going on in c5b78be .
As of UV 0.4.27
https://github.com/astral-sh/uv/releases/tag/0.4.27
UV now has dependency-groups
: https://docs.astral.sh/uv/concepts/dependencies/#dependency-groups
This is a distinct feature from optional-dependencies
: https://docs.astral.sh/uv/concepts/dependencies/#optional-dependencies
..although there is obviously a lot of overlap between these two concepts.
I think if you're using uv as an application package manager, they basically just reduce to 2 overlapping ways to accomplish almost the same thing.
If you're using uv to build packages, optional-dependencies
allow you to define optional groups that are "public" to your user so you can pip install package[optional]
, whereas dependency-groups
are "private".
Switching from optional-dependencies
to dependency-groups
means we can now use the --only-group
flag to just install CDK dependencies, which is good. There's no equivilent of the --all-extras
flag though, so if we want everything we have to either do uv sync --group cdk --group production
or add everything to our default groups and exclude the ones we don't want when we don't want them. I think what I've done here is probably reasonable.
The dev-dependencies
feature that was current when I wrote this PR ~a week ago is now "legacy"
https://docs.astral.sh/uv/concepts/dependencies/#legacy-dev-dependencies
I think this is a pretty good microcosm of what we're getting involved in here. Both in terms of the rough and the smooth (or the ruff and the smooth, if you prefer :badpundog: ).
On the one hand, uv is moving quickly 🚀 A feature we wanted ended up in a release within a week or so and it was already something they were looking at before they triaged the issue.
On the other hand, uv is moving quickly 😱 A feature that was current a week ago is now considered legacy as of a patch release and there are several overlapping ways to do quite similar things.
I suspect this will all shake out by the time they're offering any sort of stability but I reckon there will be some more churn between now and then. I will also note that this whole thing is basically something poetry went through over a longer timeframe before they really got it right.
..anyhooo, in c5b78be I have
- said we need uv>=0.4.27
- converted our
optional-dependencies
anddev-dependencies
- updated the relevant
uv sync
commands
.circleci/config.yml
Outdated
@@ -85,6 +79,7 @@ jobs: | |||
name: Submit coverage | |||
command: | | |||
. .venv/bin/activate | |||
uv pip install coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered moving this to a package in our manifest, but I can't really be bothered. I'm not sure how to meaningfully review a bump, for example.
Lets just stick with installing the latest version in CI and deal with it if it breaks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be collapsed into one line with uv run --with coveralls coverage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well uv run --with coveralls coveralls
worked, but first I copied and pasted what you said and then spent ages being confused about why the coverage report wasn't submitting :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deployscripts/install_uv.sh
Outdated
#!/usr/bin/env bash | ||
set -xeE | ||
|
||
UV_CONSTRAINT=">=0.4.0,<0.5.0" | ||
|
||
if [ "$CI" = "true" ]; then | ||
pip install uv"$UV_CONSTRAINT" | ||
else | ||
sudo PIP_BREAK_SYSTEM_PACKAGES=1 pip install uv"$UV_CONSTRAINT" | ||
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a couple of things to talk about in this file:
-
Based on my experience with ruff, I think we probably don't want to just always install the latest uv, at least at this stage. Pinning a constraint and accepting we're going to need to bump it is probably a better call while this is still zero-versioned and offers no stability guarantees IMO.
>=0.4.0,<0.5.0
seems pragmatic at this stage. Simultaneously, we'll probably want to bump it every so often. That's one of the costs we absorb of being early adopters. I abstracted this into this slightly clumsy script which I am calling in lots of places so that we've only got one place we have to bump the constraint. This is the reason why I'm not using the astral-sh/setup-uv action in.github/workflows/uv-lock.yml
. -
One of the problems I was trying to solve here is that on the EC2 instances I wanted to install uv in such a way that regardless of whether we
aws ec2-instance-connect ssh --os-user every_election
oraws ec2-instance-connect ssh --os-user ubuntu
, we've gotuv
installed and on the path. The simplest way to do that was smoosh it into the global python env with pip, rather than curl-bash it. There's more detail on this in the commit message for ea89bad
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I commented on commits as I went so missed some refactors that went on later. I've left these comments in case they're useful, but I think you can ignore some due to you having had the same thought as me :)
.circleci/config.yml
Outdated
@@ -23,7 +23,7 @@ jobs: | |||
node-version: '18' | |||
|
|||
- restore_cache: | |||
key: v3-dependencies-{{ checksum "requirements/base.txt" }}-{{ checksum "requirements/testing.txt" }}-{{ checksum "requirements/cdk.txt" }} | |||
key: v3-dependencies-{{ checksum "pyproject.toml" }}-{{ checksum "uv.lock" }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is interesting. I know we're pinning exact versions in pyproject.toml
now, but wouldn't the lock file be updated in this case too? What's your thinking in using both files for the cache key here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah good question. I went back and forth on this in my own head.
My thinking on this was that because dependabot can't bump uv.lock
it will try to push commits that change pyproject.toml
but don't change the lockfile this might be sensible. Maybe it is irrelevant though. uv lock --locked
should ensure we just fail the build early doors on those commits anyway.
I've removed it in 46d6de1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the only other consideration here is the python version. At the moment we tend to pump the v[n]
prefix when we bump python versions, but including pyproject.toml
would mean we bust the cash a lot more often. Not a big deal, I think what you currently have is reasonable 👍
.circleci/config.yml
Outdated
@@ -35,22 +35,17 @@ jobs: | |||
name: Install app dependencies | |||
command: | | |||
sudo apt update && sudo apt install -y gdal-bin python3-gdal python3-dev | |||
pyenv local $(ls ~/.pyenv/versions/) | |||
python -m venv .venv | |||
curl -LsSf https://astral.sh/uv/0.4.23/install.sh | sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can set UV_INSTALL_DIR=$HOME/.local/bin/
before sh
(after the pipe) here, and that saves modifying the profile below.
Two more things here:
-
It would be nice to cache this to save a download each time. This is partly for speed of build, but also to reduce the amount of things that can break and stop our builds (although I know it redirects to GitHub at the moment, so it's not a major problem).
I think something like this might do it?
if [ ! -f "$HOME/.local/bin/uv" ]; then curl -LsSf https://astral.sh/uv/0.4.23/install.sh | UV_INSTALL_DIR=$HOME/.local/bin/uv sh fi
This would also need
save_cache
/restore_cache
for the binary. -
Can we parameterize the version at all? Thinking for projects with more than one step where we might want to have this block a few times. This might not be needed if we can cache the binary and restore it for each job.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I think so too, but I kept it in on the off-chance you'd not need some of the env vars / flags I pointed out 👍
.circleci/config.yml
Outdated
curl -LsSf https://astral.sh/uv/0.4.23/install.sh | sh | ||
echo 'export PATH=$HOME/.cargo/bin:$PATH' >> $BASH_ENV | ||
source $BASH_ENV | ||
uv venv --python 3.12 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two things:
- This will download and install the python version if it's not found on the system. Neat locally, but not what we want in CI (easy to miss that UV is doing this for us!). Can you add
--no-python-downloads
or exportUV_PYTHON_DOWNLOADS=never
? - Do we actually need to specify the python version here, if we have it in
pyproject.toml
/.python-version
? And we're already using thecimg
for the same version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we actually have opposite objectives here.
My thinking here was: Previously we were doing all sorts of pyenv faffing to make sure we were using the right global python version:
EveryElection/.circleci/config.yml
Line 38 in 0decd2d
pyenv local $(ls ~/.pyenv/versions/) EveryElection/.circleci/config.yml
Line 114 in 0decd2d
pyenv local $(ls -1 /opt/circleci/.pyenv/versions | grep 3.12) EveryElection/.circleci/config.yml
Line 163 in 0decd2d
pyenv local $(ls -1 /opt/circleci/.pyenv/versions | grep 3.12) EveryElection/.circleci/config.yml
Line 199 in 0decd2d
pyenv local $(ls ~/.pyenv/versions/)
I did at one point work out what we were trying to do there, but I've forgotten again.
My thinking here is: Lets bin all that and invoke uv venv --python 3.12
. If uv can use a version of python that is already installed, cool. If it needs to go and download one, also fine. Either way, we're running the right version in our venv. That is one of the core value propositions of UV imo. Tbh we could even consider switching the base image for the build to one that doesn't have python on it at all.
I don't mind trying what you suggested, but also I'd like to understand what advantage that gives us, or what problem what I've suggested causes.
Happy to just talk about this one on a call or whatever.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I basically want to ensure we don't drift between [Circle]CI image version and the Python version in the venv.
At the moment the build isn't going to fail if we use cimg/python3.8
and uv venv --python 3.12
, but it will download a python (maybe for every CI run, depending on caching). If someone doesn't know this they might see the image version and assume that's what we're running. When we move to Python3.13 we might just bump the image, etc, etc.
I would like to be in a position where we either only specify the python version in one place (uv
) or where we prevent uv
from installing a version of Python that's not installed.
The pyenv
faffing was due to Pyenv being too picky. If 3.12
is in python-version
but you have 3.12.4
installed, it complains. You can't (or couldn't at the time) specify 3.12.*
or anything like that. I don't think uv
has this "feature" so it's moot here.
.circleci/config.yml
Outdated
@@ -85,6 +79,7 @@ jobs: | |||
name: Submit coverage | |||
command: | | |||
. .venv/bin/activate | |||
uv pip install coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be collapsed into one line with uv run --with coveralls coverage
@@ -60,7 +55,7 @@ jobs: | |||
paths: | |||
- ./.venv | |||
- ./node_modules |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding $HOME/.cache/uv
here will mean that uv
doesn't need to re-download packages between runs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in 9d49161
@@ -1,7 +1,7 @@ | |||
name: AddUser | |||
description: Adds a system user | |||
schemaVersion: 1.0 | |||
component_version: 0.0.4 | |||
component_version: 0.0.5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What changed here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing.
When I pushed stuff to the dev AWS account to test it, the image builder threw errors saying there was already a different 0.0.4 of cdk_stacks/components/add_user.yml
and already a different 0.0.2 of cdk_stacks/components/instance_connect.yml
, so that's why I've bumped those two despite there being no other changes in those files.
I don't really know what a good pattern is for managing this stuff, but given I don't think there's really any downside to bumping a version even if nothing changed, keeping them aligned across AWS accounts seems sensible to me.
Right. I've pushed some commits and replied to some comments.. back to you |
Great, I've replied where I think it might be useful, but basically this is good to go. I can see we're going to need to stay on top of Would you mind writing a "how to even UV at DC" document, where you pull out a load of these decisions? I'd like a document we can use when working across DC that explains the state of play and how we're using it. Does that sound useful to you? |
710694c
to
165b0fd
Compare
Refs
The core point of this PR is to migrate us from pip to uv.
However, I have scope-creeped (scope-crept?) this a bit to cover several other related tasks which made sense to address at the same time. So in reality, the things happening in this PR are:
1. UV
Migrate from pip to UV, including related CI and deployment changes.
2. Audit packages
Instead of just blindly re-creating our existing pip setup, I've taken the opportunity to do a bit of an audit of our packages. This includes a bit of fiddling with the optional groups we have but more importantly I have removed a bunch of packages that we no longer use, no longer need or were there to ensure we install a compatible version of a second-order package (which is now better achieved by uv's resolver/lockfile).
As we go through this process on other repos I would propose we do the same. Writing a new manifest seems like a good opportunity to review what we're packages we are pulling in and why.
Most of the action on this is in 6d5d0d1 and 7ad181a
3. Dependabot
I've done a bit of a review of our dependabot config
Because dependabot doesn't support uv.lock, I've written a GitHub action which will run
uv lock
and push the lockfile on any PRs submitted by dependabot.I don't really know how this is going to work out yet.
I do know this will come with the following tradeoffs:
@dependabot rebase
won't work because every PR will contain additional commits not by dependabot. I'm hoping we can rely on@dependabot recreate
I don't really know how this is going to work out in practice - are there also other problems I haven't thought of?
I think my proposal for this is: I've intentionally left some packages out of date in this PR. I reckon we merge this, and then I'll try updating them with dependabot. We'll see if what I've done works, what the workflow looks like, where the pain points are, and find out what I haven't considered. Ideally I would quite like it if we could avoid having to migrate to renovate immediately on every repo where we adopt this. Renovate might solve several other problems for us, but being able to keep the scope of each PR to just "migrate from one package manager to another" to start with would be nice.
Final remarks:
Although one of the things we're doing here is doing this work for EE, this also hopefully lays down a template for how we're going to do it elsewhere. So it seems like a good place to bikeshed the details. Lets nail the blueprint here before we do it everywhere.
I've tried to break this down into sensible commits to review so you can see what I'm doing at each stage. There's a little bit of going round the houses on how to install UV where I did one thing and then another. In general, going commit by commit is probably helpful though.