GitSubmodules
The Open MPI project started using Git submodules in early 2020.
This allows us to import code from remote Git repositories without carrying it directly in our own repository. There are a number of benefits to this; Google around for expositions on why submodules are Good Things.
There are generally two ways we use submodules in Open MPI:
- To track a specific commit (e.g., a tag) in a remote repository.
  - Such tags generally refer to a specific release.
  - E.g., the hwloc-2.0.1 tag in the hwloc repository.
- To track a specific branch in a remote repository.
  - This allows us to keep up with development of a remote project.
The first use -- tracking a specific commit (usually a tag) -- is more common, because we tend to want stability when importing remote projects.
When using Git submodules, there are a few differences to "traditional" (i.e., not-submodule-related) Git usage. It's easiest to talk about them in terms of specific use cases:
- Initial clone of the Open MPI repository
- Git updating the Open MPI repository
- Helpful day-to-day tips
- Adding a new submodule pointing to a specific commit
- Updating the commit that a submodule refers to
- Updating along a branch that a submodule refers to
Now that we use submodules, it is not sufficient to simply git clone <OPEN_MPI_GIT_URL_REPO>. Instead, you must add --recursive to your clone command:
$ git clone --recursive git@github.com:open-mpi/ompi.git
# or
$ git clone --recursive https://github.com/open-mpi/ompi.git
If you already cloned the Open MPI repository and didn't use --recursive, you can initialize and download all submodules like this:
$ git submodule update --init --recursive
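NOTE: This checks out, in each of Open MPI's submodules, the commit that the Open MPI repository currently records for it (it does not move to the tip of the upstream branch unless you add --remote). If you have local changes in a submodule that you do not want to lose, do not use this.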
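To see the effect without touching the network or a real Open MPI checkout, here is a minimal sandbox sketch. The repository names (`lib`, `super`, `plain`) are invented for illustration, and `protocol.file.allow=always` is passed only because local paths stand in for remote URLs (recent Git versions refuse file-based submodule clones by default).

```shell
#!/bin/sh
# Sandbox sketch: every repository name and path below is made up.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Identity used only for the throwaway commits in this sandbox.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A tiny stand-in for a remote project that will become a submodule.
git init -q lib
echo 'hello' > lib/README
git -C lib add README
git -C lib commit -q -m 'lib: initial commit'

# A parent repository that tracks "lib" as a submodule.
git init -q super
git -C super -c protocol.file.allow=always \
    submodule --quiet add "$sandbox/lib" lib
git -C super commit -q -m 'super: add lib submodule'

# A plain clone (no --recursive) leaves the submodule directory empty.
git clone -q super plain
plain_before=$(ls -A plain/lib)

# Initialize and populate the submodule after the fact.
git -C plain -c protocol.file.allow=always \
    submodule --quiet update --init --recursive
plain_after=$(cat plain/lib/README)

echo "before: '$plain_before'  after: '$plain_after'"
```

The first listing of `plain/lib` is empty because a non-recursive clone only records which commit the submodule should be at; the files appear once `git submodule update --init --recursive` has run.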
If you are switching back and forth between the 5.0.x and main branches, make sure to resync your submodules:
$ git submodule sync
Note that git pull ... is no longer sufficient to update your entire Open MPI tree. It will still update everything inside the Open MPI repository itself, but it will not (by default) update the contents of the submodules.
You have a few options:
- Use --recurse-submodules when pulling:
$ git pull --recurse-submodules
- Run git submodule update after a plain git pull:
$ git submodule update --init --recursive
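NOTE: This checks out, in each of Open MPI's submodules, the commit that the Open MPI repository currently records for it (it does not move to the tip of the upstream branch unless you add --remote). If you have local changes in a submodule that you do not want to lose, do not use this.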
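Before choosing either option, it can help to see which submodules are actually out of step. The sketch below is one possible helper (the function name `out_of_sync_submodules` is invented here); it relies only on the documented prefix characters of `git submodule status` and assumes submodule paths contain no whitespace.

```shell
#!/bin/sh
# Filter for `git submodule status` output: print the path of every
# submodule whose status line does not start with a space.
#   '-' : submodule is not initialized
#   '+' : checked-out commit differs from the one recorded in the parent
#   'U' : submodule has merge conflicts
# Assumption: submodule paths contain no whitespace.
out_of_sync_submodules() {
    awk 'substr($0, 1, 1) != " " { print $2 }'
}

# Run the check only when invoked from inside a Git working tree.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    git submodule status --recursive | out_of_sync_submodules
fi
```

If this prints nothing, every submodule is already at its recorded commit and no update is needed.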
You may find it useful to set the following two Git config variables:
$ git config --global diff.submodule log
$ git config --global status.submodulesummary 1
These will show you a bit more information about submodules in git diff (and other diff-producing commands such as git log -p) and git status outputs, respectively.
Additionally, if you're lazy (like me), you may wish to make some Git aliases to include "submodule" variants of commands, such as:
$ head -n 3 $HOME/.gitconfig
[alias]
prrs = pull --rebase --recurse-submodules
cloner = clone --recursive
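The same aliases can be created with `git config` instead of editing the file by hand. This is a minimal sketch that writes to a scratch file so that nothing in your real configuration changes; replacing `--file "$cfg"` with `--global` would apply the aliases to `$HOME/.gitconfig`.

```shell
#!/bin/sh
# Sketch: store the aliases in a throwaway config file.
set -e
cfg=$(mktemp)

git config --file "$cfg" alias.prrs 'pull --rebase --recurse-submodules'
git config --file "$cfg" alias.cloner 'clone --recursive'

# Read the aliases back to confirm what was stored.
prrs=$(git config --file "$cfg" --get alias.prrs)
cloner=$(git config --file "$cfg" --get alias.cloner)
echo "prrs   = $prrs"
echo "cloner = $cloner"
```

Once the aliases are in your global configuration, `git prrs` and `git cloner <URL>` behave like the longer commands they expand to.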
If you need to track a new Git submodule, keep in mind the following:
- If the submodule is part of an MCA component, make the submodule a subdirectory in the component (e.g., see the current hwloc component in the opal/mca/hwloc/hwloc2 tree).
- Use a public, widely-available URL for the target Git repository.
  - For example, for GitHub repositories, use the HTTPS version of the URL (not the SSH form, because not everyone has SSH keys set up on GitHub).
Here are the commands to run. The latter half are almost identical to the steps you follow when updating the commit that a submodule points to:
$ cd PATH_TO_OPEN_MPI_GIT_REPOSITORY
# Cd to the directory where the submodule will live
$ cd opal/mca/foo/bar50x
# Add the submodule, giving it a reasonable name
$ git submodule add --name bar-50x \
https://github.com/open-mpi/bar.git
# Then cd into the new submodule (named "bar" after the repository)
# and check out the specific desired commit (e.g., tag)
$ cd bar
$ git checkout v5.0.3
# Now cd back into the main Open MPI repository
$ cd ..
# Make a branch in the Open MPI repo
# (because this will turn into a pull request)
$ git checkout -b pr/add-submodule-for-foo-bar-v5.0.3
# Git add the "bar" dir to record the new commit you checked out
$ git add bar
$ git commit -s -m 'bar: Add submodule to bar v5.0.3 tag'
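The steps above can be rehearsed in a sandbox before doing them for real. The sketch below uses invented local repositories (`bar` as the stand-in remote, `parent` as the stand-in for the Open MPI tree) and passes `protocol.file.allow=always` only because the "remote" is a local path. It ends by checking that the commit recorded in the parent matches the tag.

```shell
#!/bin/sh
# Sandbox sketch of adding a submodule pinned to a tag.
# "bar", "parent" and the tag are stand-ins invented for this example.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in remote project: a tagged release followed by one more commit.
git init -q bar
echo 'release' > bar/VERSION
git -C bar add VERSION
git -C bar commit -q -m 'bar: release'
git -C bar tag v5.0.3
echo 'development' > bar/VERSION
git -C bar commit -q -a -m 'bar: post-release work'

# Stand-in parent repository with the component directory.
git init -q parent
mkdir -p parent/opal/mca/foo/bar50x
cd parent/opal/mca/foo/bar50x
git -c protocol.file.allow=always \
    submodule --quiet add --name bar-50x "$sandbox/bar"

# Pin the submodule to the tag, then record that commit in the parent.
git -C bar checkout -q v5.0.3
git add bar
git commit -q -m 'bar: Add submodule to bar v5.0.3 tag'

# The commit recorded in the parent should equal the tag's commit.
recorded=$(git rev-parse HEAD:opal/mca/foo/bar50x/bar)
tagged=$(git -C bar rev-parse 'v5.0.3^{commit}')
echo "recorded=$recorded"
echo "tagged=$tagged"
```

Comparing `git rev-parse HEAD:<submodule path>` against the tag is a quick way to confirm, before opening the pull request, that the parent points at the release rather than at whatever the remote's default branch happened to be.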
Let's consider a concrete case in updating a submodule to point to a new commit: let's update the OPAL hwloc component to point to a new hwloc Git tag (e.g., hwloc had a new release and we want to move our submodule to point to the tag of that release):
$ cd PATH_TO_OPEN_MPI_GIT_REPOSITORY
# Change into the directory of the submodule
$ cd opal/mca/hwloc/hwloc2/hwloc
# Fetch from upstream so that the new tag is available locally
$ git fetch
# Check out the new commit (e.g., tag) that you want
$ git checkout hwloc-2.0.2
# Now cd back into the main Open MPI repository
$ cd ..
# Make a branch in the Open MPI repo
# (because this will turn into a pull request)
$ git checkout -b pr/update-hwloc-to-2.0.2
# Git add the "hwloc" dir to record the new commit you checked out
$ git add hwloc
$ git commit -s -m 'hwloc: Update submodule to hwloc-2.0.2 tag'
Then push the branch (pr/update-hwloc-to-2.0.2) to GitHub and make a pull request, as normal.
When that PR is merged, others will use the methods above to update their local submodule pointers to point to the change you just made.
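What this looks like from the other developers' side can be sketched in a sandbox. All names here (`upstream`, `parent`, `consumer`, `dep`, the `rel-*` tags, and the `sgit` helper) are invented for illustration; `protocol.file.allow=always` is needed only because local paths stand in for remote URLs.

```shell
#!/bin/sh
# Sandbox sketch: a maintainer bumps a submodule, a consumer catches up.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Helper: git with file transport allowed for submodule operations.
sgit() { git -c protocol.file.allow=always "$@"; }

# Upstream stand-in with two tagged releases.
git init -q upstream
for v in 2.0.1 2.0.2; do
    echo "$v" > upstream/VERSION
    git -C upstream add VERSION
    git -C upstream commit -q -m "release $v"
    git -C upstream tag "rel-$v"
done

# Parent repository pinned to the first release.
git init -q parent
sgit -C parent submodule --quiet add "$sandbox/upstream" dep
git -C parent/dep checkout -q rel-2.0.1
git -C parent add dep
git -C parent commit -q -m 'dep: Add submodule at rel-2.0.1'

# A second developer clones the parent, submodules included.
sgit clone -q --recursive parent consumer

# The maintainer moves the submodule to the new tag and commits.
git -C parent/dep checkout -q rel-2.0.2
git -C parent add dep
git -C parent commit -q -m 'dep: Update submodule to rel-2.0.2 tag'

# A plain pull moves the recorded pointer but not the submodule checkout.
git -C consumer pull -q --ff-only --no-recurse-submodules
before=$(git -C consumer/dep describe --tags)

# Updating the submodules brings the checkout in line with the pointer.
sgit -C consumer submodule --quiet update --init --recursive
after=$(git -C consumer/dep describe --tags)
echo "before=$before after=$after"
```

The `before` value shows why a plain `git pull` is not enough: the parent repository already records the new tag, but the consumer's submodule working tree stays on the old one until the submodules are updated.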
git grep string_to_grep_for is very handy for looking up strings that may appear either in source code (which is visible to many source code browsers) or in other non-source files present in the repository.
To recurse a git grep search through the submodules of the repo, do this:
$ git grep --recurse-submodules string_to_grep_for
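The difference between the two forms is easy to demonstrate in a sandbox. In this sketch the repositories (`sub`, `parent`) and file names are invented, and `protocol.file.allow=always` is needed only because the submodule is added from a local path.

```shell
#!/bin/sh
# Sandbox sketch: git grep with and without --recurse-submodules.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in submodule project containing the search string.
git init -q sub
echo 'needle in the submodule' > sub/lib.txt
git -C sub add lib.txt
git -C sub commit -q -m 'sub: initial commit'

# Parent repository that also contains the search string.
git init -q parent
echo 'needle in the parent' > parent/main.txt
git -C parent add main.txt
git -C parent -c protocol.file.allow=always \
    submodule --quiet add "$sandbox/sub" sub
git -C parent commit -q -m 'parent: add file and submodule'

cd parent
# -l lists only the names of the files that match.
plain=$(git grep -l needle)
recursive=$(git grep -l --recurse-submodules needle)

echo "without --recurse-submodules:"
echo "$plain"
echo "with --recurse-submodules:"
echo "$recursive"
```

Only the recursive form reports the match inside `sub/`, which is why the plain form can miss strings that live in imported projects.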
Updating along a branch that a submodule refers to: ...to be written...