Generalize integrity checking and implement it for git origins #210

Merged: 8 commits, Sep 28, 2019

@mosteo (Member) commented Sep 26, 2019

This PR generalizes the archive-hash into origin-hashes and implements integrity checking for git origins. Functional changes are:

  • Use origin-hashes, which applies to all origins, instead of archive-hash, which only applied to archive sources.
  • Use a vector instead of a single value to allow supplying more than one hash (and verify with all that are given for a release).
  • Use the git archive | shaXsum trick for platform-independent verification of git remotes.
  • Use coreutils shaXsum (since we rely on it for git) also for source archives.

Unless I am mistaken, this covers the basic integrity verification that we wanted for the beta. I added hg/svn to #198 to track their progress. I also created #208 and #209 to track related improvements.
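
For readers unfamiliar with the `git archive | shaXsum` idea mentioned in the description, here is a minimal Python sketch of the same pipeline. It is not the Ada code from this PR: `hash_command_output` and `hash_git_head` are hypothetical helpers, and the `git` arguments are copied from the command string quoted later in this review. Disabling `core.autocrlf` is presumably meant to keep line-ending conversion settings from changing the archived bytes.

```python
import hashlib
import subprocess


def hash_command_output(argv, cwd=None, chunk_size=65536):
    """Run argv and return the SHA-512 hex digest of its standard output."""
    digest = hashlib.sha512()
    with subprocess.Popen(argv, cwd=cwd, stdout=subprocess.PIPE) as proc:
        # Feed stdout to the hash incrementally so a large archive is
        # never held in memory all at once.
        for chunk in iter(lambda: proc.stdout.read(chunk_size), b""):
            digest.update(chunk)
    # Leaving the "with" block waits for the process, so returncode is set.
    if proc.returncode != 0:
        raise RuntimeError(f"{argv[0]} exited with code {proc.returncode}")
    return digest.hexdigest()


def hash_git_head(repo_dir):
    """Hash the tar stream that `git archive HEAD` produces for repo_dir."""
    return hash_command_output(
        ["git", "-c", "core.autocrlf=false", "archive", "HEAD"],
        cwd=repo_dir,
    )
```

Calling `hash_git_head(".")` inside a checkout would return the digest to compare against the value stored in the index.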

The verification of downloaded info is generalized for all origins,
and also we allow having more than a single hash.
* ``archive-hash``: mandatory string for source archives. A "kind:digest" field
that specifies a hash kind and its value. The only accepted kind is SHA512 at
this time.
* ``origin-hashes``: mandatory string for source archives, optional for git
Member

I don't think it should be optional for git.

Member Author

In accordance with the comment in #210. OK.

Member Author

I have made it a warning for now, with the idea of turning it into an error once the current index is completely updated.

this time.
* ``origin-hashes``: mandatory string for source archives, optional for git
origins. An array of "kind:digest" fields that specify a hash kind and its
value. Kinds accepted are sha256, sha384, sha512.
Member

Why accept multiple kinds? Let's take the most secure.

Member Author

So only sha512 it is.

Member Author

Fixed in latest push
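
To make the outcome of this thread concrete, the following Python sketch parses one "kind:digest" entry and accepts only sha512. It illustrates the format described in the quoted documentation and is not Alire's parser. `parse_origin_hash` and `ACCEPTED_KINDS` are hypothetical names, and the hex encoding, fixed digest length, and case-insensitive matching are assumptions rather than rules stated in the PR.

```python
import string

# Assumption: digests are hex-encoded, so a SHA-512 digest has 128 characters.
ACCEPTED_KINDS = {"sha512": 128}


def parse_origin_hash(entry):
    """Split a 'kind:digest' string and validate it, returning (kind, digest)."""
    kind, sep, digest = entry.partition(":")
    if not sep:
        raise ValueError(f"expected 'kind:digest', got {entry!r}")
    kind = kind.strip().lower()
    digest = digest.strip().lower()
    if kind not in ACCEPTED_KINDS:
        raise ValueError(f"unsupported hash kind {kind!r}")
    if len(digest) != ACCEPTED_KINDS[kind]:
        raise ValueError(f"{kind} digest must have {ACCEPTED_KINDS[kind]} hex characters")
    if any(char not in string.hexdigits for char in digest):
        raise ValueError("digest contains non-hexadecimal characters")
    return kind, digest
```

Keeping the accepted kinds in a table means that adding another kind later only requires a new entry, which fits the choice to keep `origin-hashes` an array.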

   Exit_Code : Integer;
begin
   OS_Lib.Subprocess.Raw_Spawn
     (Program => "sh",
Member

It would be better to use the hash functions provided in libGNAT. There would be less dependency on external tools.

Member Author

I thought we wanted this for simplicity. Will revert the change.

Member Author

Fixed in latest push.

   Output    : Utils.String_Vector;
   Exit_Code : Integer;
begin
   Raw_Spawn (Program => "sh",
Member

Why use sh here? Is it possible to spawn git directly?

Member Author

I don't know a way to pipe stdout of a subprocess to stdin with GNAT.Os_Lib, but I might have missed it. Do you know if GNAT has portable support for some kind of pipes?

Alternatively, we can use a temporary file.

Member Author

Done with temporary file (commit 7f10d7c).
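
The temporary-file alternative discussed here can be sketched as follows, again in Python rather than the project's Ada and without claiming to mirror commit 7f10d7c. The command is spawned directly with its standard output redirected to a temporary file, and the file is hashed afterwards, so no shell is needed to set up a pipe. `hash_stdout_via_tempfile` is a hypothetical name.

```python
import hashlib
import subprocess
import tempfile


def hash_stdout_via_tempfile(argv, cwd=None, chunk_size=65536):
    """Run argv with stdout redirected to a temp file, then hash that file."""
    with tempfile.TemporaryFile() as tmp:
        # No shell is involved: the program is started directly and
        # writes its output straight into the temporary file.
        subprocess.run(argv, cwd=cwd, stdout=tmp, check=True)
        tmp.seek(0)  # rewind before reading back what the child wrote
        digest = hashlib.sha512()
        for chunk in iter(lambda: tmp.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For a git origin, the call would look like `hash_stdout_via_tempfile(["git", "-c", "core.autocrlf=false", "archive", "HEAD"], cwd=repo_dir)`. This trades some disk I/O for not depending on `sh`.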

.Empty_Vector
.Append ("-c")
.Append ("git -c core.autocrlf=false archive HEAD | "
         & Hasher & " | cut -f1 -d\ "),
Member

Like above, we should use libGNAT to produce the hashes.

Member Author

Reverted.

   Archive_Name : constant String := This.Base.Archive_Name;
   Archive_File : constant String := Dirs.Compose (Folder, Archive_Name);
begin
   return Hashes.Digest (Hashes.Compute.Hash_File (Kind, Archive_File));
Member

Just to be sure, is this doing a hash of the archive before it is extracted?

Member Author

Yes
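
The ordering confirmed in this thread (hash the downloaded archive file first, extract afterwards) might look like the Python sketch below. It only illustrates that ordering and is not Alire's deployer. `sha512_of_file` and `verify_then_unpack` are hypothetical names, and unpacking with `shutil.unpack_archive` is an assumption made for the example.

```python
import hashlib
import shutil


def sha512_of_file(path, chunk_size=65536):
    """Return the SHA-512 hex digest of the file at path, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_then_unpack(archive_path, expected_hex, dest_dir):
    """Unpack archive_path into dest_dir only if its digest matches."""
    actual = sha512_of_file(archive_path)
    if actual != expected_hex.strip().lower():
        # Fail before touching dest_dir: nothing from an unverified
        # archive is ever written to disk.
        raise ValueError(f"sha512 mismatch for {archive_path}: got {actual}")
    shutil.unpack_archive(archive_path, dest_dir)
```

Hashing the archive rather than the extracted tree keeps the check independent of how a given platform's extraction tool writes files.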

src/alire/alire-origins-deployers.ads (resolved)
Being subtypes of String made it too easy to mix them up
And refactored Source_Archive to fit in
Using GNAT.OS_Lib.Argument_String_To_List seems to fail on strings
with pipes, so we need a way of directly passing such arguments to
an OS shell
This way we can test integrity verification of VCSs without needing
internet connectivity.
end if;

-- Compute hashes from downloaded release and verify:
for Index_Hash of This.Base.Data.Hashes loop
Member

So if multiple hashes are provided, all of them must be correct.

Member Author

Yes. We only support one now, but I think it may come in handy for future-proofing.
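
The "all of them must be correct" rule from this exchange can be sketched in Python as a loop over every index entry that collects the ones that fail. This is a hedged illustration of the loop in the hunk above, not a translation of it. `mismatched_hashes` is a hypothetical name, and entries are assumed to be hex-encoded "kind:digest" strings whose kind is a name that `hashlib` recognises.

```python
import hashlib


def mismatched_hashes(data, index_hashes):
    """Return the 'kind:digest' entries that do NOT match data.

    Verification succeeds only when the returned list is empty, i.e.
    every hash supplied for the release is correct.
    """
    failures = []
    for entry in index_hashes:
        kind, _, expected = entry.partition(":")
        # hashlib.new raises ValueError for a kind it does not know.
        actual = hashlib.new(kind.strip().lower(), data).hexdigest()
        if actual != expected.strip().lower():
            failures.append(entry)
    return failures
```

Note that an empty `index_hashes` passes vacuously, so a caller still has to decide separately whether a release without hashes is a warning or an error.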

@Fabien-Chouteau (Member) left a comment

Nice job @mosteo, this is a big improvement.

mosteo added a commit to alire-project/alire-index that referenced this pull request Sep 28, 2019
We still need this to merge alire-project/alire#210 and then we can add the missing hashes progressively.
@mosteo merged commit 7228190 into master Sep 28, 2019
@mosteo deleted the feat/hash branch September 28, 2019 10:19
2 participants