-
Notifications
You must be signed in to change notification settings - Fork 145
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
soter: Fix misuse of OpenSSL API for EC key generation (and more) #875
Conversation
Turns out, Soter generates EC keys incorrectly, by misusing OpenSSL API. It somehow works out for OpenSSL 1.1, but in OpenSSL 3.0 this fails. I suspect the issue stems from a misconception that EVP_PKEY_CTX is a container for generated keypair as EVP_PKEY. It is not. While it's true that EVP_PKEY_CTX_new() accepts an EVP_PKEY which can then be retrived using EVP_PKEY_CTX_get0_pkey() -- this is not a keypair. For EC keys this EVP_PKEY is "key parameters" (the curve, etc.) Generated keypair is output into a *different* EVP_PKEY, and it should be generated with a different function from that Soter uses. The same is mostly true for RSA as well. While it *happens so* that generated RSA key gets placed into EVP_PKEY_CTX, it should be a separate EVP_PKEY object. It works out only because RSA does not have parameters. Fixing this thing is going to be involved. The first bunch of patches only divorces Soter from the notion that EVP_PKEY_CTX is a container for keypairs and introduces EVP_PKEY fields for storing generated keypairs, independently from EVP_PKEY_CTX used to generate them. This patch in particular takes care of the sign/verify context. First we just add the field, initialize it constructors, and free it in destructors of soter_sign_ctx_t.
Instead of throwing away the EVP_PKEY we have imported, keep it around in our context. EVP_DigestVerifyInit grabs its own reference.
Instead of retrieving the EVP_PKEY from EVP_PKEY_CTX, use the one we have retained in our context.
In fact, don't construct it at all. We still have a field that we need to initialize. (Well, actually we don't since it's already NULL because the entire struct is allocated with calloc(), but hey, consistency.)
Now that EVP_PKEY_CTX is gone from there, we no longer need this label.
soter_verify_init_ecdsa_none_pkcs8() simply overwrites previous values of EVP_PKEY and EVP_PKEY_CTX with new ones, without freeing whatever was there. This is because it's expected to be called only one on a newly allocated instance of soter_sign_ctx_t. Make this expecatation explicit.
Moar defensive coding here. All other functions check "ctx != NULL" and this one dereferences this pointer in the first line.
As a couresy, mostly. OpenSSL will still fail with such values, but we can return SOTER_INVALID_PARAMETER instead. (Though, I belive that all upstream callers of these functions already check for these. If so, the compiler would be free to optimize those checks during inlining.)
Unfortunately, soter_sign_ctx_t is reused for both ECDSA and RSA signatures, leading to a possibility of mismatch. I believe that upstream callers already check that keys used are consistent with signature type in Soter containers, but let's be double-sure.
Well, EC code path already does this correctly so we only simplify the code here. EVP_DigestVerifyFinal() returns 1 for successful signature check, 0 for failed signature check, or a negative value for "a more serious error". This switch handles it correctly, but we can simply check for 1.
While EVP_DigestVerifyInit() and EVP_DigestVerifyUpdate() are not documented to return values other that 1 or 0, let's check for 1 specifically, consistent with EVP_DigestVerifyFinal() which *is* documented to return negative values for errors.
Same thing as in EC code paths, instead of throwing it away, retain the EVP_PKEY we have imported. The following commits too mirror the ones from the EC code path.
Similarly to EC code path, check that EVP_DigestVerifyFinal() is successfuly correcly: by comparing the result to 1. Note that the current check is incorrect and might be a source of security issues, as it treats some unsuccessful results incorrectly as successful.
Now dealing with the signing code path. This time starting with RSA because it's a bit easier. Retain that EVP_PKEY that we import from caller-provided key material, like we did for verify. But note that this does *not* strictly concern the generated key. (Though right now it's retained as well.)
Instead of using EVP_PKEY_CTX as a container for EVP_PKEY to be filled in during key generation, accept an out-parameter for EVP_PKEY, following the signature of EVP_PKEY_keygen() which is actually used for generating the key (and gets passed the out-parameter now). Currently the NULL checks we perform on dereferenced EVP_PKEY do not make much sense, but I'm retaining them until a later point where they could be safely removed.
Now that both key generation and importing place their EVP_PKEY into our context field directly, stop using the reference from EVP_PKEY_CTX.
EVP_PKEY_CTX is used only for key generation, there is no need to retain it afterwards. Pull it out into a local variable. Note that this means we'd have to free it on both the success and failure code paths now. Previously, the destructor did it for us.
Start preparing the key generation function soter_rsa_gen_key() to handle *all* work related to generating RSA keys, instead of accepting a partially configured EVP_PKEY_CTX and EVP_PKEY and then finishing key generation. I'll need this function to handle resource management correctly so start introducing the "goto err" pattern by making sure that returned status code is passed via a variable.
Start building the error handling code path here by introducing an "err" label to which we'll jump on error. "pub_exp" needs a special treatment here since it's passed to the EVP_PKEY_CTX. So if we fail to pass it then we free it, but otherwise we must NOT free it again on future error handling. Thus reset it to NULL (which is safe to pass to BN_free()).
Now that "goto err" path is in place, use that consistently instead of ad hoc calls to BN_free() and returns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great job and refactoring error handling to be more precise and correct.
but I don't understand what actually wrong with our code. we don't use empty uninitialized EVP_PKEY structures, we specify curves and params where it needs. But always extract keys from CTX structure, instead of direct usage. But CTX and assigned keys initialized properly, as I understand. Can you correct me or explain a bit more?
@@ -54,7 +54,7 @@ struct soter_asym_ka_type { | |||
}; | |||
|
|||
struct soter_sign_ctx_type { | |||
EVP_PKEY_CTX* pkey_ctx; | |||
EVP_PKEY* pkey; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correct me if I am wrong. as I understand, we stop storing EVP_KEY_CTX because if used only once for key generation and initialization and after that we don't use it directly, only to extract private/public key from this private structure. And now we store EVP_PKEY, which can store private and public keys, that are only used in all further function calls, yes?
and that is main reason, and we stop store more than we need data, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we stop storing EVP_KEY_CTX because if used only once for key generation and initialization and after that we don't use it directly
Yes.
And now we store EVP_PKEY, which can store private and public keys, that are only used in all further function calls, yes?
Correct.
and that is main reason, and we stop store more than we need data, correct?
Yes and no. We don't need to free EVP_PKEY_CTX right away but we don't have to retain it either. That said, it is a bit more heavy data structure than EVP_PKEY, but not very much.
As you said, all following methods on soter_sign_ctx_t
do not use EVP_PKEY_CTX. We only need it for key generation, which is performed exactly once on initialization (at least for soter_sign_ctx_t
). After that we don't need the context around, only the keypair (or the public key alone). It's not possible to, say, generate a new keypair once again, like soter_asym_ka_alg_t
allows. In that case it could be viable to retain a preconfigured EVP_PKEY_CTX that can be used for key generation again and again. However, even then I'd retain only generated parameters and create EVP_PKEY_CTX on demand.
The main motivation for removing EVP_PKEY_CTX is to prevent the flawed notion from spreading further. EVP_PKEY_CTX is not a general-purpose container for EVP_PKEY, but rather a transient helper. It matters how EVP_PKEY_CTX is created, what for it has been initialized, etc. It's much easier to check correctness of EVP_PKEY_CTX handling when it's constrained to one function, not retained in an object with multiple methods that can affect and reconfigure the context.
You see, OpenSSL is like ceremonial magic. You must follow the proper sequence to the letter. “I use only clean robes, utter the Duke's name loudly when needed, and always purify my salts at the Church instead of doing it myself. The ritual is performed properly, as I understand.” You mess up just one insignificant thing and invocation won't work. Or will appear to work but your soul will be forfeited in five years. And of course nobody is going to lay it out for you neatly or tell you what's wrong. You have to learn it all yourself, write it down in your grimoire—preferably in code—and then never share this dangerous knowledge with anyone else. Except maybe your disciple (of which you shall have no more than one), and only if he proves worthy. So... back to more serious mood. The key generation sequence was not really documented before OpenSSL 3.0 (and in OpenSSL 3.0 it lacks any detailed error strings except for return codes). But the core thing is that you're supposed to use separate EVP_PKEY_CTX instances for parameter generation and key generation. Moreover, instead of low-level API for EC key generation – EC_KEY_generate_key(), which got deprecated in OpenSSL 3.0 – one should use high-level API for key generation. The last point is not critical, but given that RSA code path uses the high-level API, I took the liberty of unifying the code paths. This is how the old code does it: #include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <openssl/ec.h>
#include <openssl/err.h>
#include <openssl/evp.h>
static void die(const char* fmt, ...)
{
va_list args;
va_start(args, fmt);
vfprintf(stderr, fmt, args);
va_end(args);
ERR_print_errors_fp(stderr);
exit(1);
}
int main()
{
EVP_PKEY* pkey = NULL;
EVP_PKEY_CTX* pkey_ctx = NULL;
EC_KEY* ec = NULL;
/*
* Parameter generation
*
* soter_sign_init_ecdsa_none_pkcs8()
*/
pkey = EVP_PKEY_new();
if (!pkey) {
die("[*] EVP_PKEY_new() failed\n");
}
if (!EVP_PKEY_set_type(pkey, EVP_PKEY_EC)) {
die("[*] EVP_PKEY_new_set_type() failed\n");
}
pkey_ctx = EVP_PKEY_CTX_new(pkey, NULL);
if (!pkey_ctx) {
die("[*] EVP_PKEY_CTX_new() failed\n");
}
if (!EVP_PKEY_paramgen_init(pkey_ctx)) {
die("[*] EVP_PKEY_paramgen_init() failed\n");
}
if (!EVP_PKEY_CTX_set_ec_paramgen_curve_nid(pkey_ctx, NID_X9_62_prime256v1)) {
die("[*] EVP_PKEY_CTX_set_ec_paramgen_curve_nid() failed\n");
}
if (!EVP_PKEY_paramgen(pkey_ctx, &pkey)) {
die("[*] EVP_PKEY_paramgen() failed\n");
}
/*
* Keypair generation
*
* soter_ec_gen_key()
*/
pkey = EVP_PKEY_CTX_get0_pkey(pkey_ctx);
if (!pkey) {
die("[*] EVP_PKEY_CTX_get0_pkey() failed\n");
}
if (EVP_PKEY_EC != EVP_PKEY_id(pkey)) {
die("[*] EVP_PKEY_id() failed\n");
}
ec = EVP_PKEY_get0(pkey);
if (NULL == ec) {
die("[*] EVP_PKEY_get0() failed\n");
}
if (1 != EC_KEY_generate_key(ec)) {
die("[*] EC_KEY_generate_key() failed\n");
}
fprintf(stderr, "[+] EC key generation successful!\n");
return 0;
} This is how the new code does it: #include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <openssl/ec.h>
#include <openssl/err.h>
#include <openssl/evp.h>
static void die(const char* fmt, ...)
{
va_list args;
va_start(args, fmt);
vfprintf(stderr, fmt, args);
va_end(args);
ERR_print_errors_fp(stderr);
exit(1);
}
int main()
{
EVP_PKEY_CTX* param_ctx = NULL;
EVP_PKEY* param = NULL;
EVP_PKEY_CTX* pkey_ctx = NULL;
EVP_PKEY* pkey = NULL;
/*
* Parameter & keypair generation
*
* soter_ec_gen_key()
*/
param_ctx = EVP_PKEY_CTX_new_id(EVP_PKEY_EC, NULL);
if (!param_ctx) {
die("[*] EVP_PKEY_CTX_new_id() failed\n");
}
if (EVP_PKEY_paramgen_init(param_ctx) != 1) {
die("[*] EVP_PKEY_paramgen_init() failed\n");
}
if (EVP_PKEY_CTX_set_ec_paramgen_curve_nid(param_ctx, NID_X9_62_prime256v1) != 1) {
die("[*] EVP_PKEY_CTX_set_ec_paramgen_curve_nid() failed\n");
}
if (EVP_PKEY_paramgen(param_ctx, ¶m) != 1) {
die("[*] EVP_PKEY_paramgen() failed\n");
}
pkey_ctx = EVP_PKEY_CTX_new(param, NULL);
if (!pkey_ctx) {
die("[*] EVP_PKEY_CTX_new() failed\n");
}
if (EVP_PKEY_keygen_init(pkey_ctx) != 1) {
die("[*] EVP_PKEY_keygen_init() failed\n");
}
if (EVP_PKEY_keygen(pkey_ctx, &pkey) != 1) {
die("[*] EVP_PKEY_keygen() failed\n");
}
fprintf(stderr, "[+] EC key generation successful!\n");
return 0;
} If you compile and run these snippets with OpenSSL 1.1.1 and 3.0 you will observe that old version fails on For reference, you can look at the 3.0 manpages for key generation then compare them to 1.1.1 version, as well as read through the issues linked above (openssl/openssl#11990). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. soter's code becomes better and more correct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ignatk we'd really appreciate if you could take a look here as well
Hello, that's As I see many changes related to core, but all unit tests passing, I'd recommend to run a cross-platform tests. AFAIK, we run cross-platform and cross-themis-version tests on the internal BuildBot only. I'd recommend merging #875 and #876 to a separate branch Cases I'd like to spot early on: changes in SC/SM that are compatible for these PRs, but incompatible for @ilammy WDYT? |
{ | ||
/* it is copy-paste from /src/soter/openssl/soter_asym_cipher.c */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not anymore? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not anymore? :)
You know the story about the ship of Theseus?
Right now soter_asym_cipher.c
does not even have generation in it. Whatever there was, this implementation does not really have much left in common with it, I think, so the comment only misleads the reader.
res = SOTER_SUCCESS; | ||
|
||
err: | ||
BN_free(pub_exp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BN_free(pub_exp); | |
pub_exp = NULL; | |
BN_free(pub_exp); |
shouldn't we also null the pub_exp
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we also null the pub_exp?
I don't think there is much point in doing that to a local variable on the cleanup code path. For a field – sure, that keeps invalid pointers out. But here the variable will cease to exist with return
, so might as well spare the work.
goto free_md_ctx; | ||
} | ||
|
||
EVP_PKEY_free(pkey); | ||
return SOTER_SUCCESS; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we free anything temp before returning success?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we free anything temp before returning success?
It's not a temporary though, this EVP_PKEY is owned by ctx
. Ditto for RSA.
goto free_md_ctx; | ||
} | ||
|
||
EVP_PKEY_free(pkey); | ||
return SOTER_SUCCESS; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same Q: is there anything to clean up in case of success?
switch (res) { | ||
case 0: | ||
return SOTER_INVALID_SIGNATURE; | ||
case 1: | ||
return SOTER_SUCCESS; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ouch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC it could have returned <0
for other failure (invalid params etc), so the idea here was to tell apart invalid input parameters (bug?) vs invalid signature
I don't mind particularly, but I think that's would be busywork for a change that is supposed go into In my opinion, if this change were truly experimental, potentially destabilizing, and/or requiring a long tail of extra work – yeah, then it totally makes sense to put it into a separate branch with might or might not get merged into master by 0.14, depending on how good it is from security and QA standpoint. But if all this is more or less complete changeset and not really expected to fail the cross-tests, then going through a separate branch is a good discipline at best, but does not really save time or effort. Neither for us – look how much we spent in this thread – nor for users, who aren't going to be affected much in case the change is actually flawed and would need to get reverted. It's not like we promise That said, it's more of a discussion on how future development should proceed. Are we going to treat non-trivial Themis Core changes (i.e., more involved that code style fixes) as potentially radioactive and not allowed into master until cross-test regression confirm they look clean? In that case, I think, this should be codified somewhere to avoid having this discussion on case-by-case basis. |
Ok, I understand your point, makes sense. I'll talk to @shad to see what's the best way of testing from CI perspective. I'm sure we can test |
@ilammy do you think these changes should be part of 0.14.0? |
Pretty much. I don't see a downside – if the changes work – and I don't see an upside either from delaying the merge. OpenSSL 3.0 will still remain unsupported, but it will be not supported better. OpenSSL 1.1.1 gets some more error checks. Win-win, IMO. |
FYI, there's now the |
} | ||
if (1 == EC_KEY_generate_key(ec)) { | ||
return SOTER_SUCCESS; | ||
if (EVP_PKEY_paramgen(param_ctx, ¶m) != 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my original preference was to have constants in comparison operators first, but if the current code style changed - I won't fight for it
res = SOTER_SUCCESS; | ||
|
||
err: | ||
EVP_PKEY_CTX_free(param_ctx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume these functions check for NULL
internally, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume these functions check for NULL internally, right?
That is so. They follow the free() convention: NULL argument is a no-op.
switch (res) { | ||
case 0: | ||
return SOTER_INVALID_SIGNATURE; | ||
case 1: | ||
return SOTER_SUCCESS; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC it could have returned <0
for other failure (invalid params etc), so the idea here was to tell apart invalid input parameters (bug?) vs invalid signature
Turns out, Soter generates EC keys incorrectly, by misusing OpenSSL API. It somehow works out for OpenSSL 1.1.1 (and earlier), but in OpenSSL 3.0 this fails. This ginormous PR improves the situation a bit by using proper key generation sequence which allows Secure Message sign/verify mode to work with OpenSSL 3.0. Collateral damage includes some extra checks and cleanups along the way.
This change was made in an autistic fit of spite against OpenSSL 3.0 for which I am sorry. I also apologize for such a long PR, I didn't have time to write a short one. I wish you happy code review. When you see 78 commits, 11 files, and 500+ lines changed – you know that's just the right size /s
Given the scope, I'd wait for at least
threetwo sign-offs on this before merging.Background
We're not the first ones, and the issue probably ultimately stems from undoubtedly high quality of OpenSSL documentation, but nevertheless. See these issues:
Soter makes the same mistake. Thrice.
Impact
The code changes may affect the following systems:
But actually this should be a change with no functional impact.
OpenSSL 1.1.1 and OpenSSL 1.0.2 operation is unaffected. (CI does not check OpenSSL 1.0.2 but I can confirm that Soter & Themis Core test suites pass with it.)
With OpenSSL 3.0 the test failure rate improves a bit.
Soter test suite:
Themis Core test suite:
Additional notes
Properly using OpenSSL API for EC key generation appears to trigger #657 on CI with legacy Node.js 8.x. I have applied the suggested workaround (switch to BoringSSL).
Future work
This PR fixes one instance, on the code paths that affect sign/verify mode of Secure Message. The tests still fail with OpenSSL 3.0 as the key agreement code paths need to be corrected in the same manner, but I don't have enough spoons to do that now (and you probably don't want to get 40 more commits here).
Additionally, it would probably be nice to port all this to BoringSSL code as well, but it seems that it's currently more lenient, like OpenSSL 1.1.1 is.
Checklist