🪄 Schema version 6 wishlist #108
This wouldn't save space, but currently when RHEL and Mariner feeds report a package as "not affected" we just drop the record in vunnel. It would be helpful if this was instead expressed in the database. (Might overlap with the
v6 should have some way of looking up the correct namespace off of something more than just
Keeping it in the db would mean it could be updated automatically as new namespaces are added, and grype could make use of it for lookups immediately. Maintaining a static mapping in grype, by contrast, means we'd need to remember to maintain that mapping in multiple places (vunnel and grype), and users would need to upgrade to the latest grype for newer namespaces.
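A minimal sketch of what a database-resident namespace lookup could look like. The `namespace_lookup` table, its columns, and the choice of distro name plus version as the key are all assumptions made for illustration; only the namespace string format (`debian:distro:debian:12`) comes from this thread.

```python
import sqlite3


def build_db():
    """Create an in-memory database with a hypothetical namespace_lookup table."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: not part of any existing grype-db schema.
    conn.execute(
        """
        CREATE TABLE namespace_lookup (
            distro_name    TEXT NOT NULL,
            distro_version TEXT NOT NULL,
            namespace      TEXT NOT NULL,
            PRIMARY KEY (distro_name, distro_version)
        )
        """
    )
    conn.execute(
        "INSERT INTO namespace_lookup VALUES (?, ?, ?)",
        ("debian", "12", "debian:distro:debian:12"),
    )
    conn.commit()
    return conn


def lookup_namespace(conn, distro_name, distro_version):
    """Return the namespace for a distro, or None if the db has no mapping."""
    row = conn.execute(
        "SELECT namespace FROM namespace_lookup "
        "WHERE distro_name = ? AND distro_version = ?",
        (distro_name, distro_version),
    ).fetchone()
    return row[0] if row else None
```

With something along these lines, a new namespace would ship as a new row in the next database build rather than as a code change in both vunnel and grype.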
The ability to know when a particular record was added or modified within the grype database came up in a community discussion. Although we are currently always building up the database from scratch, I think it may make sense to include something like an
Updated the top comment to point to anchore/grype#1498 as a specific issue for "Capture dates where available".
We need the ability to represent a CVE that affects different packages, but has different severity ratings for each package. As a concrete example, https://security-tracker.debian.org/tracker/CVE-2023-44487 lists a table with multiple severities for different packages. Here's a snippet of the relevant table on that page, in case it changes or moves:
Note the "urgency" is marked as "unimportant" on the row for
Currently, in the database, this CVE is represented like this:
-- VULNERABILITY TABLE
sqlite> select id, package_name from vulnerability where
id="CVE-2023-44487" and namespace="debian:distro:debian:12";
id package_name
-------------- -------------
CVE-2023-44487 h2o
CVE-2023-44487 haproxy
CVE-2023-44487 jetty9
CVE-2023-44487 netty
CVE-2023-44487 nghttp2
CVE-2023-44487 nginx
CVE-2023-44487 tomcat10
CVE-2023-44487 tomcat9
CVE-2023-44487 trafficserver
-- VULNERABILITY_METADATA table
sqlite> select id, severity from vulnerability_metadata where
id="CVE-2023-44487" and namespace="debian:distro:debian:12";
id severity
-------------- ----------
CVE-2023-44487 Negligible
As you can see, the database has no good way of writing down, "this CVE is more severe if matched against tomcat than against nginx," but Debian's data is clearly trying to tell us that. I believe this would be fixed by having a proper foreign key from vulnerability to vulnerability_metadata, rather than just relying on ID+Namespace to match. We could also move the severity column to the vulnerability table, but that would probably result in a lot of duplicate values.
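A rough sketch of the foreign-key idea, using Python's `sqlite3` module. The `pk` and `metadata_pk` columns are hypothetical names, not a proposed v6 schema, and the "High" severity for tomcat9 is a placeholder for illustration rather than Debian's actual rating.

```python
import sqlite3

# Hypothetical schema: each vulnerability row points at its own metadata row,
# instead of being joined on (id, namespace).
SCHEMA = """
CREATE TABLE vulnerability_metadata (
    pk        INTEGER PRIMARY KEY,
    id        TEXT NOT NULL,
    namespace TEXT NOT NULL,
    severity  TEXT NOT NULL
);
CREATE TABLE vulnerability (
    pk           INTEGER PRIMARY KEY,
    id           TEXT NOT NULL,
    namespace    TEXT NOT NULL,
    package_name TEXT NOT NULL,
    metadata_pk  INTEGER NOT NULL REFERENCES vulnerability_metadata(pk)
);
"""

INSERT_METADATA = (
    "INSERT INTO vulnerability_metadata (id, namespace, severity) VALUES (?, ?, ?)"
)
INSERT_VULN = (
    "INSERT INTO vulnerability (id, namespace, package_name, metadata_pk) "
    "VALUES (?, ?, ?, ?)"
)


def build_example_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
    conn.executescript(SCHEMA)
    cve, ns = "CVE-2023-44487", "debian:distro:debian:12"
    # Two metadata rows share the same id and namespace but differ in severity.
    low_pk = conn.execute(INSERT_METADATA, (cve, ns, "Negligible")).lastrowid
    high_pk = conn.execute(INSERT_METADATA, (cve, ns, "High")).lastrowid  # placeholder value
    conn.executemany(
        INSERT_VULN,
        [(cve, ns, "nginx", low_pk), (cve, ns, "tomcat9", high_pk)],
    )
    conn.commit()
    return conn


def severity_for(conn, vuln_id, namespace, package_name):
    """Resolve severity through the foreign key rather than id + namespace."""
    row = conn.execute(
        "SELECT m.severity FROM vulnerability v "
        "JOIN vulnerability_metadata m ON m.pk = v.metadata_pk "
        "WHERE v.id = ? AND v.namespace = ? AND v.package_name = ?",
        (vuln_id, namespace, package_name),
    ).fetchone()
    return row[0] if row else None
```

Packages that share a rating can point at the same metadata row, which avoids most of the duplication that moving the severity column into the vulnerability table would cause.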
We need to better capture the relationships between identifiers across ecosystems. Currently, we have the
Tables per provider/ecosystem pair with a schema specific to the ecosystem
Signed-off-by: Alex Goodman <[email protected]>
@wagoodman, in light of all the CDN issues that have been cropping up, I was wondering if it might make sense to partition v6 databases per provider and then make grype smarter about what it downloads based on what it needs. For example, we know the sles data currently far outnumbers the rest, but if someone never scans sles containers, there would be no need to ever fetch that subset of the data in their CI pipelines running grype. Anyways, just a very rough thought, and apologies if this has already been discussed elsewhere. I'm "on vacation" so haven't yet seen all of the discussion that may have occurred on Friday.
@westonsteimel we were talking about that yesterday. There might be some big performance gains by splitting the DB by provider. The main drawback I see is that, right now, grype does 2 relatively slow things at the same time: it generates the SBOM, and it downloads the database. But I don't think it can know what database pieces to download before the SBOM is generated. Maybe we could do something like check the distro early, and then download the database for the distro while making the rest of the SBOM. We've also talked about trying to make incremental updates to the database available, but I think partitioning by provider would be much simpler to implement.
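The "check the distro early" idea could be sketched roughly as follows. Every function here is a hypothetical stand-in, not a grype API; the point is only the ordering: a cheap distro check first, then the partition download and the SBOM generation running side by side.

```python
from concurrent.futures import ThreadPoolExecutor

# All functions below are hypothetical placeholders for illustration.


def detect_distro(image):
    """Cheap early step: read only the distro identifier from the image."""
    return image["distro"]


def download_partition(provider):
    """Stand-in for fetching one provider-specific database archive."""
    return f"{provider}.db"


def generate_sbom(image):
    """Stand-in for the slow package cataloging step."""
    return sorted(image["packages"])


def scan(image):
    # The distro is known before the full SBOM exists, so the matching
    # partition can be fetched while cataloging is still running.
    distro = detect_distro(image)
    with ThreadPoolExecutor(max_workers=2) as pool:
        db_future = pool.submit(download_partition, distro)
        sbom_future = pool.submit(generate_sbom, image)
        return db_future.result(), sbom_future.result()
```

This only covers the distro partition. Which other partitions are needed would presumably still depend on the finished SBOM, so those downloads could not start as early.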
I'm going to close this since this wishlist has been converted into tangible issues for v6
As these are implemented, please edit this field to include the PR that implements it within the wishlist below:
Consider switching json column processing to bson or another format that is space-efficient and potentially more performant to parse
json compresses better than binary data, and we already have a compressed archive, so no benefit of changing this
version_constraint and version_type columns and consolidate into the new package_qualifiers column
id table
RecordSource field from vulnerability metadata (indicates the feed group, which is now fictitious)
disputed state for vulnerability without conflating with fixed state
Focuses: