Proposal: Additional Metadata Attributes #12174
base: dev
@joelverhagen @JonDouglas |
|
||
## Rationale and alternatives | ||
|
||
While there are no known alternatives, we have previously considered embedding custom files in the package containing this metadata. This would be of some benefit, but ultimately supporting search queries is necessary for achieving the full benefit of the proposal for the scenarios described. |
There was a problem hiding this comment.
Would it be possible to create well-known tags for this purpose? We've had extension points in the past based on tags. Here's a prototype I tried on our DEV environment:
https://dev.nugettest.org/packages?q=Tags%3A%22attr_fruit%3Alemon%22
Miraculously, our quotes actually work properly here 😂.
Prior art is "AzureSiteExtension" used for finding Azure site extension packages, before the package type filtering was enabled:
https://www.nuget.org/packages?q=Tags%3A%22AzureSiteExtension%22
There was a problem hiding this comment.
Well-known tags could potentially work... I suppose there's not really any more enforced convention with arbitrary metadata key/value pairs... Though it doesn't seem like there is currently a way to search with multiple tag combinations using "AND" (e.g. `Tags:"attr_fruit:lemon" Tags:"ArtifactId:NONE"` returns results matching just one of the tags, not only results matching both).
There was a problem hiding this comment.
Yeah, the AND/OR combination on NuGet.org search is ... not great :)
For non-field-scoped terms (`foo bar`) it performs an AND. For multiple field-scoped terms (`tags:foo tags:bar`) it performs an OR. For a mixture, at least one of the field-scoped terms per field name must exist in the doc.
https://www.nuget.org/packages?q=owner%3Amicrosoft+owner%3Ajver+tags%3Aentity+tags%3Afoo+design
I think a reasonable step here could be to change the interaction of field-scoped terms to "AND" to unblock this scenario. I think it would be a net win for general usage of field-scoped terms anyway since it would align with the non-field scoped term behavior.
The history here is that we have invested heavily in relevance on non-field-scoped queries since they are the 99% case. We have not made the same investment for field-scoped queries or other advanced syntax like `+` (not supported), `-` (not supported), and `"` (acts weird).
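The AND/OR rules described in this comment can be sketched as a toy matcher. This is only an illustration of the behavior as described above, not the NuGet.org implementation; the document shape (a dict of field name to lowercase values, with free text under a hypothetical `text` key) is an assumption, and quoted values are not handled.

```python
from collections import defaultdict


def parse_query(query):
    """Split a query into free-text terms and field-scoped terms grouped by field."""
    free_terms = []
    field_terms = defaultdict(list)
    for token in query.split():
        field, sep, value = token.partition(":")
        if sep and field and value:
            field_terms[field.lower()].append(value.lower())
        else:
            free_terms.append(token.lower())
    return free_terms, dict(field_terms)


def matches(doc, query):
    """doc maps a field name to lowercase values; 'text' (hypothetical) holds free-text words."""
    free_terms, field_terms = parse_query(query)
    # Non-field-scoped terms: every term must be present (AND).
    if not all(term in doc.get("text", []) for term in free_terms):
        return False
    # Field-scoped terms: within one field name any value may match (OR),
    # but every field name used in the query must be satisfied.
    return all(
        any(value in doc.get(field, []) for value in values)
        for field, values in field_terms.items()
    )
```

Under these rules a document tagged only `foo` still matches `tags:foo tags:bar`, which is the behavior that blocks the "match both attributes" scenario discussed in this thread.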
There was a problem hiding this comment.
How valuable/important would the combination of attributes/tags be?
How many scenarios would a single attribute/tag solve?
|
||
Inside of the .nuspec file's `<package>` and then `<metadata>` elements, create a new `<attributes>` element which can contain zero or more `<attribute key="[string]" value="[string]" />` elements. Attribute keys must be unique: there cannot be more than one attribute with the same key value. | ||
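A minimal sketch of how a reader might parse and validate the proposed `<attributes>` element. It assumes a nuspec without an XML namespace (real nuspec files usually declare one) and treats key uniqueness as case-sensitive, which the proposal does not specify; `read_attributes` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def read_attributes(nuspec_xml):
    """Return {key: value} from <package><metadata><attributes>, rejecting duplicate keys."""
    root = ET.fromstring(nuspec_xml)
    attributes = {}
    for element in root.findall("./metadata/attributes/attribute"):
        key = element.get("key")
        value = element.get("value")
        if key is None or value is None:
            raise ValueError("attribute requires both key and value")
        if key in attributes:
            # Assumption: keys are compared exactly as written.
            raise ValueError(f"duplicate attribute key: {key}")
        attributes[key] = value
    return attributes


SAMPLE_NUSPEC = """
<package>
  <metadata>
    <id>Example.Binding</id>
    <attributes>
      <attribute key="MavenGroupId" value="com.company.product" />
      <attribute key="MavenArtifactId" value="ProductSdk" />
    </attributes>
  </metadata>
</package>
"""
```

A nuspec with no `<attributes>` element simply yields an empty dictionary, matching the "zero or more" wording.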
|
||
In the NuGet search query (`q`) parameter, allow attributes to be specified as a query filter just like `owner`. That is, for example: `q=attr_[keyValue]:[attribute_value]` where the `attr_` prefix denotes matching a particular attribute key by its `[attribute_value]`. The search should look for exact, case-insensitive matches. |
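The `attr_` filter could be interpreted roughly as follows. This is a sketch under stated assumptions: the key is also compared case-insensitively (the proposal only says matches are exact and case-insensitive), and both function names are hypothetical.

```python
def parse_attribute_filter(term):
    """Split 'attr_<key>:<value>' into (key, value); return None for any other term."""
    prefix = "attr_"
    if not term.lower().startswith(prefix):
        return None
    key, sep, value = term[len(prefix):].partition(":")
    if not sep or not key or not value:
        return None
    return key, value


def attribute_matches(attributes, term):
    """True when the package attributes contain the key with exactly that value, ignoring case."""
    parsed = parse_attribute_filter(term)
    if parsed is None:
        return False
    wanted_key, wanted_value = parsed[0].casefold(), parsed[1].casefold()
    return any(
        key.casefold() == wanted_key and value.casefold() == wanted_value
        for key, value in attributes.items()
    )
```

"Exact" here means the whole value must match: `attr_MavenGroupId:com.company` would not match a package whose value is `com.company.product`.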
There was a problem hiding this comment.
Stuff inside the `q` (search text) parameter is not spec'd at the protocol level, so different package source implementations will interpret it differently. AND/OR logic, ranking, quote behavior, supported field-scoped terms, etc. are all package source specific. In general the `q` property is for search relevance and less about strict package filtering. It's certainly a grey area since it is unspec'd, but I think a safer approach is to introduce a new query parameter for these attribute filters.
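To illustrate the shape of that suggestion, here is a sketch of a client building a search URL where attribute filters travel in their own parameter rather than inside `q`. The parameter name `attribute`, the `key:value` encoding, and the example host are all hypothetical; no such parameter exists in the protocol today.

```python
from urllib.parse import urlencode


def build_search_url(base_url, text, attribute_filters):
    """Keep relevance text in q and put strict filters in a separate, repeated parameter."""
    params = [("q", text)]
    # "attribute" is a made-up parameter name used only for illustration.
    params += [
        ("attribute", f"{key}:{value}")
        for key, value in sorted(attribute_filters.items())
    ]
    return f"{base_url}?{urlencode(params)}"
```

Repeating the parameter once per filter leaves room for a server to define AND semantics across filters without touching how `q` is ranked.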
|
||
These attribute key/value pairs would be searchable within the NuGet search service, via the query property, similar to how search by `owner` or `packageid` is available currently. | ||
|
||
### Technical explanation |
There was a problem hiding this comment.
If someone wants to filter by multiple key-value pairs, is that possible? AND or OR behavior?
There was a problem hiding this comment.
Yeah I hadn't realized, there does not seem to be a way to "AND" query terms... This would definitely be helpful, though I guess narrowing search results to a potentially reasonable number (ie: GroupId, or having a concatenated MavenId field) and inspecting the details of the results might be reasonable.
There was a problem hiding this comment.
nuget.org search is less of a concern (for me)
https://www.nuget.org/packages?q=artifact
A huge help would be metadata + an API for maintaining 450+ artifacts.
|
||
Being able to cross-link native dependency identities against existing nuget package references would help in creating experiences that automatically resolve and link in the correct set of build time dependencies across native and nuget assets. | ||
|
||
Example: Maintaining a [list of popular known packages that map to maven artifacts](https://github.com/Redth/Xamarin.Binding.Helpers/blob/main/Xamarin.Binding.Helpers/NuGetResolvers/KnownMavenNugetResolver.cs#L12-L99) is not a scalable solution. |
There was a problem hiding this comment.
It is possible for NuGet.org or a community member to build their own index based on NuGet.org packages using the V3 catalog. There is a guide on using the V3 catalog here: https://learn.microsoft.com/en-us/nuget/guides/api/query-for-all-published-packages
So the "productionized" version of this map would be to write a catalog reader that looks at each published package, checks if there is a `cgmanifest.json`, then adds it to an index. Surface the index on an independent web service. This allows custom projects/views of NuGet.org without the need to block on official service or client support.
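The indexing step of such a catalog reader might look like the sketch below. It works on already-downloaded catalog leaf documents and leaves fetching and cursor handling out. The field names (`id`, `version`, `packageEntries`, `name`) are assumptions modeled on catalog leaf documents and should be checked against real data; `index_packages_with_manifest` is a hypothetical helper.

```python
def index_packages_with_manifest(catalog_leaves, manifest_name="cgmanifest.json"):
    """Map lowercase package id -> versions whose file list contains the manifest."""
    wanted = manifest_name.lower()
    index = {}
    for leaf in catalog_leaves:
        # Assumption: each leaf lists the package's files under "packageEntries".
        entries = leaf.get("packageEntries", [])
        has_manifest = any(
            entry.get("name", "").lower() == wanted for entry in entries
        )
        if has_manifest:
            index.setdefault(leaf["id"].lower(), []).append(leaf["version"])
    return index
```

The resulting dictionary is the kind of structure an independent web service could expose, as the comment suggests.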
There was a problem hiding this comment.
Oh interesting, I didn't realize this was available... We don't currently publish the manifest in the packages (I don't think anyway), but looking at this sort of approach might mean we could create our own conventions - the problem is that unless they are 'officially' supported, conventions can be hard to gain traction with.
Identity can consist of multiple attributes, for example a Maven package has: | ||
|
||
- Group Id (eg: `com.company.product`) | ||
- Artifact Id (eg: `ProductSdk`) |
There was a problem hiding this comment.
Just wondering, is this spec suggesting the same model as Maven? I'm wondering if extensions or next steps for the Maven model ever came up. For example: what about values with non-string types allowing range queries ("os_version:10" and querying "os_version>=10") or allowing multiple attributes with the same key but different values.
There was a problem hiding this comment.
I'm not sure I fully understand the question here, could you elaborate? It might be nice to support additional types and query operators if that's the basic question? I guess if something is being considered, may as well consider all possible useful cases.
There was a problem hiding this comment.
Apologies for the confusion.
From your spec, it sounds like Maven already has a feature like what you're suggesting. Given that, can we learn from any growing pains their ecosystem had? For example, did they originally ship with the string KVP model with unique keys, then run into problems that necessitated a richer data model?
Said another way, if we can learn from another ecosystem's implementation in this area, we can maybe skip some intermediate steps or painful migrations. Or we can know that what we're proposing here is actually enough.
I'm not clued into the Maven ecosystem so I can't provide that perspective.
As a side note, if there are "prior art" design spec/docs about this feature in other ecosystems that would be cool to link in here.
There was a problem hiding this comment.
Oh sorry, I'm not really that familiar with how they built up their model.
Having the ability to query on the version as a version number would be useful in an "AND" query scenario. I had just considered the MVP implementation of this not needing to do that, since the result set matching only the `GroupId` and `ArtifactId` would presumably be small enough to iterate over the versions of the results (also, if it wasn't clear, different NuGet package versions would potentially have different `attr_MavenVersion:1.2.3` values). The one potential gotcha here is that Maven's versioning rules might be a bit different from NuGet's (though I think semver is adopted there too).
One other consideration, though, is that the Maven version might be used to assert whether it satisfies a Maven version range: again, similar to NuGet's version range semantics, but not necessarily identical in rules/implementation. For the binding helpers project/experiment I linked in the proposal, this is part of the process, so in this case the matching `GroupId`/`ArtifactId` results would still need to be iterated over, asserting each version's Maven range compatibility. A long way of saying that there are maybe too many operators to consider for querying by version for the effort of adding a simple `>=` to be particularly valuable? This is just one example, though, and maybe it would be valuable for other scenarios.
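The client-side iteration described here could be sketched like this: filter by group and artifact, then check each candidate's Maven version against a range. The range check is deliberately simplified (a single bracketed range, dotted numeric versions only, no qualifiers such as `-alpha`) and is not Maven's full version algorithm; the dictionary keys are hypothetical.

```python
def parse_version(text):
    """Parse '1.6.0' into a tuple, dropping trailing zeros so 1.0 equals 1.0.0."""
    parts = [int(part) for part in text.strip().split(".")]
    while parts and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def satisfies(version, version_range):
    """Check a version against one range such as '[1.0,2.0)', '(,1.5]' or '[1.2.3]'."""
    version_range = version_range.strip()
    candidate = parse_version(version)
    lower_inclusive = version_range[0] == "["
    upper_inclusive = version_range[-1] == "]"
    body = version_range[1:-1]
    if "," not in body:
        # '[1.2.3]' pins one exact version.
        return candidate == parse_version(body)
    lower_text, upper_text = (part.strip() for part in body.split(",", 1))
    if lower_text:
        lower = parse_version(lower_text)
        if candidate < lower or (candidate == lower and not lower_inclusive):
            return False
    if upper_text:
        upper = parse_version(upper_text)
        if candidate > upper or (candidate == upper and not upper_inclusive):
            return False
    return True


def compatible_versions(packages, group_id, artifact_id, version_range):
    """Return NuGet versions whose Maven coordinates match and fall inside the range."""
    return [
        package["nuget_version"]
        for package in packages
        if package["group_id"] == group_id
        and package["artifact_id"] == artifact_id
        and satisfies(package["maven_version"], version_range)
    ]
```

Because the range logic lives in the client, the search service only needs to support exact matching on the group and artifact attributes for this scenario.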
There was a problem hiding this comment.
Sorry for coming late to the show.
I maintain:
- AndroidX (AX)
- GooglePlayServices, Firebase, MLKit (GPS-FB-MLKit)

And when I get some air I work on "bindings improvements", which should improve productivity and more.
So, I will express my opinion only for Android (.NET for Android - formerly Xamarin.Android), though IMO this should be extended to .NET for iOS and maybe other platforms.
"Bindings improvements" include:
- discoverability/identification (both Maven and NuGet)
- dependency graphs (both Maven and NuGet)

Here I am working on porting my graph theory library from C++ to C#, and thanks to Generic Math it is a lot easier nowadays:
https://devblogs.microsoft.com/dotnet/dotnet-7-generic-math/
https://devblogs.microsoft.com/dotnet/preview-features-in-net-6-generic-math/
I need formal methods to build graphs/trees, find leaves (all nodes) and detect dependency cycles for both Maven and NuGet.
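As a rough illustration of the graph operations mentioned (finding leaves and detecting dependency cycles), here is a sketch over a plain adjacency dictionary. It is not the commenter's library; the graph representation (node name mapped to a list of dependency names) is an assumption, and the recursive search is only suitable for modestly sized graphs.

```python
def find_leaves(graph):
    """Return nodes with no dependencies, including nodes that only appear as dependencies."""
    nodes = set(graph) | {dep for deps in graph.values() for dep in deps}
    return sorted(node for node in nodes if not graph.get(node))


def find_cycle(graph):
    """Return one dependency cycle as a list of nodes, or None if the graph is acyclic."""
    unvisited, in_progress, done = 0, 1, 2
    state = {}
    path = []

    def visit(node):
        state[node] = in_progress
        path.append(node)
        for dep in graph.get(node, ()):
            dep_state = state.get(dep, unvisited)
            if dep_state == in_progress:
                # dep is already on the current path, so the path loops back to it.
                return path[path.index(dep):] + [dep]
            if dep_state == unvisited:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        state[node] = done
        return None

    for node in list(graph):
        if state.get(node, unvisited) == unvisited:
            cycle = visit(node)
            if cycle:
                return cycle
    return None
```

The same functions apply to either ecosystem as long as package identities are normalized to comparable node names first.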
I am already using some of the utilities in our repos for:
- building `cgmanifest.json`
  https://github.com/xamarin/AndroidX/blob/main/cgmanifest.json
  https://github.com/xamarin/GooglePlayServicesComponents/blob/main/cgmanifest.json
- detecting dates when NuGet packages were published

Until recently we added Maven fully qualified metadata for the artifact in 2 forms:
artifact=androidx.compose.material:material-ripple
artifact_versioned=androidx.compose.material:material-ripple:1.0.5
to NuGet fields:
- Description
- Summary (sometimes)
- Tags

Visible here:
This was OK, and I am able to use the server-side NuGet protocol (HttpClient + JSON/XML parsing) to increase maintenance productivity on both repos.
In the latest updates, the .NET for Android team decided to keep this information only in the `Tags` node.
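A small sketch of how a tool might recover Maven coordinates from the two tag forms shown above, assuming the tags arrive as one space-separated string. The function name and the returned dictionary keys are hypothetical.

```python
def maven_coordinates_from_tags(tags):
    """Extract Maven coordinates from 'artifact=' and 'artifact_versioned=' tags."""
    result = {}
    for tag in tags.split():
        name, sep, value = tag.partition("=")
        if not sep:
            continue  # ordinary tag without a key=value shape
        parts = value.split(":")
        if name == "artifact" and len(parts) == 2:
            result["group_id"], result["artifact_id"] = parts
        elif name == "artifact_versioned" and len(parts) == 3:
            result["group_id"], result["artifact_id"], result["version"] = parts
    return result
```

This kind of string convention works, but every consumer has to reimplement the parsing, which is part of the case for first-class attributes.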
- identifying binaries used (either distributed or downloaded by the package during the build)
- dependency identification
  - type
    - maven
    - native
  - identity
  - version
  - type

With this there would be a 1:1 mapping from a NuGet package (versioned) to a Maven/Native package (versioned).
This would help maintainers with:
1. keeping track of published (bound Maven or native) libraries
   Getting data for the latest NuGet package and mapping it to the Maven fully qualified versioned id would make it easier to discover what needs to be updated.
2. updates
   see 1.
3. troubleshooting
   Primarily checking dependency graphs
   - for duplicate transitive dependencies (possibly with different versions)
4. security checks (component governance)
5. curation (curated package publishing)
   With lowering the bar for bindings via "bindings improvements", a flood of bindings packages is to be expected.
   The NuGet publishing process could add a step to verify whether a given Maven/Native artifact is already published in some other NuGet package.
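The curation check in the last item could be sketched as a lookup for Maven coordinates claimed by more than one NuGet package. The input shape (package id mapped to its attribute dictionary) and the key names `maven.GroupId` and `maven.ArtifactId` are illustrative assumptions.

```python
from collections import defaultdict


def find_duplicate_bindings(packages):
    """Return {(group_id, artifact_id): [package ids]} for artifacts bound by several packages."""
    owners = defaultdict(set)
    for package_id, attributes in packages.items():
        group_id = attributes.get("maven.GroupId")
        artifact_id = attributes.get("maven.ArtifactId")
        if group_id and artifact_id:
            owners[(group_id, artifact_id)].add(package_id)
    return {
        coordinate: sorted(package_ids)
        for coordinate, package_ids in owners.items()
        if len(package_ids) > 1
    }
```

A publishing pipeline could run a check like this before accepting a new binding package and warn rather than block, since multiple bindings of one artifact may be legitimate.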
- Maven
  - project

        <ItemGroup>
          <PackageAttribute Include="maven.GroupId" Value="androidx.activity" />
          <PackageAttribute Include="maven.ArtifactId" Value="activity" />
          <PackageAttribute Include="maven.VersionId" Value="1.6.0" />
        </ItemGroup>

    NOTE: this could be derived from current (and future) .NET for Android (Xamarin.Android) BuildActions for binding artifacts (Embedd)
  - nuspec

        <!-- ... snip -->
        <package>
          <metadata>
            <attributes>
              <attribute key="maven.GroupId">androidx.activity</attribute>
              <attribute key="maven.ArtifactId">activity</attribute>
              <attribute key="maven.Version">1.6.0</attribute>
            </attributes>
          </metadata>
          <!-- ... snip -->
        </package>

- Native
  - project (packaging)

        <ItemGroup>
          <AndroidNativeLibrary Include="path/to/libfoo.so">
            <Abi>armeabi</Abi>
          </AndroidNativeLibrary>
        </ItemGroup>

  - nuspec

        <package>
          <metadata>
            <attributes>
              <attribute key="native.LibraryName">libfoo</attribute>
              <attribute key="native.Version">1.6.0</attribute>
            </attributes>
          </metadata>
        </package>
|
||
The NuGet search API already allows the specification of various package [metadata fields to search by in the query parameter](https://learn.microsoft.com/en-us/nuget/consume-packages/finding-and-choosing-packages#search-syntax). This proposal is simply an extension of that existing query syntax to include additional, potentially arbitrary attributes both in the .nuspec format as well as the search query. | ||
|
||
### Functional explanation |
There was a problem hiding this comment.
Do we have any considerations beyond discoverability?
Maybe something specific to managing some of these related packages within your project, or is that not a big concern?
There was a problem hiding this comment.
> Do we have any considerations beyond discoverability?

A 1:1 mapping of native (Maven or native lib) to NuGet would make:
- security checks easier
- curation (optional) easier

> Maybe something specific to managing some of these related packages within your project, or is that not a big concern?

We do that, but a formal/standardized/central method would help both us and (IMO) the NuGet team.
There was a problem hiding this comment.
Now, thinking a bit deeper, 1:1 is an oversimplification. In a cross-platform scenario there will be an artifact per platform (Android, iOS) and sometimes multiple artifacts per platform.
This PR has been automatically marked as stale because it has no activity for 30 days. It will be closed if no further activity occurs within another 15 days of this comment. If it is closed, you may reopen it anytime when you're ready again, as long as you don't delete the branch. |
This PR has been automatically marked as stale because it has no activity for 30 days. It will be closed if no further activity occurs within another 15 days of this comment, unless it has a "Status:Do not auto close" label. If it is closed, you may reopen it anytime when you're ready again, as long as you don't delete the branch. |
Hi, we have removed our "proposed" folder, so please move this proposal to the "accepted" folder. |
Allow the inclusion of additional metadata properties in package authoring and allow them to be used in search queries.
The motivation for this is to create associations of NuGet packages which bind and/or redistribute platform native libraries on other platforms and package management systems.