Modularity of kernel patches and device-tree configurations #141

Open
juliuskoskela opened this issue May 17, 2023 · 6 comments

Comments

@juliuskoskela
Contributor

juliuskoskela commented May 17, 2023

There are some open questions and possibly overlapping design proposals on how we integrate modules involving kernel patches and device tree configurations into Ghaf.

In the current main branch, the kernel and device tree are configured in the hardware module. More precisely, the hardware device tree is set here (line 42). Kernel patches and configurations are set in the same file (lines 19-38).

The device tree is hashed and copied to /dtbs/ in the boot configuration*.

*Explain this process in more detail, possibly commenting the source code as well.

We are in the process of integrating a module that enables BPMP virtualization on the host. The module includes kernel patches, kernel overlays, and kernel configurations as well as device tree configurations, and it is specific to the NVIDIA Jetson AGX Orin target (or any possible future targets in that family).

I'd like to propose design clarifications, documentation and possibly some refactoring as related to the above context.

Kernel patches

It is my understanding that we have two different proposals as to how we handle modules that involve kernel patches.

  1. We create an overlay which applies a kernel patch as well as kernel configuration (as a nix expression) on top of the requested kernel package (line 10) in nixpkgs. Then we extend the kernel with this overlay in our build*.

*How do we handle build-specific configurations such as this? Where do we configure them and how do we make sure different configuration options don't clash?

  2. We create a module.

Explain the reasoning behind modules and how they fit our use case?

We should evaluate these approaches, write them out, and document a clean way to integrate and test kernel patch modules in Ghaf. Another question: are these approaches overlapping or complementary?

Kernel configurations

We propose that all kernel configurations are defined in the nix file that defines the module or the overlay.

Example:

kernel.override {
  kernelPatches = [
    # ...
  ];

  # nixpkgs expects "OPTION value" lines here, without the CONFIG_ prefix
  extraConfig = ''
    PCI_DEBUG y
  '';
}
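
For comparison, a minimal sketch of the overlay variant (proposal 1). Everything named here is an assumption for illustration: the attribute name linux_ghaf_example, the choice of linux_latest as the base kernel, and the patch path are hypothetical and not taken from the Ghaf tree.

```nix
# overlay.nix: a sketch under the assumptions stated above
final: prev: {
  # Hypothetical attribute; the base kernel is an arbitrary choice.
  linux_ghaf_example = prev.linux_latest.override (old: {
    # Append to the patches the base kernel already carries
    # instead of replacing them.
    kernelPatches = (old.kernelPatches or [ ]) ++ [
      {
        name = "example-patch"; # hypothetical
        patch = ./patches/0001-example.patch; # hypothetical path
      }
    ];

    # As far as I know, nixpkgs expects "OPTION value" lines here,
    # without the CONFIG_ prefix.
    extraConfig = ''
      PCI_DEBUG y
    '';
  });
}
```

A NixOS configuration that has this overlay in nixpkgs.overlays could then select the kernel with boot.kernelPackages = pkgs.linuxPackagesFor pkgs.linux_ghaf_example;. The downside is that the patch set is fixed per nixpkgs instance, which is part of what the clash question above is about.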

Device tree configurations

In the current main branch, some device tree configurations are added from a .dtb file in a target-specific hardware module in modules/hardware.

How do we make this modular? Explore device tree overlays.

Modules for different targets

In the current main branch, target-specific modules are mixed with common modules. This makes reading the repository somewhat challenging. Maybe we could move to a layout that separates architecture-specific configurations more clearly.

ghaf/modules/
  target/
    nvidia-jetson-orin/
      bpmp-virt/            # Possibly a git submodule (separate repo)?
      ...
    generic-x86/
      ...
  common/
    module.nix
    ...

Open questions

  1. How to integrate a module?
  2. How to integrate a module involving kernel patches and/or configurations?
    2.1 Do we use modules or overlays (in Nix context)?
  3. How to integrate a module that involves device tree configurations?
  4. How to separate target specific modules?

Related

@vilvo
Contributor

vilvo commented May 19, 2023

There are some open questions and possibly overlapping design proposals on how we integrate modules involving kernel patches and device tree configurations into Ghaf.

In the current main branch, the kernel and device tree are configured in the hardware module. More precisely, the hardware device tree is set here (line 42). Kernel patches and configurations are set in the same file (lines 19-38).

The device tree is hashed and copied to /dtbs/ in the boot configuration*.

Having dtbs on a specific partition is more of an NVIDIA BSP-specific way of handling device trees: it supports the 1st and 2nd stage bootloaders, which process the device trees and finally pass them to the kernel during boot. One should not generalize this to how other HW vendors handle device trees.

*Explain this process in more detail, possibly commenting the source code as well.

We are in the process of integrating a module that enables BPMP virtualization on the host. The module includes kernel patches, kernel overlays, and kernel configurations as well as device tree configurations, and it is specific to the NVIDIA Jetson AGX Orin target (or any possible future targets in that family).

It is practical to first integrate features in such a hybrid (non-ideal) way. Once we learn in testing how the feature being integrated works, and iterate on it with more patches etc., we can find better ways for the next iterations. Is this way final and ideal? No, but it is practical.

I'd like to propose design clarifications, documentation and possibly some refactoring as related to the above context.

Kernel patches

It is my understanding that we have two different proposals as to how we handle modules that involve kernel patches.

Both can be valid at different stages of development and integration.

  1. We create an overlay which applies a kernel patch as well as kernel configuration (as a nix expression) on top of the requested kernel package (line 10) in nixpkgs. Then we extend the kernel with this overlay in our build*.

*How do we handle build-specific configurations such as this? Where do we configure them and how do we make sure different configuration options don't clash?

At this stage of development (the cross-domain memshare PoC), patching is straightforward because we can avoid a kernel fork that is costly to maintain (regular rebasing etc.). Eventually, at merge, we have to address the question of options not clashing (merge conflicts). The Ghaf configurations, release and debug, differ: some options are used only for debug builds, while release builds use only the default configuration values (and nix modules). Likewise, there are debug/development-specific nix modules that make development easier and enable debug access.

  2. We create a module.

Explain the reasoning behind modules and how they fit our use case?

Partially explained above for the Ghaf release and debug configurations. In addition, we aim to put hardware specifics into nix hardware modules. In particular, @unbel13ver has implemented Ghaf iMX8 support that way and contributed it to nix-hardware-modules: https://github.com/NixOS/nixos-hardware/tree/master/nxp
The same mechanism is readily available for many other devices within that repo or outside it (e.g. the Apple M1 support which I use): https://github.com/tpwrules/nixos-apple-silicon.

Similarly, we have a collection of Ghaf development modules which make sense only for debug/development builds. For example, they poke security holes such as trivial login access via the authentication.nix module, which is essential for developing on the device and is easy to drop for release builds. With release builds, one has no way to log in, as authentication has not been set up. Further release mode development aims to drop getty, but that's another story.
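
To illustrate why development modules are easy to drop, here is a rough sketch of building a release and a debug variant from the same base. The function name mkTarget, the target names, the system value, and both file paths are hypothetical; only nixpkgs.lib.nixosSystem and lib.optionals are real nixpkgs functions.

```nix
# targets.nix: a sketch, not the actual Ghaf flake structure
{ nixpkgs }:
let
  # Hypothetical helper: the variant decides which modules are included.
  mkTarget = variant:
    nixpkgs.lib.nixosSystem {
      system = "aarch64-linux"; # assumption for the example
      modules =
        [ ./modules/base.nix ] # hypothetical path
        ++ nixpkgs.lib.optionals (variant == "debug") [
          # Debug-only module, e.g. trivial login access.
          ./modules/development/authentication.nix # hypothetical path
        ];
    };
in
{
  example-release = mkTarget "release";
  example-debug = mkTarget "debug";
}
```

With this shape the release variant never evaluates the development module at all, so nothing from it can leak into a release image.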

We should evaluate these approaches, write them out, and document a clean way to integrate and test kernel patch modules in Ghaf. Another question: are these approaches overlapping or complementary?

Complementary. It depends on the specific case.

Kernel configurations

We propose that all kernel configurations are defined in the nix file that defines the module or the overlay.

Example:

kernel.override {
  kernelPatches = [
    # ...
  ];

  # nixpkgs expects "OPTION value" lines here, without the CONFIG_ prefix
  extraConfig = ''
    PCI_DEBUG y
  '';
}

Device tree configurations

In the current main branch, some device tree configurations are added from a .dtb file in a target-specific hardware module in modules/hardware.

How do we make this modular? Explore device tree overlays.

Depends... The NVIDIA case depends not only on the upstream Jetson BSP but also on OE4T and Anduril repackaging it downstream. We are only the third downstream user: we get the benefits of the nixification of the BSP, but we also depend on the upstreams doing certain things the way they do. This has worked so far, both in terms of us getting Jetson release updates into use fairly fast and our developers contributing fixes for issues we have found to the above-mentioned upstreams.

Depends... The iMX8 hardware module is an example that Ivan wrote for reference, and the same mechanism is a good standard to follow when possible.

Depends... There are plenty of other devices with nix-hardware-modules within that collection repo and outside it (M1). We get the benefit of simply being able to use those modules in Ghaf.

Modules for different targets

In the current main branch, target-specific modules are mixed with common modules. This makes reading the repository somewhat challenging. Maybe we could move to a layout that separates architecture-specific configurations more clearly.

ghaf/modules/
  target/
    nvidia-jetson-orin/
      bpmp-virt/            # Possibly a git submodule (separate repo)?
      ...
    generic-x86/
      ...
  common/
    module.nix
    ...

Open questions

  1. How to integrate a module?

Again, it depends on the module. It may make sense to introduce an idea as a module in a PR to Ghaf, or to nix-hardware-modules.

  2. How to integrate a module involving kernel patches and/or configurations?
    2.1 Do we use modules or overlays (in Nix context)?

The way we do. Both can make sense. Case by case.

  3. How to integrate a module that involves device tree configurations?

It depends on how the BSP, including vendor bootloaders, supports device trees. There is no one right answer, as rewriting vendor BSP bootloaders may not be possible (e.g. secure boot support) or feasible (we could do it, but we choose not to because we are not in the business of rewriting vendor bootloaders or device tree support).

  4. How to separate target specific modules?

Our Ghaf target-specific nix files or the nix-hardware-modules repo.

Related

@mikatammi
Contributor

mikatammi commented May 20, 2023

I can comment on the part where I did my "device tree explorations".

An example of how device tree overlay can be created is here:
Branch: https://github.com/tiiuae/ghaf/tree/dtbo
Commit: f7c9f02

NixOS's hardware.deviceTree.overlays = [ ... ]; overlays are applied to all dtb files.

However, this might not be sufficient, as the .dtbo overlay binary format cannot describe the deletion of nodes or properties. Overlays can only create new properties and nodes or override existing ones; properties cannot be deleted with /delete-property/. If we need /delete-property/, we can either continue to patch the .dtbs in the kernel, or define a derivation that produces the modified dtbs. The latter could be better because it would separate the device tree build process from the kernel build process, and we wouldn't need to spend 45 minutes building and testing a kernel change every time we want to introduce a small change to a dtb.
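
For readers who have not used the option, a minimal sketch of a NixOS device tree overlay follows. The overlay name, the compatible string, and the node path are placeholders, not values from the dtbo branch. As far as I know, NixOS skips an overlay for any dtb whose root compatible does not match the overlay's, so the placeholder would have to be replaced with the board's real value.

```nix
# dt-overlay-example.nix: a sketch with placeholder names
{ ... }:
{
  hardware.deviceTree = {
    enable = true;
    overlays = [
      {
        name = "example-enable-node"; # hypothetical
        dtsText = ''
          /dts-v1/;
          /plugin/;

          / {
            /* placeholder: must match the target board */
            compatible = "vendor,example-board";

            fragment@0 {
              /* placeholder node path */
              target-path = "/example-node";
              __overlay__ {
                /* adding or overriding a property works */
                status = "okay";
              };
            };
          };
        '';
      }
    ];
  };
}
```

This only covers the additive case. Deleting a node or property, as discussed above, is not expressible this way.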

@mikatammi
Contributor

Today we worked with @juliuskoskela and I outlined an example approach to integrate bpmp-virt to ghaf in these branches:
https://github.com/mikatammi/bpmp-virt/tree/ghaf_integration
https://github.com/tiiuae/ghaf/tree/bpmp_virt_integration

Instead of overriding the kernel in a nixpkgs overlay, the NixOS module system is used to modify the kernel through the boot.kernelPatches option.
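
A minimal sketch of what such a module can look like, assuming a single patch file. The patch name and path are hypothetical and are not taken from the branches linked above.

```nix
# kernel-patch-module.nix: a sketch, not the actual bpmp-virt integration
{ ... }:
{
  boot.kernelPatches = [
    {
      name = "example-host-patch"; # hypothetical
      patch = ./patches/0001-example.patch; # hypothetical path

      # Kernel config that belongs to the patch lives next to it.
      # Lines take the form "OPTION value", without the CONFIG_ prefix.
      extraConfig = ''
        PCI_DEBUG y
      '';
    }
  ];
}
```

Because boot.kernelPatches is a list option, entries from several modules are merged by the module system, so each feature can carry its own patches without knowing about the others.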

@jkuro-tii
Contributor

jkuro-tii commented May 25, 2023

Please let me add my five cents.

My observation regarding the difference between modules and packages:

A module is a NixOS component; it's something bigger than a package. It's assumed to be used when building NixOS, and is based on one or more packages.

Compared to a package, a module is responsible for:

  • implementing NixOS configuration options (e.g: the microvm module has to provide an action when the 'microvm.hypervisor = "qemu";' option is set)
  • changing configuration of other NixOS components
  • installing underlying packages - e.g. microvm optionally installs qemu, cloud-hypervisor and other packages.

A module's configuration is flexible, i.e. whenever a module is used, different config options can be set. A package's build options are fixed and tied to the nixpkgs instance. Changing a package's configuration requires creating a new instance of nixpkgs with an overlay containing the new config data for the package, or building a new package under another name.
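
The three responsibilities listed above can be sketched in one small module. The option namespace example.vmHost is made up for this illustration and loosely mirrors the microvm.hypervisor option mentioned earlier. It is not the microvm implementation.

```nix
# vm-host-example.nix: a sketch with a hypothetical option namespace
{ config, lib, pkgs, ... }:
let
  cfg = config.example.vmHost;
in
{
  # 1. Implement configuration options.
  options.example.vmHost = {
    enable = lib.mkEnableOption "an example VM host setup";

    hypervisor = lib.mkOption {
      type = lib.types.enum [ "qemu" "cloud-hypervisor" ];
      default = "qemu";
      description = "Which hypervisor package to install.";
    };
  };

  config = lib.mkIf cfg.enable {
    # 2. Change the configuration of another NixOS component.
    boot.kernelModules = [ "tun" ];

    # 3. Install the underlying package, chosen by the option value.
    environment.systemPackages = [
      (if cfg.hypervisor == "qemu" then pkgs.qemu else pkgs.cloud-hypervisor)
    ];
  };
}
```

A user of the module would only set example.vmHost.enable = true; and, optionally, example.vmHost.hypervisor.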

Packages are not tied to NixOS. Usually a package is a file tree ready to be copied into the root filesystem (/usr/bin, /usr/lib, etc.), and an installation script. Packages may be installed by Nix environment users on systems other than NixOS.
A package can be built as a flake overlay, so it can then easily be used by others by adding the overlay to nixpkgs.

Example: docker.
Both a module and a package exist. The package contains binaries, manuals and an installation script.
The module:

  • implements changing configuration of systemd to include docker in system services
  • sets network configuration
  • adds the 'docker' group
  • installs docker package
  • ...

Example: microvm
Implemented as a module. Contains no binaries. Builds custom VM configurations. A user can build several virtual machines and define a separate configuration for each. Depending on the options used, microvm installs other components (e.g. qemu, cloud-hypervisor). The microvm module configures NixOS and adds the files necessary to run each VM in a handy way.

@juliuskoskela
Contributor Author

Thank you @jkuro-tii for the great summary!

A module's configuration is flexible, i.e. whenever a module is used, different config options can be set. A package's build options are fixed and tied to the nixpkgs instance. Changing a package's configuration requires creating a new instance of nixpkgs with an overlay containing the new config data for the package, or building a new package under another name.

I think it bears mentioning that packages can also be configured with overrides. However, it's my understanding that it's actually an overlay under the hood.

Packages are not tied to NixOS. Usually a package is a file tree ready to be copied into the root filesystem (/usr/bin, /usr/lib, etc.), and an installation script.

Aren't the files hashed and copied to /nix/store/?

@jkuro-tii
Contributor

I think it bears mentioning that packages can also be configured with overrides. However, it's my understanding that it's actually an overlay under the hood.

Yes, overlays use overrides to modify an existing package's configuration. Overlays can also be used to add an additional package to a new <nixpkgs> instance.
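
Both uses fit in a few lines. In this sketch, hello is just a convenient real package to modify; the patch file, the example-tool attribute, and its path are hypothetical.

```nix
# overlay-example.nix: a sketch with hypothetical file paths
final: prev: {
  # Modify an existing package by overriding its derivation attributes.
  hello = prev.hello.overrideAttrs (old: {
    patches = (old.patches or [ ]) ++ [ ./hello-example.patch ];
  });

  # Add a package that the base nixpkgs does not contain.
  example-tool = final.callPackage ./pkgs/example-tool { };
}
```

The overlay takes effect when a nixpkgs instance is created with it, for example import nixpkgs { overlays = [ (import ./overlay-example.nix) ]; }. Function arguments of a package (as opposed to derivation attributes) would be changed with .override instead of .overrideAttrs.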

Packages are not tied to NixOS. Usually a package is a file tree ready to be copied into the root filesystem (/usr/bin, /usr/lib, etc.), and an installation script.

Aren't the files hashed and copied to /nix/store/?

Yes, for storage purposes. What I meant is that the package content is copied or symlinked into the target system's root filesystem during building or installation. Here are the contents of the bash package:

/nix/store$ tree -hp ./dzrvibwj2vjwqmc34wk3x1ffsjpp4av7-bash-4.4-p23
[dr-xr-xr-x 12] ./dzrvibwj2vjwqmc34wk3x1ffsjpp4av7-bash-4.4-p23
├── [dr-xr-xr-x 12] bin
│   ├── [-r-xr-xr-x 899K] bash
│   └── [lrwxrwxrwx 4] sh -> bash
└── [dr-xr-xr-x 8] lib
    └── [dr-xr-xr-x 276] bash
        ├── [-r-xr-xr-x 16K] basename
        ├── [-r-xr-xr-x 16K] dirname
        ├── [-r-xr-xr-x 17K] finfo
        ├── [-r-xr-xr-x 16K] head
        ├── [-r-xr-xr-x 16K] id
        ├── [-r-xr-xr-x 16K] ln
        ├── [-r-xr-xr-x 16K] logname
        ├── [-r-xr-xr-x 16K] mkdir
        ├── [-r-xr-xr-x 16K] mypid
        ├── [-r-xr-xr-x 16K] pathchk
        ├── [-r-xr-xr-x 16K] print
        ├── [-r-xr-xr-x 16K] printenv
        ├── [-r-xr-xr-x 16K] push
        ├── [-r-xr-xr-x 16K] realpath
