Note: This guide applies only to runtime-rs with its default hypervisor, Dragonball. Support for other hypervisors is still in progress, and this guide will be updated when they are ready.
Currently, there is no widely applicable and convenient way for users to consume certain kinds of backend storage in Kata Containers, such as a host file based block volume, an SPDK based volume, or a VFIO device based volume. The proposal Direct Block Device Assignment is adopted to address this.
According to the proposal, a directly assigned block volume device is added to the Kata Containers runtime with the kata-ctl direct-volume command.
The runtime then uses the method get_volume_mount_info to read the JSON file (mountInfo.json) and parse it into the structure Direct Volume Info, which holds the device-related information.
To describe a device, only the fields of mountInfo.json need to be filled in: device, volume_type, fs_type, metadata, and options. They correspond to the fields of Direct Volume Info.
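The field list above can be turned into a small generator, which avoids hand-escaping quotes on the command line. This is a minimal sketch: make_mount_info is a hypothetical helper (not part of kata-ctl), it leaves metadata and options empty as the examples in this guide do, and it does not escape quotes inside the values.

```shell
# Hypothetical helper (not part of kata-ctl): print a mountInfo JSON document.
# Arguments: $1 = device, $2 = volume_type, $3 = fs_type.
# "metadata" and "options" are left empty; values are not JSON-escaped.
make_mount_info() {
    printf '{"device": "%s", "volume_type": "%s", "fs_type": "%s", "metadata": {}, "options": []}\n' \
        "$1" "$2" "$3"
}

# Example: describe a raw disk image formatted as ext4.
make_mount_info /tmp/stor/rawdisk01.20g directvol ext4
```

The output can be passed as the second argument of kata-ctl direct-volume add, for example "$(make_mount_info /tmp/stor/rawdisk01.20g directvol ext4)".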
The JSON file mountInfo.json is placed in a sub-path, such as /kubelet/kata-test-vol-001/volume001, under the fixed path /run/kata-containers/shared/direct-volumes/.
The full path would look like /run/kata-containers/shared/direct-volumes/kubelet/kata-test-vol-001/volume001, but for security reasons the sub-path is Base64-encoded, so the directory is actually /run/kata-containers/shared/direct-volumes/L2t1YmVsZXQva2F0YS10ZXN0LXZvbC0wMDEvdm9sdW1lMDAx.
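The encoded directory name can be reproduced on the host to locate a volume's mountInfo.json. The sketch below assumes the URL-safe Base64 alphabet with padding kept; encode_volume_path and DIRECT_VOLUMES_ROOT are illustrative names, not part of Kata Containers.

```shell
# Sketch: derive the on-disk directory name for a volume sub-path.
# Assumption: URL-safe Base64 ('+' -> '-', '/' -> '_'), padding kept.
encode_volume_path() {
    # base64 may wrap long output, so strip newlines before translating.
    printf '%s' "$1" | base64 | tr -d '\n' | tr '+/' '-_'
}

DIRECT_VOLUMES_ROOT=/run/kata-containers/shared/direct-volumes
echo "${DIRECT_VOLUMES_ROOT}/$(encode_volume_path /kubelet/kata-test-vol-001/volume001)"
# prints /run/kata-containers/shared/direct-volumes/L2t1YmVsZXQva2F0YS10ZXN0LXZvbC0wMDEvdm9sdW1lMDAx
```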
Finally, when running a Kata container with ctr run --mount type=X,src=Y,dst=Z,options=rbind:rw, type=X must be set to the dedicated type for the kind of volume being mounted.
Supported types:
- directvol for direct volume
- vfiovol for VFIO device based volume
- spdkvol for SPDK/vhost-user based volume
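As a guard against typos in the type field, the --mount value can be assembled by a helper that accepts only the three types listed above. This is a sketch; build_mount_arg is a hypothetical function name, and the options=rbind:rw suffix simply mirrors the examples in this guide.

```shell
# Hypothetical helper: build the value passed to `ctr run --mount`.
# Arguments: $1 = volume type, $2 = src sub-path, $3 = dst path in the container.
build_mount_arg() {
    vol_type=$1
    src=$2
    dst=$3
    case "$vol_type" in
        directvol|vfiovol|spdkvol) ;;   # the types listed in this guide
        *)
            echo "unsupported volume type: $vol_type" >&2
            return 1
            ;;
    esac
    printf 'type=%s,src=%s,dst=%s,options=rbind:rw\n' "$vol_type" "$src" "$dst"
}

build_mount_arg directvol /kubelet/kata-direct-vol-002/directvol002 /disk002
```

The result is meant to be used as --mount "$(build_mount_arg ...)" on the ctr run command line.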
Tip: raw block based backend storage MUST be formatted with mkfs.
$ sudo dd if=/dev/zero of=/tmp/stor/rawdisk01.20g bs=1M count=20480
$ sudo mkfs.ext4 /tmp/stor/rawdisk01.20g
{
  "device": "/tmp/stor/rawdisk01.20g",
  "volume_type": "directvol",
  "fs_type": "ext4",
  "metadata": {},
  "options": []
}
$ sudo kata-ctl direct-volume add /kubelet/kata-direct-vol-002/directvol002 "{\"device\": \"/tmp/stor/rawdisk01.20g\", \"volume_type\": \"directvol\", \"fs_type\": \"ext4\", \"metadata\": {}, \"options\": []}"
$ # /kubelet/kata-direct-vol-002/directvol002 <==> /run/kata-containers/shared/direct-volumes/W1lMa2F0ZXQva2F0YS10a2F0DAxvbC0wMDEvdm9sdW1lMDAx
$ cat W1lMa2F0ZXQva2F0YS10a2F0DAxvbC0wMDEvdm9sdW1lMDAx/mountInfo.json
{"volume_type":"directvol","device":"/tmp/stor/rawdisk01.20g","fs_type":"ext4","metadata":{},"options":[]}
$ # type=directvol,src=/kubelet/kata-direct-vol-002/directvol002,dst=/disk002,options=rbind:rw
$ sudo ctr run -t --rm --runtime io.containerd.kata.v2 --mount type=directvol,src=/kubelet/kata-direct-vol-002/directvol002,dst=/disk002,options=rbind:rw "$image" kata-direct-vol-xx05302045 /bin/bash
Tip: Only vfio-pci based PCI device passthrough mode is supported.
In this scenario, the device's host kernel driver is replaced by vfio-pci, and an IOMMU group ID is generated.
Either the device's BDF or its VFIO IOMMU group device under /dev/vfio/ can be used as "device" in mountInfo.json.
$ lspci -nn -k -s 45:00.1
45:00.1 SCSI storage controller
...
Kernel driver in use: vfio-pci
...
$ ls /dev/vfio/110
/dev/vfio/110
$ ls /sys/kernel/iommu_groups/110/devices/
0000:45:00.1
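The relationship between a BDF and its group shown above can also be looked up in the other direction through sysfs. The sketch below reads the iommu_group symlink of a PCI device; iommu_group_of is a hypothetical helper, and SYSFS_ROOT is an assumed override that exists only so the function can be exercised against a fake tree.

```shell
# Sketch: print the IOMMU group number for a PCI device given its full BDF
# (DDDD:BB:DD.F). SYSFS_ROOT is a hypothetical override; it defaults to /sys.
iommu_group_of() {
    link="${SYSFS_ROOT:-/sys}/bus/pci/devices/$1/iommu_group"
    if [ ! -L "$link" ]; then
        echo "no IOMMU group found for $1" >&2
        return 1
    fi
    # The symlink target ends in .../iommu_groups/<group number>.
    basename "$(readlink "$link")"
}

# Usage on a host with the device bound to vfio-pci:
#   iommu_group_of 0000:45:00.1
```

The printed number corresponds to the device node /dev/vfio/<number>.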
First, configure mountInfo.json in one of the following ways:
- (1) device with BB:DD.F
{
  "device": "45:00.1",
  "volume_type": "vfiovol",
  "fs_type": "ext4",
  "metadata": {},
  "options": []
}
- (2) device with DDDD:BB:DD.F
{
  "device": "0000:45:00.1",
  "volume_type": "vfiovol",
  "fs_type": "ext4",
  "metadata": {},
  "options": []
}
- (3) device with /dev/vfio/X
{
  "device": "/dev/vfio/110",
  "volume_type": "vfiovol",
  "fs_type": "ext4",
  "metadata": {},
  "options": []
}
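Before writing the config, it can help to check which of the three accepted forms a device string matches. The following is only an illustrative sketch: classify_vfio_device and the labels it prints are hypothetical, and the patterns reflect the three examples above rather than the runtime's actual parsing rules.

```shell
# Sketch: classify the "device" value of a vfiovol entry.
# Prints bdf-short (BB:DD.F), bdf-full (DDDD:BB:DD.F) or vfio-group (/dev/vfio/X).
classify_vfio_device() {
    if printf '%s\n' "$1" | grep -Eq '^/dev/vfio/[0-9]+$'; then
        echo vfio-group
    elif printf '%s\n' "$1" | grep -Eq '^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$'; then
        echo bdf-full
    elif printf '%s\n' "$1" | grep -Eq '^[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$'; then
        echo bdf-short
    else
        echo "unrecognized device: $1" >&2
        return 1
    fi
}

classify_vfio_device 45:00.1         # prints bdf-short
classify_vfio_device 0000:45:00.1    # prints bdf-full
classify_vfio_device /dev/vfio/110   # prints vfio-group
```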
Second, run a Kata container, using the device /dev/vfio/110 as an example:
$ sudo kata-ctl direct-volume add /kubelet/kata-vfio-vol-003/vfiovol003 "{\"device\": \"/dev/vfio/110\", \"volume_type\": \"vfiovol\", \"fs_type\": \"ext4\", \"metadata\": {}, \"options\": []}"
$ # /kubelet/kata-vfio-vol-003/vfiovol003 <==> /run/kata-containers/shared/direct-volumes/F0va22F0ZvaS12F0YS10a2F0DAxvbC0F0ZXvdm9sdF0Z0YSx
$ cat F0va22F0ZvaS12F0YS10a2F0DAxvbC0F0ZXvdm9sdF0Z0YSx/mountInfo.json
{"volume_type":"vfiovol","device":"/dev/vfio/110","fs_type":"ext4","metadata":{},"options":[]}
$ # type=vfiovol,src=/kubelet/kata-vfio-vol-003/vfiovol003,dst=/disk003,options=rbind:rw
$ sudo ctr run -t --rm --runtime io.containerd.kata.v2 --mount type=vfiovol,src=/kubelet/kata-vfio-vol-003/vfiovol003,dst=/disk003,options=rbind:rw "$image" kata-vfio-vol-xx05302245 /bin/bash
Unlike runtime (the Go version), runtime-rs no longer requires creating a device node under /dev/ with mknod for SPDK vhost-user devices.
Using kata-ctl direct-volume add to create the mount info config is enough.
As an example, run an SPDK vhost target and get a vhost-user block controller.
First, run the SPDK vhost target:
Tip: If the vfio-pci driver is supported, you can run the SPDK setup script with DRIVER_OVERRIDE=vfio-pci. Otherwise, just run it without that variable: sudo HUGEMEM=4096 ./scripts/setup.sh.
$ SPDK_DEVEL=/xx/spdk
$ VHU_UDS_PATH=/tmp/vhu-targets
$ RAW_DISKS=/xx/rawdisks
$ # Reset first
$ sudo ${SPDK_DEVEL}/scripts/setup.sh reset
$ sudo sysctl -w vm.nr_hugepages=2048
$ # 4G of huge memory for SPDK
$ sudo HUGEMEM=4096 DRIVER_OVERRIDE=vfio-pci ${SPDK_DEVEL}/scripts/setup.sh
$ sudo ${SPDK_DEVEL}/build/bin/spdk_tgt -S $VHU_UDS_PATH -s 1024 -m 0x3 &
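The values vm.nr_hugepages=2048 and HUGEMEM=4096 in the commands above are consistent with each other when the host uses 2048 kB hugepages, which is an assumption about the host rather than something this guide states. The sketch below shows the arithmetic; hugepages_needed is a hypothetical helper.

```shell
# Sketch: number of hugepages needed to back a given amount of memory.
# Arguments: $1 = memory in MiB, $2 = hugepage size in kB.
hugepages_needed() {
    # Convert MiB to kB, then round up to a whole number of pages.
    echo $(( ($1 * 1024 + $2 - 1) / $2 ))
}

# 4096 MiB of huge memory with 2048 kB pages.
hugepages_needed 4096 2048   # prints 2048
```

On Linux, the actual hugepage size can be read from the Hugepagesize line of /proc/meminfo and passed as the second argument.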
Second, create a vhost controller:
$ sudo dd if=/dev/zero of=${RAW_DISKS}/rawdisk01.20g bs=1M count=20480
$ sudo ${SPDK_DEVEL}/scripts/rpc.py bdev_aio_create ${RAW_DISKS}/rawdisk01.20g vhu-rawdisk01.20g 512
$ sudo ${SPDK_DEVEL}/scripts/rpc.py vhost_create_blk_controller vhost-blk-rawdisk01.sock vhu-rawdisk01.20g
Here, a vhost controller vhost-blk-rawdisk01.sock is created. The controller will be passed to the hypervisor, such as Dragonball, Cloud-Hypervisor, Firecracker, or QEMU.
First, create the sub-path kubelet/kata-test-vol-001/ under /run/kata-containers/shared/direct-volumes/ with mkdir.
Second, fill in the fields of mountInfo.json as shown below:
{
  "device": "/tmp/vhu-targets/vhost-blk-rawdisk01.sock",
  "volume_type": "spdkvol",
  "fs_type": "ext4",
  "metadata": {},
  "options": []
}
Third, use kata-ctl direct-volume to add the block device, which generates mountInfo.json, and then run a Kata container with --mount.
$ # kata-ctl direct-volume add
$ sudo kata-ctl direct-volume add /kubelet/kata-test-vol-001/volume001 "{\"device\": \"/tmp/vhu-targets/vhost-blk-rawdisk01.sock\", \"volume_type\": \"spdkvol\", \"fs_type\": \"ext4\", \"metadata\": {}, \"options\": []}"
$ # /kubelet/kata-test-vol-001/volume001 <==> /run/kata-containers/shared/direct-volumes/L2t1YmVsZXQva2F0YS10ZXN0LXZvbC0wMDEvdm9sdW1lMDAx
$ cat L2t1YmVsZXQva2F0YS10ZXN0LXZvbC0wMDEvdm9sdW1lMDAx/mountInfo.json
{"volume_type":"spdkvol","device":"/tmp/vhu-targets/vhost-blk-rawdisk01.sock","fs_type":"ext4","metadata":{},"options":[]}
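A quick sanity check of the generated file is to read back a single field. The sketch below does this with sed so that it does not depend on jq; mount_info_field is a hypothetical helper, and it assumes the compact single-line layout shown above with string values that contain no escaped quotes.

```shell
# Sketch: print a top-level string field from a compact mountInfo.json.
# Arguments: $1 = path to the JSON file, $2 = field name.
# Assumes one-line JSON and values without escaped double quotes.
mount_info_field() {
    sed -n 's/.*"'"$2"'":"\([^"]*\)".*/\1/p' "$1"
}

# Usage, from inside the encoded volume directory:
#   mount_info_field mountInfo.json volume_type
#   mount_info_field mountInfo.json device
```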
As /run/kata-containers/shared/direct-volumes/ is a fixed path, a Kata pod can be run with --mount by setting src to the sub-path. The --mount argument looks like: --mount type=spdkvol,src=/kubelet/kata-test-vol-001/volume001,dst=/disk001.
In ctr run --mount type=X,src=source,dst=dest, X is set to spdkvol, the dedicated type for SPDK volumes.
$ # ctr run with --mount type=spdkvol,src=/kubelet/kata-test-vol-001/volume001,dst=/disk001
$ sudo ctr run -t --rm --runtime io.containerd.kata.v2 --mount type=spdkvol,src=/kubelet/kata-test-vol-001/volume001,dst=/disk001,options=rbind:rw "$image" kata-spdk-vol-xx0530 /bin/bash