add cpu pin docs #227

Draft: wants to merge 1 commit into base: master

gxglls (Contributor) commented Dec 25, 2023

No description provided.

gxglls marked this pull request as draft December 25, 2023 04:02
gxglls force-pushed the cpu-pin branch 2 times, most recently from 895eafc to 1e7ef33, December 25, 2023 08:48
docs/deploy-iomesh-cluster/performance-tips.md (outdated)

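IOMesh is a performance-sensitive application. By default, IOMesh Pods are not pinned to any CPU cores at runtime. IOMesh still runs correctly in this mode, but frequent CPU switching and CPU cache misses cause large performance fluctuations and a lower performance ceiling.

To address this, IOMesh provides two CPU core pinning modes, "based on Kubelet CpuManager" and "based on kernel parameter", both of which give IOMesh exclusive use of the pinned CPUs.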

It is not yet clear when to choose the first option and when to choose the second.

Contributor Author

I'll flesh out the principles and background in a follow-up.

docs/deploy-iomesh-cluster/performance-tips.md (outdated)
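Run the following commands on the k8s worker where the chunk pod is running to verify that the meta service is bound to 1 CPU. The output below indicates that the chunk service has been successfully bound to CPUs 0, 1, and 2. If the binding has not taken effect, the value of cpuset.cpus is empty or the file does not exist.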

```shell
cat /sys/fs/cgroup/cpuset/zbs/chunk-io/cpuset.cpus
```

Are the three commands below run on the worker hosting a single chunk node? Do these three commands return 3 different CPUs?

Contributor Author

Yes: "run on the k8s worker where the chunk pod is running".

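Suppose there are 3 chunk nodes. My original understanding was that you would need to run this once on each chunk node to see which CPU each one uses. So do the 3 CPUs seen when running it on one node represent the exclusive CPUs of those 3 nodes? And if there are 5 chunk nodes, do you still check these 3 files? Shouldn't you see 5 CPUs?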

Contributor Author

With 5 chunk nodes, every node shows the same result: the 3 CPUs from these three files.
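The check discussed in this thread can be sketched as a small shell helper. This is only an illustration: `check_cpuset` is a hypothetical name, not part of IOMesh, and the only path taken from the PR is `/sys/fs/cgroup/cpuset/zbs/chunk-io/cpuset.cpus`. It treats a missing file, an empty file, or a whitespace-only value as "not bound", matching the rule quoted from the doc.

```shell
# Hypothetical helper (not part of IOMesh): report whether a cpuset.cpus file
# shows a CPU binding. A missing, empty, or whitespace-only file means unbound.
check_cpuset() {
    f="$1"
    if [ -s "$f" ] && [ -n "$(tr -d '[:space:]' < "$f")" ]; then
        echo "bound to CPUs: $(cat "$f")"
    else
        echo "not bound"
        return 1
    fi
}

# Path taken from the PR; "|| true" keeps the script going if the binding is absent.
check_cpuset /sys/fs/cgroup/cpuset/zbs/chunk-io/cpuset.cpus || true
```

Per the author's reply, the same files would be checked on each worker that runs a chunk pod.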


## CPU core pinning configuration based on kernel parameter

### Configure kernel CPU isolation

This part should also warn users about NUMA: the CPUs must not span NUMA nodes, otherwise the configuration will fail.

Contributor Author

OK, I'll add that later.


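Add the `isolcpus` kernel boot parameter to `GRUB_CMDLINE_LINUX_DEFAULT` in the grub configuration file. Its value identifies the CPUs to isolate; for the format, see https://man7.org/linux/man-pages/man7/cpuset.7.html#FORMATS. The configuration file path is /etc/default/grub.

By default, IOMesh needs 4 exclusive CPU cores (the chunk service takes 3 CPUs and the meta service takes 1 CPU). In the following example, we assume the system has 16 CPU cores in total and that cores 0, 1, 2, and 10 are dedicated to IOMesh. Append the following configuration to GRUB_CMDLINE_LINUX_DEFAULT.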
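As a rough sketch of the edit described above, the snippet below appends `isolcpus=0,1,2,10` to `GRUB_CMDLINE_LINUX_DEFAULT`. The `append_isolcpus` helper is hypothetical (it is not part of IOMesh). It assumes GNU `sed` and a double-quoted value on a single line, and it is exercised on a scratch file rather than the live /etc/default/grub.

```shell
# Hypothetical helper: append "isolcpus=<list>" to GRUB_CMDLINE_LINUX_DEFAULT.
# Assumes GNU sed and that the value is double-quoted on a single line.
append_isolcpus() {
    grub_file="$1"   # e.g. /etc/default/grub
    cpu_list="$2"    # e.g. 0,1,2,10
    # Insert " isolcpus=<list>" just before the closing quote of the value.
    sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 isolcpus=${cpu_list}\"/" "$grub_file"
}

# Try it on a scratch copy first instead of the real configuration file.
tmp=$(mktemp)
printf 'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n' > "$tmp"
append_isolcpus "$tmp" "0,1,2,10"
cat "$tmp"
rm -f "$tmp"
```

After changing the real file, the grub configuration still has to be regenerated with the distribution's own tool (for example `update-grub` on Debian/Ubuntu or `grub2-mkconfig` on RHEL-family systems) and the node rebooted. Once it is back up, `cat /sys/devices/system/cpu/isolated` should list the isolated CPUs.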

Provide the steps for making this change.


> _NOTE:_
> The exact steps may vary depending on the Linux distribution in use.

This also needs an added sentence:
Perform the following steps on every K8s Worker node in turn:
