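Prerequisites:

1. Make sure ip a lists the IB interfaces. If it does not, first check with lspci that the IB card is detected, then load the IB kernel modules as described in the module-loading step below.

2. Configure IPoIB; nmtui is a quick way to do this.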
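The interface check in step 1 can also be scripted. This is a minimal sketch, not part of the original procedure: it relies on the kernel reporting link type 32 (ARPHRD_INFINIBAND) in /sys/class/net/<iface>/type, and the function name list_ib_ifaces and its optional directory argument are hypothetical helpers added for illustration.

```shell
#!/bin/sh
# Print the name of every network interface whose link type is InfiniBand.
# The kernel exposes the ARPHRD_* link type in /sys/class/net/<iface>/type;
# ARPHRD_INFINIBAND is 32.
# $1 (optional, hypothetical override): directory to scan instead of
# /sys/class/net, which makes the function easy to try on sample data.
list_ib_ifaces() {
    net_dir="${1:-/sys/class/net}"
    for type_file in "$net_dir"/*/type; do
        # Skip the literal pattern when the glob matches nothing.
        [ -r "$type_file" ] || continue
        if [ "$(cat "$type_file")" = "32" ]; then
            basename "$(dirname "$type_file")"
        fi
    done
}
```

Calling list_ib_ifaces with no arguments should print names such as ib0 once IPoIB is up; empty output suggests the modules are not loaded or the card was not detected.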

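NFS over RDMA on fnOS (飞牛OS):

No driver installation is needed; Debian 12 loads the IB kernel modules by default. Load the NFS RDMA transport modules:

modprobe svcrdma
modprobe xprtrdma

These two modules provide NFS over RDMA (server side and client side, respectively).

Use lsmod to check that the modules are loaded, then list them in the file below so they load automatically at boot (on both the client and the server):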

vim /etc/modules-load.d/modules.conf

rdma_cm
svcrdma
xprtrdma
ib_ipoib
rpcrdma
mlx4_ib
iw_cm
ib_uverbs
ib_cm
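The lsmod check can be automated with a small helper. This is a sketch under stated assumptions: missing_modules is a hypothetical function name, and it reads a /proc/modules-style file, where each line starts with the module name followed by a space.

```shell
#!/bin/sh
# Print each requested module that does not appear in a /proc/modules-style
# listing.
# $1: path to the listing (normally /proc/modules)
# remaining arguments: module names to look for
missing_modules() {
    modules_file="$1"
    shift
    for mod in "$@"; do
        # Each line of /proc/modules begins with "<name> ".
        grep -q "^${mod} " "$modules_file" || echo "$mod"
    done
}
```

For example: missing_modules /proc/modules rdma_cm ib_ipoib rpcrdma mlx4_ib iw_cm ib_uverbs ib_cm. On recent kernels svcrdma and xprtrdma are, as far as I know, aliases of rpcrdma, so they may not show up under their own names even when modprobe succeeded; that is why they are left out of the example call.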

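Replace the default APT sources with the Aliyun mirror:

Edit /etc/apt/sources.list. (The entries below use the bullseye codename, which is Debian 11; on a Debian 12 system the codename is bookworm, so adjust it to match the installed release.)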

deb https://mirrors.aliyun.com/debian/ bullseye main non-free contrib
deb-src https://mirrors.aliyun.com/debian/ bullseye main non-free contrib
deb https://mirrors.aliyun.com/debian-security/ bullseye-security main
deb-src https://mirrors.aliyun.com/debian-security/ bullseye-security main
deb https://mirrors.aliyun.com/debian/ bullseye-updates main non-free contrib
deb-src https://mirrors.aliyun.com/debian/ bullseye-updates main non-free contrib
#deb https://mirrors.aliyun.com/debian/ bullseye-backports main non-free contrib
#deb-src https://mirrors.aliyun.com/debian/ bullseye-backports main non-free contrib

Install the subnet manager on the primary node:

apt update

apt install opensm

sudo systemctl start opensm
sudo systemctl enable opensm # start at boot

Once the installation on the primary node is finished, reboot both machines.
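After the reboot, the IB ports should report ACTIVE once the subnet manager has configured the fabric; a port stuck in INIT usually means no subnet manager is reachable. The following sketch reads the port state from sysfs. The function name ib_port_states and its optional directory argument are hypothetical, added only for illustration.

```shell
#!/bin/sh
# Print "<device>/<port> <state>" for every InfiniBand port found in sysfs.
# The state file holds strings such as "4: ACTIVE" or "2: INIT".
# $1 (optional, hypothetical override): directory to scan instead of
# /sys/class/infiniband.
ib_port_states() {
    ib_dir="${1:-/sys/class/infiniband}"
    for state_file in "$ib_dir"/*/ports/*/state; do
        # Skip the literal pattern when the glob matches nothing.
        [ -r "$state_file" ] || continue
        port_dir=$(dirname "$state_file")          # .../<dev>/ports/<port>
        port=$(basename "$port_dir")
        dev=$(basename "$(dirname "$(dirname "$port_dir")")")
        echo "$dev/$port $(cat "$state_file")"
    done
}
```

Running ib_port_states on both machines and seeing a line like mlx4_0/1 4: ACTIVE on each is a reasonable sign that opensm is doing its job.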

Test IB communication (ib_send_bw is provided by the perftest package):

# On the server
ib_send_bw -d mlx4_0

# On the client
ib_send_bw -d mlx4_0 <server_IB_IP_or_GID>

Edit /etc/nfs.conf and add:

[nfsd]
rdma=20049

Restart the NFS server:

sudo systemctl restart nfs-server
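To confirm that the server picked up the rdma=20049 setting, the listener list that nfsd exposes can be inspected. This sketch assumes /proc/fs/nfsd/portlist contains lines of the form "rdma 20049" and "tcp 2049"; the function name nfsd_rdma_port and its optional file argument are hypothetical.

```shell
#!/bin/sh
# Print the port nfsd is listening on for RDMA, or nothing if there is no
# RDMA listener.
# $1 (optional, hypothetical override): file to read instead of
# /proc/fs/nfsd/portlist.
nfsd_rdma_port() {
    portlist="${1:-/proc/fs/nfsd/portlist}"
    # Take the first "rdma <port>" line; duplicates may exist per address family.
    awk '$1 == "rdma" { print $2; exit }' "$portlist"
}
```

On the server, nfsd_rdma_port should print 20049 after the restart. Empty output suggests the RDMA transport module was not loaded when nfs-server started.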

Client

Load the modules on the client:

sudo modprobe rdma_cm
sudo modprobe xprtrdma

Add the mount to /etc/fstab:

10.18.1.2:/fs/1000/nfs  /nfs  nfs  rdma,port=20049,vers=4.2,proto=rdma,timeo=600,retrans=2  0  0

Check the mount status:

nfsstat -m [local mount point]
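As a complement to nfsstat -m, the transport actually in use can be read from the mount options in /proc/mounts. This is a minimal sketch; mount_proto is a hypothetical helper name, and it assumes the fourth field of each /proc/mounts line is the comma-separated option list containing proto=.

```shell
#!/bin/sh
# Print the value of the proto= option for a given mount point.
# $1: mount point, e.g. /nfs
# $2 (optional, hypothetical override): file to read instead of /proc/mounts
mount_proto() {
    mount_point="$1"
    mounts_file="${2:-/proc/mounts}"
    awk -v mp="$mount_point" '$2 == mp {
        # Field 4 holds the comma-separated mount options.
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^proto=/) {
                sub(/^proto=/, "", opts[i])
                print opts[i]
            }
    }' "$mounts_file"
}
```

With the fstab entry above mounted, mount_proto /nfs should print rdma; a value of tcp would mean the mount did not go over RDMA.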