Using InfiniBand RDMA to separate storage and compute

Introduction:
RDMA is a protocol used mainly on InfiniBand networks for high-throughput, low-latency remote memory and storage access. It consumes essentially no CPU time and does not require copying data between application memory and kernel memory.
It is commonly used for large-scale network storage access, and it is also a good fit for separating compute resources from storage resources: RDMA provides low latency and high throughput while keeping CPU overhead down.
Below is an introduction to RDMA.
In computing, remote direct memory access (RDMA) is a direct memory access from the memory of one computer into that of another without involving either one's operating system. This permits high-throughput, low-latency networking, which is especially useful in massively parallel computer clusters.

RDMA supports zero-copy networking by enabling the network adapter to transfer data directly to or from application memory, eliminating the need to copy data between application memory and the data buffers in the operating system. Such transfers require no work to be done by CPUs, caches, or context switches, and transfers continue in parallel with other system operations. When an application performs an RDMA Read or Write request, the application data is delivered directly to the network, reducing latency and enabling fast message transfer.

However, this strategy presents several problems related to the fact that the target node is not notified of the completion of the request (1-sided communications).

Much like other high performance computing (HPC) interconnects, RDMA has achieved limited acceptance as of 2013 due to the need to install a different networking infrastructure. However, new standards enable Ethernet RDMA implementation at the physical layer using TCP/IP as the transport, thus combining the performance and latency advantages of RDMA with a low-cost, standards-based solution. The RDMA Consortium and the DAT Collaborative[1] have played key roles in the development of RDMA protocols and APIs for consideration by standards groups such as the Internet Engineering Task Force and the Interconnect Software Consortium.[2]

Hardware vendors have started working on higher-capacity RDMA-based network adapters, with rates of 40Gbit/s reported.[3][4] Software vendors such as Red Hat and Oracle Corporation support these APIs in their latest products, and as of 2013 engineers have started developing network adapters that implement RDMA over Ethernet. Both Red Hat Enterprise Linux and Red Hat Enterprise MRG[5] have support for RDMA. Microsoft supports RDMA in Windows Server 2012 via SMB Direct.

Common RDMA implementations include the Virtual Interface Architecture, RDMA over Converged Ethernet (RoCE),[6] InfiniBand, and iWARP.
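To verify that an RDMA-capable adapter is visible and to get a feel for the throughput and latency figures mentioned above, the ibv_devinfo utility (from libibverbs-utils) lists the RDMA devices, and the perftest tools can run a quick benchmark between two hosts. This is only a sketch; the hostname server-node below is a placeholder.

On the server node:
    # ibv_devinfo
    # ib_write_bw
On the client node:
    # ib_write_bw server-node
    # ib_write_lat server-node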

If your system has an RDMA-capable adapter, you can try using it; for example, NFS over RDMA is supported starting with Red Hat Enterprise Linux 6.
This is a way of accessing remote storage that is transparent to applications.
The configuration is also straightforward.
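Before following the procedure below, you can optionally check that the kernel modules used by NFS over RDMA are present. The module names xprtrdma (client side) and svcrdma (server side) are those used by the mainline Linux NFS/RDMA code, so treat them as an assumption for your particular kernel:
    # modinfo xprtrdma
    # modinfo svcrdma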

9.7.5. NFS over RDMA

To enable the RDMA transport in the Linux kernel NFS server, use the following procedure:

Procedure 9.2. Enable RDMA from server

  1. Ensure the RDMA rpm is installed and the RDMA service is enabled with the following command:
    # yum install rdma; chkconfig --level 2345 rdma on
  2. Ensure the package that provides the nfs-rdma service is installed and the service is enabled with the following command:
    # yum install rdma; chkconfig --level 345 nfs-rdma on
  3. Ensure that the RDMA port is set to the preferred port (the default for Red Hat Enterprise Linux 6 is 2050). To do so, edit the /etc/rdma/rdma.conf file to set NFSoRDMA_LOAD=yes and NFSoRDMA_PORT to the desired port.
  4. Set up the exported filesystem as normal for NFS mounts (see the example configuration after this procedure).
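As a sketch of what the resulting server configuration might look like (the export path and client subnet below are made-up examples), the relevant pieces on a Red Hat Enterprise Linux 6 server would be roughly:

In /etc/rdma/rdma.conf:
    NFSoRDMA_LOAD=yes
    NFSoRDMA_PORT=2050
In /etc/exports:
    /export/data 192.168.1.0/24(rw,sync,no_root_squash)
Then re-export and restart the NFS service:
    # exportfs -r
    # service nfs restart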
On the client side, use the following procedure:

Procedure 9.3. Enable RDMA from client

  1. Ensure the RDMA rpm is installed and the RDMA service is enabled with the following command:
    # yum install rdma; chkconfig --level 2345 rdma on
  2. Mount the NFS exported partition using the RDMA option on the mount call. The port option can optionally be added to the call.
    # mount -t nfs -o rdma,port=port_number
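For example, assuming a server named nfs-server exporting /export/data on the default Red Hat Enterprise Linux 6 port of 2050 (the hostname, export path, and mount point are placeholders):
    # mount -t nfs -o rdma,port=2050 nfs-server:/export/data /mnt/data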
Another option is SDP (Sockets Direct Protocol), which is likewise transparent to applications.
See the sketch below for reference.
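As a rough sketch of the SDP approach (the library path and the application name below are placeholders; libsdp is shipped with OFED, and /etc/libsdp.conf controls which connections are converted), an unmodified TCP application can be redirected onto SDP by preloading the library:
    # LD_PRELOAD=/usr/lib64/libsdp.so ./my_tcp_app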
