
NVMe MR-IOV – High-Performance Storage Solution for Virtual Environment Deployments.


Last time we discussed some benefits of multi-host NVMe SR-IOV, also called multi-root SR-IOV (MR-IOV). The solution aims to improve SSD performance in virtual environments while keeping storage resources highly utilized and flexible. In this blog, let's take a closer look at the performance aspect and see how exactly our proposed NVMe MR-IOV delivers high performance in virtual environments.


Interface:

The most common interface for disks today is probably SATA, but NVMe SSDs that use the PCIe interface have emerged and are also quite popular. The most obvious difference between the two interfaces is throughput. PCIe 4.0 reaches about 2 GB/s per lane, and some enterprise SSDs have a PCIe x8 interface, which gives about 16 GB/s of bandwidth, whereas SATA 3.0 tops out at 6 Gb/s (roughly 0.6 GB/s of usable bandwidth). Another key difference is latency. NVMe drives that transfer data directly over PCIe have a latency of just a few microseconds, whereas SATA SSDs have a latency in the 30 to 100 microsecond range. Clearly, for those who are looking for high performance, PCIe NVMe SSDs are the better choice.
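The bandwidth figures above can be checked with a little arithmetic. The following is a minimal sketch, assuming the commonly published signaling rates and line encodings (16 GT/s with 128b/130b for PCIe 4.0, 6 Gb/s with 8b/10b for SATA 3.0); the function name is ours and purely illustrative.

```python
def usable_gbytes_per_s(raw_gbits_per_s, payload_bits, total_bits, lanes=1):
    """Return usable link bandwidth in GB/s after line-encoding overhead.

    raw_gbits_per_s: raw signaling rate per lane (Gb/s or GT/s)
    payload_bits / total_bits: encoding efficiency, e.g. 128/130 or 8/10
    lanes: number of lanes (SATA is a single link)
    """
    return raw_gbits_per_s * payload_bits / total_bits / 8 * lanes


# PCIe 4.0: 16 GT/s per lane, 128b/130b encoding (assumed from the public spec summary)
pcie4_x1 = usable_gbytes_per_s(16, 128, 130)
pcie4_x8 = usable_gbytes_per_s(16, 128, 130, lanes=8)

# SATA 3.0: 6 Gb/s, 8b/10b encoding
sata3 = usable_gbytes_per_s(6, 8, 10)

print(f"PCIe 4.0 x1: {pcie4_x1:.2f} GB/s")
print(f"PCIe 4.0 x8: {pcie4_x8:.2f} GB/s")
print(f"SATA 3.0:    {sata3:.2f} GB/s")
print(f"x8 / SATA:   {pcie4_x8 / sata3:.1f}x")
```

Under these assumptions a PCIe 4.0 lane carries just under 2 GB/s, an x8 link just under 16 GB/s, and SATA 3.0 about 0.6 GB/s. These are link-level ceilings; real drives deliver somewhat less once protocol overhead is included.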


Protocol:

NVM Express, often referred to as NVMe, is a protocol standard that lets a host communicate with a non-volatile memory device directly over PCIe, so it is inherently parallel and high performing. As PCIe advances, the performance that NVMe drives can deliver will improve further. The other common protocol is AHCI, which was originally designed for HDDs and was carried over to SATA SSDs. Table 1 compares these two standards.


Table 1.


Retrieved from Phison Blog. https://phisonblog.com/ahci-vs-nvme-the-future-of-ssds/

The MR-IOV solution that we propose adopts PCIe switch technology, so even though the SSDs are disaggregated from the hosts, data travels over PCIe all the way from the host CPU to the SSDs. This approach avoids the overhead of extra data encoding and decoding, and it should meet performance needs at the bare-metal level. Next, we will discuss how our solution secures performance on virtual machines.


SR-IOV:

There are different virtualization approaches for SSDs. Software approaches emulate storage disks so that virtual machines can access SSDs, but they often add appreciable software overhead that keeps the drives from reaching their best performance. In contrast, SR-IOV is more of a hardware approach: it is implemented by the NVMe controller built into the drive, which eliminates much of that software overhead. The following diagram illustrates how virtual machines can interact with NVMe storage devices.



In a classical virtual environment, when a virtual machine (guest OS) communicates with a disk, for example to retrieve data from it, the request must go through the hypervisor. The hypervisor handles all the interrupts going back and forth, and it maps the guest OS to the physical device so that data travels the right route and lands in the right place. As a result, it consumes a lot of CPU resources and performance drops.

When SR-IOV is enabled, users can create multiple virtual functions (VFs) that are associated with one physical function (PF). You can think of it as creating multiple clones of one SSD, so that the physical SSD can be accessed by different machines at the same time, sharing the bandwidth of the SSD. The SR-IOV specification is developed and maintained by PCI-SIG; you can visit their site for more details.
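As an illustration of the PF/VF relationship, here is a minimal sketch of how a Linux host can inspect a PF's SR-IOV state. It assumes the generic Linux PCI sysfs attributes `sriov_totalvfs` and `sriov_numvfs`; the PCI address is hypothetical, and this is the ordinary single-host view, not the management interface of our MR-IOV chassis.

```python
from pathlib import Path


def sriov_status(pf_dir):
    """Read SR-IOV VF counts for a physical function.

    pf_dir: a sysfs-style device directory containing the
    'sriov_totalvfs' (maximum VFs the PF supports) and
    'sriov_numvfs' (VFs currently enabled) attributes.
    """
    pf_dir = Path(pf_dir)
    total = int((pf_dir / "sriov_totalvfs").read_text().strip())
    enabled = int((pf_dir / "sriov_numvfs").read_text().strip())
    return {
        "total_vfs": total,
        "enabled_vfs": enabled,
        "available_vfs": total - enabled,
    }


if __name__ == "__main__":
    # Hypothetical PCI address of an NVMe PF; replace with a real one.
    pf = Path("/sys/bus/pci/devices/0000:3b:00.0")
    if (pf / "sriov_totalvfs").exists():
        print(sriov_status(pf))
    else:
        print("No SR-IOV capable device found at", pf)
```

On a host-managed setup, VFs are typically enabled by writing a count to `sriov_numvfs` as root; an NVMe drive may additionally need queue and interrupt resources assigned to each VF before the VF is usable.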

These SSD virtual functions are the key to improved performance in VMs. Unlike the physical function, which contains the full set of PCIe functions, a virtual function is a lightweight PCIe device whose only purpose is to move data, so users cannot do much configuration from a VF. A VF contains only the resources given to it by the PF.

When a VF is attached to a virtual machine, the virtual machine recognizes it as an actual, directly attached PCIe device. Because the PF has already mapped certain resources to the VF, the virtual machine can access these resources through the direct I/O path that the VF provides. In addition, the interrupts that were previously handled by the hypervisor are now passed to the individual virtual machines. In other words, data goes directly to the virtual machine without having to pass through the hypervisor or any other layer that would otherwise increase wait time.


Falcon NVMe MR-IOV:

Our proposed MR-IOV solution extends the application of SR-IOV. It allows a physical SSD to be shared not only among VMs on one host, but among VMs on different host machines, by enabling SR-IOV independently of the host machines. The following diagram illustrates the architecture of our solution.



We also ran a quick fio test on a virtual machine using the Falcon NVMe chassis and Samsung PM1735 NVMe SSDs. In the test, we assigned a VF with 3 TB of capacity to the host, then passed it through to the virtual machine (Ubuntu 20.04). Table 2 shows the environment used and the fio bandwidth result.


Table 2.

The test showed bandwidth consistently above 8 GB/s. This number matches the official spec of the Samsung PM1735 NVMe SSD, which suggests that, with our proposed architecture, performance is not affected by the SSD being external to the host: the VF performs as if it were a real NVMe device directly attached to the VM.
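For readers who want to run a comparable measurement, the sketch below builds a sequential-read fio job and converts fio's JSON result to GB/s. The job parameters and the device path `/dev/nvme0n1` are illustrative assumptions, not the settings used in our test (those appear in the screenshot below); the helper names are ours.

```python
import json
import subprocess


def build_fio_cmd(device):
    """Build an fio command line for a sequential-read bandwidth test.

    All job parameters here are illustrative assumptions.
    """
    return [
        "fio",
        "--name=seqread",
        f"--filename={device}",
        "--rw=read",             # sequential read
        "--bs=128k",
        "--iodepth=32",
        "--numjobs=4",
        "--ioengine=libaio",
        "--direct=1",            # bypass the page cache
        "--runtime=60",
        "--time_based",
        "--group_reporting",
        "--output-format=json",
    ]


def read_bandwidth_gbytes(fio_json):
    """Sum read bandwidth across reported jobs and return it in GB/s.

    fio reports the 'bw' field in KiB/s.
    """
    kib_per_s = sum(job["read"]["bw"] for job in fio_json["jobs"])
    return kib_per_s * 1024 / 1e9


def run_fio(device):
    """Run fio against the device and return read bandwidth in GB/s."""
    result = subprocess.run(
        build_fio_cmd(device), capture_output=True, text=True, check=True
    )
    return read_bandwidth_gbytes(json.loads(result.stdout))


# Example (needs fio installed and suitable privileges):
# print(f"{run_fio('/dev/nvme0n1'):.2f} GB/s")
```

Running the same job inside the VM against the passed-through VF and on bare metal against a directly attached drive gives a like-for-like comparison.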

The following images were captured from the test screen.

Guest OS


Samsung PM1735 virtual function


Fio test. Parameters and result.








