Google, IBM, and others team up to hasten data transfers in computers
Two newly formed consortia propose specifications to bring unprecedented boosts to data transfers inside and outside of computers
Computational workloads are growing, and processors, memory and storage are getting faster at a blazing pace. But emerging technologies could leave computers choking for bandwidth.
The potential chokepoint worries companies like Google, IBM, Samsung and Dell, which are moving to remedy the problem. New specifications from two new consortia will bring unprecedented boosts in data transfer speeds to computers as early as next year.
OpenCAPI Consortium’s connector specification will bring significant bandwidth improvements inside computers. OpenCAPI, announced Friday, will link storage, memory, GPUs and CPUs, much like PCI-Express 3.0, but will be 10 times faster with data speeds of 150GBps (gigabytes per second).
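The "10 times faster" comparison can be sanity-checked with rough arithmetic. The sketch below assumes the comparison is against a 16-lane PCI-Express 3.0 link (8GT/s per lane with 128b/130b encoding) measured in one direction. The article does not say which link width or direction the 150GBps figure refers to, so the result is an estimate only.

```python
# Back-of-the-envelope check of the OpenCAPI vs. PCI-Express 3.0 comparison.
# Assumptions (not from the article): a x16 link, one direction only.

PCIE3_GT_PER_S = 8.0          # PCIe 3.0 signaling rate per lane, in GT/s
PCIE3_ENCODING = 128 / 130    # 128b/130b line encoding efficiency
OPENCAPI_GBPS = 150.0         # bandwidth figure quoted in the article


def pcie3_bandwidth_gbps(lanes: int) -> float:
    """Approximate usable GB/s in one direction for a PCIe 3.0 link."""
    gigabits_per_second = PCIE3_GT_PER_S * PCIE3_ENCODING * lanes
    return gigabits_per_second / 8  # bits to bytes


x16 = pcie3_bandwidth_gbps(16)
ratio = OPENCAPI_GBPS / x16
print(f"PCIe 3.0 x16: {x16:.2f} GB/s, OpenCAPI/PCIe ratio: {ratio:.1f}x")
```

Under those assumptions the ratio comes out a little under 10, which is consistent with the rounded claim.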
Memory, storage and GPUs will keep getting faster, and OpenCAPI will keep computers ready for those technologies, Brad McCredie, an IBM fellow, said in an interview.
Graphics processors are now handling demanding applications like virtual reality, artificial intelligence, and complex scientific calculations. Also in the wings are superfast technologies like 3D Xpoint, a new type of storage and memory technology that can be 10 times faster than SSDs and 10 times denser than DRAM.
Servers and supercomputers will be the first to get OpenCAPI slots. The technology could trickle down to PCs in the coming years.
The first OpenCAPI ports will be on IBM’s Power9 servers, which are due next year. Google and Rackspace are also putting the OpenCAPI port on their Zaius Power9 server.
AMD, a member of OpenCAPI Consortium, is making its Radeon GPUs compatible with OpenCAPI ports on Power9 servers.
But don’t expect OpenCAPI immediately in mainstream PCs or servers, most of which run on x86 chips from Intel and AMD. For now, AMD isn’t targeting OpenCAPI at desktops and won’t be putting the ports in x86 servers, a spokesman said.
Top chipmaker Intel isn’t a member of OpenCAPI, which is a big disadvantage for the group. There are no major issues that should stop Intel from becoming a member, though it would have to make changes to its I/O technologies.
OpenCAPI is promising, but computers will need many changes to take advantage of it. Motherboards will need dedicated OpenCAPI slots, and components will need to fit in those slots. That could add to the cost of making components, most of which are made for PCI-Express.
OpenCAPI is an offshoot of the CAPI port developed by IBM, which is already used in its Power servers. In the future, there may be bridge products that let components made for PCI-Express plug into the OpenCAPI slot, McCredie said.
A second consortium, called Gen-Z, announced a new protocol focused on increasing data transfer speeds mostly between computers, but also inside of them when needed. The protocol, announced earlier this week, will initially be targeted at servers but could bring fundamental changes to the way computers are built.
The consortium boasts big names including Samsung, Dell, Hewlett Packard Enterprise, AMD, ARM and Micron.
Right now, computers come with memory, storage and processors in one box. But the specification from Gen-Z — which is focused heavily on memory and storage — could potentially decouple all of those units into separate boxes, establishing a peer-to-peer connection between all of them.
Gen-Z is also focused on making it easier to add new types of nonvolatile memory like 3D Xpoint, which can be used as memory, storage or both. Many new types of memory technologies under research are also seen as DRAM and SSD replacements.
Larger pools of storage, memory and processing technologies can be crammed in the dedicated boxes, and Gen-Z could be particularly useful for server installations. Gen-Z is designed to link large pools of memory and storage with processors like CPUs and GPUs in a data center, said Robert Hormuth, vice president and server chief technology officer at Dell EMC.
Having memory, storage and processing in discrete boxes will be beneficial for applications like the SAP HANA relational database, which is dedicated to in-memory processing. Most servers max out at 48TB of DRAM, but a decoupled memory unit will give SAP HANA more RAM to operate.
But there are challenges. The decoupled units need to handshake in real time and work together on protocol support and load balancing. Those functions have been perfected in today’s servers with integrated memory and storage.
To achieve that real-time goal, Gen-Z has developed a high-performance fabric that “provides a peer to peer interconnect that easily accesses large volumes of data while lowering costs and avoiding today’s bottlenecks,” according to the consortium. The data transfer rate can scale to 112GT/s (gigatransfers per second) between servers. For comparison, the upcoming PCI-Express 4.0 will have a transfer rate of 16GT/s per lane inside computers, even though data transfers inside computers are usually faster than those between them.
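To put the two transfer rates side by side, here is a small sketch. It assumes the 112GT/s Gen-Z figure and the 16GT/s PCI-Express 4.0 figure can be compared directly as raw signaling rates, which the article does not confirm. It also converts the PCI-Express 4.0 rate to bytes using that standard's 128b/130b encoding. No such conversion is attempted for Gen-Z, since its encoding overhead is not given here.

```python
# Rough comparison of the signaling rates quoted in the article.
# Assumption: both figures are raw transfer rates that can be compared directly.

GENZ_PEAK_GT_PER_S = 112.0        # Gen-Z peak rate quoted in the article
PCIE4_GT_PER_S_PER_LANE = 16.0    # PCIe 4.0 per-lane rate quoted in the article
PCIE4_ENCODING = 128 / 130        # PCIe 4.0 uses 128b/130b line encoding

# Ratio of the raw signaling rates.
signaling_ratio = GENZ_PEAK_GT_PER_S / PCIE4_GT_PER_S_PER_LANE

# Approximate usable bandwidth of one PCIe 4.0 lane, one direction, in GB/s.
pcie4_lane_gbps = PCIE4_GT_PER_S_PER_LANE * PCIE4_ENCODING / 8

print(f"Signaling ratio: {signaling_ratio:.1f}x")
print(f"PCIe 4.0 per lane: {pcie4_lane_gbps:.2f} GB/s")
```

The ratio only describes signaling speed. Actual throughput would depend on lane counts and protocol overhead, neither of which is specified in the article.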
Gen-Z is generally a point-to-point connector for storage and memory at the rack level, but it can be used inside server racks. Gen-Z is not intended to replace existing memory or storage buses in servers, Hormuth said.
OpenCAPI and Gen-Z claim their protocols are open for every hardware maker to adopt. However, there will be challenges in pushing these interconnects to servers.
For one, the server market is dominated by x86 chips from Intel, which isn’t a member of either of the new consortia. Without support from Intel, the new protocols and interconnects could struggle.
Intel sells its own networking and fabric technology called OmniPath, and also sells silicon photonics modules, which use light and lasers to speed up data transfers and connect servers at the rack level.