- Microsoft-backed startup introduces GPU-free alternatives for generative AI
- DIMC architecture delivers an ultra-high memory bandwidth of 150 TB/s
- Corsair supports transformers, agentic AI, and interactive video generation
d-Matrix Inc., a hardware startup based in Santa Clara, California, has launched its first AI processor, Corsair, which is aimed at improving AI inference.
Backed by Microsoft and leveraging cutting-edge technology, Corsair eschews traditional GPUs and expensive high-bandwidth memory (HBM), delivering significant performance and cost benefits.
Corsair is currently available to early-access customers, with broader availability planned for the second quarter of 2025.
Corsair’s performance redefines AI inference
The Corsair processor is purpose-built to handle demanding AI inference tasks, particularly for generative AI models. For example, it achieves 60,000 tokens per second at 1 ms per token when running Llama3 8B in a single server.
In more resource-intensive scenarios, such as with Llama3 70B models, Corsair delivers 30,000 tokens per second at 2 ms per token in a single rack, translating into substantial savings in energy and operational costs compared to traditional GPU-based solutions.
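To put those two figures side by side, here is a back-of-the-envelope sketch. It assumes, which the announcement does not state, that the "ms per token" number is the latency seen by each individual stream and that the tokens-per-second number is the aggregate across all streams. The function name is hypothetical.

```python
def implied_concurrent_streams(aggregate_tokens_per_s: float, ms_per_token: float) -> float:
    """Number of parallel streams implied by an aggregate rate and a per-stream latency.

    Assumption (not from the announcement): each stream emits one token every
    `ms_per_token` milliseconds, and the aggregate rate is the sum over streams.
    """
    per_stream_tokens_per_s = 1000.0 / ms_per_token
    return aggregate_tokens_per_s / per_stream_tokens_per_s


# Figures quoted in the article.
llama3_8b = implied_concurrent_streams(60_000, 1.0)   # single server
llama3_70b = implied_concurrent_streams(30_000, 2.0)  # single rack

print(f"Llama3 8B:  ~{llama3_8b:.0f} concurrent streams")
print(f"Llama3 70B: ~{llama3_70b:.0f} concurrent streams")
```

Under that reading, both configurations work out to roughly the same number of simultaneous streams, with the larger model needing a full rack rather than a single server to sustain it.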
The processor is built on Nighthawk and Jayhawk II tiles, using a 6nm manufacturing process. Each Nighthawk tile integrates four neural cores and a RISC-V CPU, tailored to support large-model inference with digital in-memory compute (DIMC) and flexible datatype processing, including block floating point (BFP).
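For readers unfamiliar with block floating point, the general idea is that a group of values shares a single exponent while each value keeps only a small integer mantissa. The sketch below illustrates that generic technique; it is not d-Matrix's actual number format, and the 8-bit mantissa width, the block contents, and the function names are all assumptions chosen for illustration.

```python
import math


def bfp_quantize(block, mantissa_bits=8):
    """Quantize floats to signed integer mantissas that share one exponent.

    Generic block floating point sketch; mantissa_bits is an assumed width.
    """
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return [0] * len(block), 0
    # frexp returns (m, e) with max_abs == m * 2**e and 0.5 <= m < 1.
    _, e = math.frexp(max_abs)
    # Pick the shared exponent so the largest value uses the full mantissa range.
    shared_exp = e - (mantissa_bits - 1)
    limit = 2 ** (mantissa_bits - 1) - 1
    scale = 2.0 ** shared_exp
    # Round to the nearest integer mantissa and clamp to the signed range.
    mantissas = [max(-limit, min(limit, round(x / scale))) for x in block]
    return mantissas, shared_exp


def bfp_dequantize(mantissas, shared_exp):
    """Reconstruct approximate floats from mantissas and the shared exponent."""
    return [m * 2.0 ** shared_exp for m in mantissas]


mantissas, shared_exp = bfp_quantize([0.5, -0.25, 0.125, 0.0])
print(mantissas, shared_exp)
print(bfp_dequantize(mantissas, shared_exp))
```

Because only one exponent is stored per block, the multiply-accumulate work reduces to integer arithmetic on the mantissas, which is the usual motivation for this kind of format in inference hardware.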
Corsair adopts chiplet packaging, integrating memory and compute to maximize efficiency. It conforms to the industry-standard PCIe Gen5 full-height, full-length card form factor and can be paired with DMX Bridge cards for scalable performance. Each card delivers 2400 TFLOPs of peak 8-bit compute, along with 2GB of integrated performance memory and up to 256GB of off-chip memory capacity.
It is worth noting that Micron Technology, a key partner of Nvidia, is also collaborating with d-Matrix.
Corsair was initially set to launch in late 2023, but d-Matrix reworked its architecture in response to the surging demand for generative AI. This pivot allowed Corsair to incorporate enhancements tailored for transformer models and emerging applications like agentic AI and interactive video generation.
“We saw transformers and generative AI coming, and founded d-Matrix to address inference challenges around the largest computing opportunity of our time,” said Sid Sheth, cofounder and CEO of d-Matrix.
“The first-of-its-kind Corsair compute platform brings blazing fast token generation for high interactivity applications with multiple users, making Gen AI commercially viable,” Sheth added.
Via eeNews