Gimlet Labs Raises $80M to Run AI Across All Available Hardware Simultaneously
A Stanford adjunct professor and serial founder has raised $80 million in Series A funding for an idea that could reshape how AI runs in the cloud. Gimlet Labs has built what it claims is the world's first "multi-silicon inference cloud" — software that lets a single AI workload run simultaneously across widely different hardware types.
The round was led by Menlo Ventures. The company was founded by Zain Asgar alongside co-founders Michelle Nguyen, Omid Azizi, and Natalie Serrino.
The core problem they're solving is straightforward. An AI agent typically chains together many steps, and each step stresses hardware differently: prefill (processing the prompt) is compute-bound, decode (generating output tokens) is memory-bound, and tool calls are network-bound. No single chip does everything optimally. The result: large portions of installed AI hardware sit idle.
"Apps are only using the existing hardware deployed somewhere between 15 to 30 percent of the time," Asgar told TechCrunch. "You're wasting hundreds of billions of dollars because you're just leaving idle resources."
Gimlet Labs' solution is orchestration software that slices up agentic workloads and distributes them simultaneously across NVIDIA, AMD, Intel, ARM, Cerebras, and d-Matrix chips. The company claims it reliably speeds AI inference up by 3x to 10x for the same cost and power usage.
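To make the idea concrete, here is a minimal Python sketch of bottleneck-aware placement: each step of an agent workload is tagged with its limiting resource and routed to a device class suited to that resource. This is an illustration of the general concept only, not Gimlet Labs' software. The names Step, Device, and assign, the step names, and the device labels are all hypothetical, and a real orchestrator would also have to weigh cost, queueing, and data movement between chips.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    bottleneck: str  # "compute", "memory", or "network" (illustrative labels)


@dataclass(frozen=True)
class Device:
    name: str
    strength: str  # the bottleneck this hypothetical device class handles best


def assign(steps, devices):
    """Map each step to the first listed device whose strength matches its bottleneck."""
    by_strength = {}
    for device in devices:
        # Keep only the first device seen for each strength.
        by_strength.setdefault(device.strength, device)

    plan = {}
    for step in steps:
        if step.bottleneck not in by_strength:
            raise ValueError(f"no device suited to {step.bottleneck}-bound step {step.name!r}")
        plan[step.name] = by_strength[step.bottleneck].name
    return plan


# Hypothetical pool: generic device classes, not real vendor assignments.
devices = [
    Device("compute-optimized accelerator", "compute"),
    Device("memory-optimized accelerator", "memory"),
    Device("general-purpose CPU host", "network"),
]

# Hypothetical agent workload with one step per bottleneck type.
steps = [
    Step("prompt_processing", "compute"),
    Step("decode", "memory"),
    Step("tool_call", "network"),
]

plan = assign(steps, devices)
for step_name, device_name in plan.items():
    print(f"{step_name} -> {device_name}")
```

The sketch uses a static one-to-one lookup for clarity. Running steps on several chip types at the same time, as the company describes, would additionally require scheduling concurrent work and moving intermediate data between devices, which this example does not attempt.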
McKinsey estimates data center spending could total nearly $7 trillion by 2030. At that scale, optimizing existing hardware is just as important as acquiring new chips. Gimlet Labs is betting the industry is ready to pay for a software layer that does exactly that.