The new inferencing platform can fully analyze AI models, deploy them on the best-suited AI accelerators, and dynamically balance the resulting workloads across multiple regions.
SAN JOSE, Calif., March 18, 2025 (GLOBE NEWSWIRE) -- NVIDIA GTC Conference - Cirrascale Cloud Services, the leading provider of innovative cloud and managed solutions for AI and high-performance computing (HPC), today announced the early preview of its groundbreaking Cirrascale Inference Platform, an enterprise inference-as-a-service solution engineered to optimize AI model performance and cost for high-token-volume applications.
Specifically designed for large-scale model deployment, the platform will launch with the NVIDIA Blackwell architecture, featuring the NVIDIA HGX B200 and NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs. The platform automatically selects the optimal AI accelerator to adapt to changing model requirements, user demands, and workflow shifts, ensuring unmatched performance and substantial cost savings compared to traditional methods. Additionally, it works to dynamically balance workloads across regions, helping to smooth out peak demands, enhance operational efficiencies, and reduce costs.
"Enterprise AI adoption is accelerating, and while hyperscalers are suitable for initial validation and model training, they can introduce additional complexities and costs with larger volume usage," said Nick Pandher, vice president of product at Cirrascale Cloud Services. "With the launch of the Cirrascale Inference Platform, customers unlock access to a high-performance, seamlessly deployable cloud inference solution that intelligently optimizes their AI configurations for both efficiency and cost-effectiveness."
Engineered with enterprise-grade security, the Cirrascale Inference Platform features isolated pipelines that integrate seamlessly with both hyperscalers and on-premises environments. Leveraging dedicated bare-metal performance, it eliminates the risks of multi-tenancy common to AI model pipelines delivered through virtualized infrastructure, ensuring high token throughput and robust security. The platform is designed to simplify management and scaling, allowing enterprises to maintain low-volume models on-premises while offloading demanding workloads to a high-performance, cost-efficient solution, complete with advanced features like Retrieval Augmented Generation (RAG) for enhanced model performance.
"As we moved our AI models to scaled inference, we were looking for a cloud services provider that optimized for the latest AI accelerators that offer low-precision workloads while providing predictable pricing," said Martin Woodall, CEO at DroneData. "Our robotics solutions require fast and predictable inference with real-time results. The new FP4 technology for inference was critical to get the performance we need."
"The NVIDIA Blackwell architecture delivers exceptional performance and scalability for AI inferencing," said Dave Salvator, director of accelerated computing products at NVIDIA. "The new Cirrascale instances, with NVIDIA HGX B200 as the leading system for this platform, will empower enterprises to seamlessly deploy and optimize large-scale AI models with efficiency and cost-effectiveness."
The platform will be available across multiple U.S. regions and select international regions to reduce latency and optimize performance for both hyperscaler and on-premises application workflows. Early access launches in April, with wider access available in the summer.
For more information about the new Cirrascale Inference Platform, please visit https://inference.cirrascale.com or stop by booth #2103 at GTC to see a demo of AI workflows using the new platform.
For more information about Cirrascale, please visit www.cirrascale.com
About Cirrascale Cloud Services
Cirrascale Cloud Services is a leading cloud and managed services provider dedicated to deploying state-of-the-art compute resources and high-speed storage solutions at scale. Our AI Innovation Cloud is purpose-built to enable clients to scale their training and inferencing workloads for generative AI, large language models, and high-performance computing. To learn more about Cirrascale Cloud Services and its unique cloud offerings, please visit https://cirrascale.com or call (888) 942-3800.
For more information contact:
BOCA Marketing Agency for Cirrascale
Email: [email protected]