WEKA's Triple Play: New Hardware, Record Performance, and RAG Architecture Advance Enterprise AI

WEKA unveils comprehensive AI infrastructure solutions: NVIDIA Grace CPU storage, record-breaking benchmarks, and new RAG reference architecture for enterprise inferencing.


WEKA has announced three major developments that strengthen its position in enterprise AI infrastructure: an industry-first NVIDIA Grace CPU storage solution, record-breaking performance benchmarks, and a new reference architecture for AI inferencing.


Grace CPU Integration: Power-Efficient AI Infrastructure


WEKA's latest innovation, previewed at Supercomputing 2024, combines its AI-native data platform with NVIDIA's Grace CPU Superchip and Supermicro's storage server technology. This solution addresses a critical challenge in modern data centers: delivering high-performance AI capabilities while managing power and space constraints.


The new system leverages 144 Arm Neoverse V2 cores to deliver twice the energy efficiency of traditional x86 servers. When paired with NVIDIA networking technology, including ConnectX-7 NICs and BlueField-3 SuperNICs, the solution can achieve network speeds of up to 400Gb/s.


Key benefits include:

  • Up to 10x faster time to first token

  • 10-50x increase in GPU stack efficiency

  • 4-7x reduction in data infrastructure footprint

  • Potential reduction of up to 260 tons of CO2e per petabyte stored annually

  • 10x lower energy costs


Benchmark Dominance


WEKA has announced record-setting performance in cloud environments across all SPECstorage Solution 2020 workloads. Notable achievements include:


  • AI Workloads: Outperformed competitors by 175% in raw performance at 64% of the infrastructure cost on Microsoft Azure

  • EDA Workloads: Delivered 60% faster response times compared to NetApp's fastest 8-node system

  • Video Data Analysis: Achieved 12,000 streams, surpassing its own previous record of 8,000 streams

  • Software Builds: Processed 7,472 builds, outperforming competitors with lower latency


WARRP: Simplifying AI Inferencing


WEKA's newest initiative, the WEKA AI RAG Reference Platform (WARRP), provides a blueprint for building production-ready AI inferencing environments. This infrastructure-agnostic architecture helps organizations implement Retrieval-Augmented Generation (RAG), a critical technique for improving AI model accuracy and reducing hallucinations, at scale.
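
To make the RAG flow concrete, the sketch below shows a minimal retrieve-then-generate loop against two of the components WARRP names: a Milvus vector database and an NVIDIA NIM endpoint (NIM microservices expose an OpenAI-compatible chat API). This is an illustrative sketch, not the WARRP implementation: the collection name, field names, URLs, model name, and the embed() helper are all assumptions.

```python
# Minimal RAG sketch (illustrative only, not the WARRP implementation).
# Assumes a Milvus collection "docs" already populated with text chunks and
# their embeddings, and a NIM microservice serving an OpenAI-compatible API.
import hashlib
import requests
from pymilvus import MilvusClient

NIM_URL = "http://localhost:8000/v1/chat/completions"  # hypothetical NIM endpoint
milvus = MilvusClient(uri="http://localhost:19530")    # hypothetical Milvus address

def embed(text: str) -> list[float]:
    # Stand-in embedding; in practice, call an embedding model (e.g., via
    # NeMo Retriever). It must be the same model that embedded the stored chunks.
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255.0 for b in digest]  # 32-dim toy vector

def answer(question: str) -> str:
    # 1. Retrieve: vector-search the question against stored document chunks.
    hits = milvus.search(
        collection_name="docs",        # hypothetical collection name
        data=[embed(question)],
        limit=3,
        output_fields=["text"],        # hypothetical field holding chunk text
    )
    context = "\n".join(hit["entity"]["text"] for hit in hits[0])

    # 2. Augment and generate: ground the LLM's answer in the retrieved text.
    resp = requests.post(NIM_URL, json={
        "model": "meta/llama3-8b-instruct",  # example NIM model name
        "messages": [
            {"role": "system",
             "content": f"Answer using only this context:\n{context}"},
            {"role": "user", "content": question},
        ],
    })
    return resp.json()["choices"][0]["message"]["content"]
```

In a WARRP-style deployment, the pieces around this loop (GPU scheduling via Run:ai, containerized services on Kubernetes, ingestion into Milvus) are what make the same pattern portable across clouds and on-premises clusters.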


WARRP integrates:

  • NVIDIA NIM microservices and NeMo Retriever

  • Run:ai's GPU orchestration capabilities

  • Kubernetes for container and workload orchestration

  • Milvus vector database for ingesting and searching vectorized data


The architecture offers:

  • Hardware-, software-, and cloud-agnostic deployment options

  • Streamlined GenAI application development

  • Workload portability across cloud and on-premises environments

  • Optimized model loading and unloading for complex inference workflows


Infrastructure for the AI Era


These developments reflect WEKA's comprehensive approach to addressing technical challenges in AI adoption. Recent studies indicate that data management (32%) and security (26%) remain top technical inhibitors to AI/ML success. WEKA's enhanced platform aims to solve these challenges through unified access, hybrid cloud support, enhanced data liquidity, and streamlined data pipelines.

