Arista Delivers Holistic AI Solutions in Collaboration with NVIDIA
Optimal GPU and Network Coordinated Performance
Need for Uniform Controls
As AI clusters and large language models (LLMs) grow in size, the complexity and sheer volume of their disparate components grow apace. GPUs, NICs, switches, optics, and cables must all work together to form a holistic network. Customers need uniform controls between the NICs and GPUs hosted in their AI servers and the AI network switches at different tiers. All of these elements depend on one another for AI jobs to complete properly, yet they operate independently. The result can be misconfiguration or misalignment between parts of the overall ecosystem, such as between the NICs and the switch network, and because network issues are very difficult to diagnose, such mismatches can dramatically lengthen job completion time. Large AI clusters also require coordinated congestion management to avoid packet drops and GPU under-utilization, as well as coordinated management and monitoring to optimize compute and network resources in tandem.
Introducing the Arista AI Agent
At the heart of this solution is an Arista EOS-based agent that enables the network and the host to communicate with each other and coordinate configurations to optimize AI clusters. Using a remote AI agent, EOS running on Arista switches can be extended to directly attached NICs and servers, allowing a single point of control and visibility across an AI cluster.
“Arista aims to improve efficiency of communication between the discovered network and GPU topology to improve job completion times through coordinated orchestration, configuration, validation, and monitoring of NVIDIA accelerated compute, NVIDIA SuperNICs, and Arista network infrastructure,” said
End-to-End
This new technology demonstration highlights how an Arista EOS-based remote AI agent allows the combined, interdependent AI cluster to be managed as a single solution. EOS running in the network can now be extended to servers or SuperNICs via remote AI agents, enabling instantaneous tracking and reporting of performance degradation or failures between hosts and networks so that issues can be rapidly isolated and their impact minimized. Since EOS-based network switches maintain constant, accurate awareness of the network topology, extending EOS down to SuperNICs and servers with the remote AI agent further enables coordinated optimization of end-to-end QoS between all elements in the AI cluster.
“Best-of-breed Arista networking platforms combined with NVIDIA’s compute platforms and SuperNICs enable coordinated AI data centers. The new ability to extend Arista’s EOS operating system with remote AI agents on hosts promises to solve a critical customer problem of AI clusters at scale, by delivering a single point of control and visibility to manage AI availability and performance as a holistic solution,” said
Arista will demonstrate the AI agent technology at the Arista IPO 10th anniversary celebration at the NYSE on
If you are a member of the analyst or financial community and are interested in attending the NYSE event, please register here.
To read more about the new AI Centers, check out CEO and Chairperson
About Arista
ARISTA and EOS are among the registered and unregistered trademarks of Arista Networks, Inc.
View source version on businesswire.com: https://www.businesswire.com/news/home/20240529365397/en/
Media Contact
Corporate Communications
Tel: (408) 547-5798
amanda@arista.com
Investor Contact
Investor Relations
Tel: 408-547-5885
liz@arista.com
Source: Arista Networks, Inc.