Overview

Artificial intelligence is redefining the scale, architecture, and performance expectations of modern data centres. Training large ML models demands infrastructure capable of moving massive data sets through highly parallel, compute-intensive environments, where traditional data centre designs simply can't keep up.

AI Data Center Network Design and Technologies is a comprehensive, vendor-agnostic guide to the design principles, architectures, and technologies that power AI training and inference clusters. Written by leading experts in AI data centre design, this book helps engineers, architects, and technology leaders understand how to design and scale networks purpose-built for the AI era.

You'll learn how to:
Architect scalable, high-radix network fabrics to support xPU (GPU, TPU)-based AI clusters
Integrate lossless Ethernet/IP fabrics for high-throughput, low-latency data movement
Align network design with AI/ML workload characteristics and server architectures
Address challenges in cooling, power, and interconnect design for AI-scale computing
Evaluate emerging technologies from the Ultra Ethernet Consortium (UEC) and their effect on future AI data centres
Apply best practices for deployment, validation, and performance measurement in AI/ML environments

With broad coverage of both foundational concepts and emerging innovations, this book bridges the gap between network engineering and AI infrastructure design. It empowers readers to understand not only how AI data centres work, but why they must evolve.
Full Product Details

Author: Mahesh Subramaniam, Michal Styszynski, Himanshu Tambakuwala
Publisher: Pearson Education (US)
Imprint: Addison Wesley
Dimensions: Width: 19.00cm, Height: 2.00cm, Length: 23.00cm
Weight: 0.660kg
ISBN: 9780135436288
ISBN 10: 0135436281
Pages: 384
Publication Date: 04 April 2026
Audience: Professional and scholarly, Professional & Vocational
Format: Paperback
Publisher's Status: Forthcoming
Availability: Available To Order. Limited stock is available. It will be ordered for you and shipped pending supplier's limited stock.

Table of Contents

Part 1: AI/ML Data Center Design Workloads and Requirements
Chapter 1 Wonders in the Workload
Chapter 2 “The Common-Man View” of AI Data Center Fabrics
Part 2: AI/ML Data Center Design Concepts
Chapter 3 Network Design Considerations
Chapter 4 Optics and Cables Management
Chapter 5 Thermal and Power Efficiency Considerations
Part 3: AI/ML Data Center Technology Requirements
Chapter 6 Efficient Load Balancing
Chapter 7 RoCEv2 Transport and Congestion Management
Chapter 8 IP Routing for AI/ML Fabrics
Chapter 9 Storage Network Design and Technologies
Part 4: KPIs and Performance Monitoring
Chapter 10 AI Network Performance KPIs
Chapter 11 Monitoring and Telemetry
Part 5: UEC – Ultra Ethernet Consortium
Chapter 12 Ultra Ethernet Consortium (UEC)
CONCLUSION
Chapter 13 Scale-Up Systems
Chapter 14 Conclusion
Appendix A: Questions and Answers
Appendix B: Acronyms

Author Information

Mahesh Subramaniam is a proven leader in AI data centres and next-generation networking technologies. He played a key role in defining the advanced software roadmap for AI fabrics, which are now deployed in production networks across various AI data centres worldwide. As the Senior Director of Product Management for AI Data Centers at HPE Juniper Networks, he leads cutting-edge innovations in AI infrastructure and cloud-scale solutions, optimised for both scale-up and scale-out architectures.
Mahesh is also an inventor with several technology patents and a recognised speaker at global forums, including the UEC Summit, OCP, and Tokyo MPLS forum. His work has earned him accolades, including the CEO Excellence Award, the Record High Business Award, and the Star Award for the Cloud DC Reference Architecture. With a remarkable history in the networking industry, Mahesh has a strong track record of leading products and managing technical and business strategies across cross-functional teams.

Michal Styszynski is a Product Management Director in the Data Center Networks Business Unit (DC BU) at HPE Juniper Networking. Michal has been with Juniper Networks for more than 13 years. Before his current role, he was a Technical Marketing Engineer (TME) in the DC BU and a Technical Solution Consultant at Juniper. In these roles, he handled data centre projects for large-scale enterprises and federal networks and worked closely with Tier 2 cloud and telco-cloud service providers. Before joining Juniper, he spent around 10 years working at Orange, FT R&D, and TPSA Polpak engineering. Michal graduated from the Electronics & Telecommunications department at Wroclaw University of Science & Technology with a master's degree in engineering. He also holds an MBA from Paris Sorbonne Business School, is JNCIE-DC #523, and holds PEC, PLC, and PMC certifications from the Product School in San Francisco.

Himanshu Tambakuwala is a highly accomplished networking expert and certified technical architect whose experience spans the entire product lifecycle, from hands-on engineering to product strategy. He is a JNCIE holder in Data Center and Service Provider technologies and an inventor with four granted technology patents and two additional patents currently filed. As a Product Manager at Juniper Networks, Himanshu was instrumental in defining the feature roadmap for network fabrics that power cutting-edge AI/ML data centres.
Countries Available: All regions