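Media Summary: Swyx and Vibhu chat with Nader Khalil (https://x.com/naderlikeladder) and Kyle Kranen (https://x.com/KranenKyle) from NVIDIA ... Learn more about PyTorch → https://ibm.biz/BdSx57 Learn more about Llama → https://ibm.biz/BdSx53 LLaMa Recipes on Github ...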

Scaling AI at Inference: The Road to Agent-Driven ROI - Detailed Analysis & Overview

Scaling AI at Inference: The Road to Agent-Driven ROI
AI Inference: The Secret to AI's Superpowers
Inference at Scale: The New Frontier for AI Infrastructure and ROI
LLM as a Judge: Scaling AI Evaluation Strategies
What is vLLM? Efficient AI Inference for Large Language Models
Accelerated LLM Inference With Apache Spark At Scale
Agent Inference at the "Speed of Light" — How NVIDIA moves like a $4.3 Trillion Startup
Scaling AI Model Training and Inferencing Efficiently with PyTorch
Workshop: Foundry: How to 10x AI Agent Price Performance with Inference Time Scaling
System Design: Architecting Scalable LLM Inference for AI Apps
Scaling AI on Hybrid Cloud for Production LLM Inference at Scale by Roberto Carratala
Scaling LLM Inference Globally: Novita AI + Vultr

Scaling AI at Inference: The Road to Agent-Driven ROI

00:00:00 - Introduction to

AI Inference: The Secret to AI's Superpowers

Download the

Inference at Scale: The New Frontier for AI Infrastructure and ROI

AI

LLM as a Judge: Scaling AI Evaluation Strategies

Ready to become a certified watsonx

What is vLLM? Efficient AI Inference for Large Language Models

Ready to become a certified watsonx

Accelerated LLM Inference With Apache Spark At Scale

Large-

Agent Inference at the "Speed of Light" — How NVIDIA moves like a $4.3 Trillion Startup

Swyx and Vibhu chat with Nader Khalil (https://x.com/naderlikeladder) and Kyle Kranen (https://x.com/KranenKyle) from NVIDIA ...

Scaling AI Model Training and Inferencing Efficiently with PyTorch

Learn more about PyTorch → https://ibm.biz/BdSx57 Learn more about Llama → https://ibm.biz/BdSx53 LLaMa Recipes on Github ...

Workshop: Foundry: How to 10x AI Agent Price Performance with Inference Time Scaling

And you see this now in a more general

System Design: Architecting Scalable LLM Inference for AI Apps

In the

Scaling AI on Hybrid Cloud for Production LLM Inference at Scale by Roberto Carratala

Scaling AI

Scaling LLM Inference Globally: Novita AI + Vultr

Unlock high-performance LLM

How to Engineer AI Inference Systems [Philip Kiely] - 766

In this episode, Philip Kiely, head of