Archive
InferenceMAX™: Open Source Inference Benchmarking
NVIDIA GB200 NVL72, AMD MI355X, Throughput Token per GPU, Latency Tok/s/user, Perf per Dollar, Tokens per Provisioned Megawatt, DeepSeek R1 670B, GPTOSS…
Oct 9 • Kimbo Chen, Dylan Patel, Daniel Nishball, Cam Quilici, and Cheang Kang Wen
September 2025
xAI's Colossus 2 - First Gigawatt Datacenter In The World, Unique RL Methodology, Capital Raise
On Site Turbines, Mississippi Expansion, Solaris Energy, Can xAI afford it?, Middle East Funding, Tesla, Talent Exodus, API revenue, Consumer Growth, RL…
Sep 16 • Jeremie Eliahou Ontiveros, Dylan Patel, Wei Zhou, AJ Kourabi, and Maya Barkin
Another Giant Leap: The Rubin CPX Specialized Accelerator & Rack
New Prefill Specialized GPU, Rack Architecture, BOM, Disaggregated PD, Higher Perf per TCO, Lower TCO, GDDR7 & HBM Market Trends
Sep 10 • Dylan Patel, Daniel Nishball, Kimbo Chen, Myron Xie, Wega Chu, Gerald Wong, Cheang Kang Wen, and Ivan Chiam
Huawei Ascend Production Ramp: Die Banks, TSMC Continued Production, HBM is The Bottleneck
H20 Shipments, Blackwell B30A, Bottlenecks to Chinese Chip Production, Export Controls, CXMT, SMIC, Cambricon
Sep 8 • Dylan Patel, AJ Kourabi, Myron Xie, and Jeff Koch
Amazon’s AI Resurgence: AWS & Anthropic's Multi-Gigawatt Trainium Expansion
Anthropic multi-gigawatt clusters, Trainium ramp, best TCO per memory bandwidth, system-level roadmap, Bedrock and internal models
Sep 3 • Jeremie Eliahou Ontiveros, Dylan Patel, AJ Kourabi, and Myron Xie
August 2025
H100 vs GB200 NVL72 Training Benchmarks - Power, TCO, and Reliability Analysis, Software Improvement Over Time
Joules per Token, TCO Per Million Tokens, MFU, Tokens Per US Annual Household Energy Usage, DeepSeek 670B, GB200 Unreliability, Backplane Downtime
Aug 20 • Dylan Patel and Daniel Nishball
GPT-5 Set the Stage for Ad Monetization and the SuperApp
How ChatGPT will monetize free users, Router is the Release, AIs will serve Ads, Google's moat eroded?, The shift of purchasing intent queries
Aug 13 • Doug, Dylan Patel, Wei Zhou, and AJ Kourabi
Scaling the Memory Wall: The Rise and Roadmap of HBM
HBM4, Custom Base Die, Shoreline Expansion, Process Flow, China Domestic Production, Samsung Qualification
Aug 12 • Dylan Patel, Myron Xie, Tanj Bennett, Ivan Chiam, and Jeff Koch
July 2025
Robotics Levels of Autonomy
Single-Purpose Robots Automating Hundreds Of Jobs, Pick And Place With Low Autonomy Is Expensive, General-Purpose Autonomy Navigating And Inspecting…
Jul 30 • Reyk Knuhtsen, Dylan Patel, Jeremie Eliahou Ontiveros, Joe Ryu, Robert Ghilduta, and Niko
Intel 18A Details & Cost, Future of DRAM 4F2 vs 3D, Backside Power Adoption (or Not), China's FlipFET, Digital Twins from Atoms to Fabs, and…
VLSI 2025 Roundup
Jul 21 • Dylan Patel, Jeff Koch, and Gerald Wong
Meta Superintelligence - Leadership Compute, Talent, and Data
AI Datacenter Titanomachy, "The Tent", AI Data and Talent Wars, Zuck Founder Mode, Behemoth 4 Post-Mortem, OBBB Tax Windfall, AI and Reality Labs
Jul 11 • Dylan Patel, Jeremie Eliahou Ontiveros, Wei Zhou, AJ Kourabi, and Maya Barkin
DeepSeek Debrief: >128 Days Later
Traffic and User Zombification, GPU Rich Western Neoclouds, Token Economics (Tokenomics) Sets the Competitive Landscape
Jul 3 • Wei Zhou, AJ Kourabi, and Dylan Patel