4. The Centralized xAI 100k Cluster Explained

What It Is

A centralized xAI cluster takes the traditional data center approach, using high-end chips such as Nvidia H100 GPUs to deliver massive compute power. For example:

• Compute Performance: Each chip delivers about 67 TFLOPS.

• Scale: A 100k-chip cluster would provide roughly 6.7 EFLOPS of aggregate peak compute (100,000 chips × 67 TFLOPS each).
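The scale figure follows directly from the per-chip number. A minimal sketch of that arithmetic, assuming the 67 TFLOPS figure is a peak rating and that all chips contribute fully (real workloads rarely reach peak); the function and constant names are illustrative only:

```python
# Aggregate peak compute for a 100k-chip cluster, using the figures quoted above.
TFLOPS_PER_CHIP = 67        # per-chip throughput from the text (assumed to be a peak rating)
CHIP_COUNT = 100_000        # "100k" cluster size from the text
TFLOPS_PER_EFLOPS = 1_000_000


def aggregate_peak_tflops(chips: int, tflops_per_chip: float) -> float:
    """Return the theoretical aggregate throughput in TFLOPS (no utilization losses)."""
    return chips * tflops_per_chip


total_tflops = aggregate_peak_tflops(CHIP_COUNT, TFLOPS_PER_CHIP)
total_eflops = total_tflops / TFLOPS_PER_EFLOPS

print(f"Aggregate peak: {total_tflops:,} TFLOPS ({total_eflops:.1f} EFLOPS)")
```

This is an upper bound: sustained throughput depends on utilization, interconnect, and workload, none of which are specified here.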

Cost Structure

• CAPEX:

• Estimated at around $6 billion to build a 100k-chip cluster (including hardware, data centers, cooling systems, etc.).

• OPEX:

• Compute Cost: Approximately $1.75 per chip per hour, charged to cover operating costs. With all 100k chips running around the clock, this comes to about $1.53 billion per year.

• Electricity, Cooling, Network, and Storage: These additional high-cost items add hundreds of millions of dollars per year to the operating budget.

• Total Annual OPEX: Could reach $1.7 billion.
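The cost figures above can be cross-checked with simple arithmetic. The sketch below assumes every chip is billed for all 8,760 hours of a non-leap year (100% utilization), which the text does not state, and treats the $1.7 billion total as given; all variable names are illustrative:

```python
# Back-of-the-envelope check of the CAPEX and OPEX figures quoted above.
CHIP_COUNT = 100_000
CAPEX_USD = 6_000_000_000          # estimated build cost from the text
RATE_USD_PER_CHIP_HOUR = 1.75      # compute cost from the text
TOTAL_ANNUAL_OPEX_USD = 1_700_000_000
HOURS_PER_YEAR = 24 * 365          # assumption: non-leap year, chips never idle

# Upfront cost attributable to each chip (hardware plus its share of facilities).
capex_per_chip = CAPEX_USD / CHIP_COUNT

# Annual compute charge if every chip runs every hour of the year.
annual_compute_cost = CHIP_COUNT * RATE_USD_PER_CHIP_HOUR * HOURS_PER_YEAR

# Whatever remains of the quoted total would cover electricity, cooling,
# network, and storage.
other_annual_opex = TOTAL_ANNUAL_OPEX_USD - annual_compute_cost

print(f"CAPEX per chip:      ${capex_per_chip:,.0f}")
print(f"Annual compute cost: ${annual_compute_cost:,.0f}")
print(f"Other annual OPEX:   ${other_annual_opex:,.0f}")
```

Under these assumptions the compute charge accounts for most of the quoted total, leaving a remainder in the hundreds of millions for the other line items. Lower utilization would reduce the compute figure proportionally.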

Limitations

• High Capital Requirements:

A centralized cluster requires an enormous upfront investment (around $6 billion for 100k chips) and heavy financing.

• Operational Complexity:

Managing a centralized cluster means dealing with high electricity usage, significant network costs, and complex storage requirements.

• Scalability Issues:

Scaling a centralized system requires even more capital and increases operational risks.
