
Etched introduces Low-Voltage Inference architecture to sustain 80% utilization on trillion-parameter Mixture of Experts models

It runs the chip's math blocks at under half the voltage of most AI chips.


Original post

Zephyr (@zephyr_z9) · #1695 in Tech

This is very cool if true. Bad boi will be a monster at prefill.

Etched (@Etched)

Introducing Low-Voltage Inference (LVI) for high throughput workloads.

Today, AI chips can't scale FLOPs without thermal throttling.

As FLOPs utilization increases, AI chips draw more power and downregulate clock speed. This often results in sustained inference throughput under half of peak FLOPs.

Chips in other industries solve the power problem by running at lower voltages. Bitcoin miners run at under 1/3 the voltage of AI chips!

We’ve designed a new architecture to run our chip’s math blocks at under half the voltage of most AI chips. This enables multiple times the FLOPs density of AI chips today.

We can run trillion parameter sparse MoEs at 80%+ peak FLOPs without thermal throttling.

Running LVI requires co-designing the entire cluster from the transistor to the token: new splittable math arrays, circuit techniques, novel tiling and scheduling algorithms, power delivery networks, VRM architectures, advanced packaging, cold plate designs, and more.

8:43 AM · Jun 30, 2026 · 5K Views
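The voltage argument in the post can be checked against the textbook first-order model of CMOS dynamic power, P ≈ C · V² · f. The sketch below is only that model, not Etched's method or data. The function names are hypothetical and the ratios are illustrative. It ignores leakage power and the fact that lower voltage usually forces a lower clock, so real gains are smaller than the ideal figure.

```python
def relative_dynamic_power(voltage_ratio: float, frequency_ratio: float = 1.0) -> float:
    """Dynamic power relative to a baseline chip, using P ~ C * V^2 * f.

    voltage_ratio and frequency_ratio are new/baseline values (assumed inputs).
    Capacitance C is assumed unchanged, so it cancels out of the ratio.
    """
    if voltage_ratio <= 0 or frequency_ratio <= 0:
        raise ValueError("ratios must be positive")
    return voltage_ratio ** 2 * frequency_ratio


def relative_energy_per_op(voltage_ratio: float) -> float:
    """Dynamic energy per operation relative to baseline (E ~ C * V^2)."""
    if voltage_ratio <= 0:
        raise ValueError("voltage_ratio must be positive")
    return voltage_ratio ** 2


def sustained_flops(peak_flops: float, utilization: float) -> float:
    """Sustained throughput given peak FLOPs and a utilization in [0, 1]."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return peak_flops * utilization


if __name__ == "__main__":
    # Halving the voltage at the same clock: a quarter of the dynamic power.
    print(relative_dynamic_power(0.5))
    # Halving both voltage and clock: an eighth of the dynamic power.
    print(relative_dynamic_power(0.5, 0.5))
    # Illustrative only: a hypothetical 1 PFLOP/s peak chip at 45% vs 80% utilization.
    print(sustained_flops(1e15, 0.45), sustained_flops(1e15, 0.80))
```

Under this model, halving the voltage cuts dynamic energy per operation to 0.25 of the baseline, which is the kind of headroom the "multiple times the FLOPs density" claim relies on. Whether it holds in silicon depends on the clock, leakage, and cooling details the post does not give.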

Sentiment

Commenters are excited about Etched's low-voltage inference architecture for AI chips, praising both the team and the high-throughput design.

Positive: 100.0% · Negative: 0.0%

2 comments with sentiment.


Posts from X

Most Activity

Views: 626 · Replies: 1

Lon (@Lon)

@Etched What an absolutely stacked and jacked team.

1h · 626 views

Retweets: 17

Etched (@Etched) · 3h

The LVI announcement quoted in full under "Original post" above.

Giedrius Trump (@Trumpyla)

@Etched Wow

43m · 289 views

Exergy Lab (@ExergyLab)

@Lon @Etched Wen IPO?

53m · 9 views