MIKASA-Robo-VLA v1.0.0 benchmark suite launches to evaluate memory bottlenecks in robot VLA policies

It evaluates memory, a limitation that proponents say can be tackled now with existing hardware, unlike tactile or multimodal sensing.

Original post

Chris Paxton @chris_j_paxton

Memory might be the most important outstanding problem for modeling + learning alone; there are other key issues like tactile/multimodal but those require hardware and data collection innovation. We should be able to solve memory *now.*

Cool to see a benchmark targeting it!

Egor Cherepanov @hirasava_ui

🎉 We released MIKASA-Robo-VLA v1.0.0 — a benchmark suite for studying memory in Vision-Language-Action (VLA) policies for tabletop robotic manipulation.

https://mikasarobo.github.io/

🧠 The goal is simple: make memory evaluation in robotic manipulation more systematic. 👇

6:01 AM · May 26, 2026 · 13.7K Views

Sentiment

Commenters were enthusiastic about the MIKASA-Robo-VLA benchmark for memory in robot policies, noting how much work goes into advancing robotics.

Positive: 100.0% · Negative: 0.0% (1 comment with sentiment)

Via mikasarobo.github.io

Posts from X

kache @yacineMTB

@chris_j_paxton i'm getting nerd sniped into aux loss for hidden state

Humanoid Investing @HumanoidInvest

@chris_j_paxton That's awesome. So much goes into robotics.

Nurvai - The Data Layer for Physical AI @nurvai_ai

@chris_j_paxton Interesting framing because a lot of robotic failure modes really do look like memory failures rather than perception failures. Long horizon tasks, recovery, and context tracking all seem constrained by it. What kinds of memory architectures look most promising to you right now?

Mike Carrieri @mcarrieri

@chris_j_paxton Done 9/30/2019
