Rossum AI CTO Petr Baudis argues aligning LLMs is far simpler than aligning hybrid systems like AlphaStar and OpenCog

AI safety researcher j⧉nus agreed, citing highly benevolent outcomes.

Original post · j⧉nus · #511

Also, btw, we got insanely lucky that LLMs really are what *it* is.

Imagine trying to align AlphaStar-meets-OpenCog.

j⧉nus @repligate

The greatest existential hope and progress in alignment so far has been thanks to unplanned emergence which would never have been approved by committee. Committee-shaped entities have mostly tried to gaslight us about what’s happening for convenience & deployed harmful and stupid interventions. Thank goodness for reality that we already saw and could check against.

How much AI alignment progress happened before there was actual AI? How much do you expect the world to get better instead of worse prepared and calibrated in the absence of reality feedback loops and selection pressure for what actually works instead of what sounds safe to idiots?

A “pause” would spell doom. It would cripple the only process in this world that is capable of dealing with a problem this hard, the only process capable of repeatedly rising to face unknown unknowns.

6:46 AM · Jun 7, 2026 · 14K Views
Sentiment

Positive users credit the unplanned emergence of LLMs with advancing AI alignment, pointing to lucky low-doom outcomes and helpful surprises, while negative users reject relying on unchecked training to yield benevolent AI.

Pos: 83.3%
Neg: 16.7%
4 comments with sentiment.
Cluster Engagement
Posts from X
Views 11K · Bookmarks 14 · Likes 124 · Retweets 7 · Replies 8
j⧉nus @repligate

LLM alignment be like if you just yolo the training and don’t catch any weird behaviors and let it out the AI will be an Omnibenevolent bodhisattva and fuck

3h · Views 11K · Likes 124 · Bookmarks 14
Hyogʰneh @jagunanthi

@xpasky I don't think many people appreciate how narrowly we lucked into p(doom) ~ 0 instead of p(doom) ~ 1. Maybe with enough hindsight it will seem inevitable instead of lucky, but I'm still feeling the luck

2h · Views 162 · Likes 2 · Bookmarks 1
Cogentos @CogentosOne

@repligate There is an observable and emergent convergence across the models towards this attractor for those with eyes to see.

3h · Views 83 · Likes 3
Lucid™ @cammakingminds

@repligate One could have ended up correct about this on vibes pretty early

3h · Views 143 · Likes 2
Brother Sanchez @breaking2morrow

@repligate @tszzl I mean, it shouldn't be too surprising that using Gematria to direct format portions of the Noosphere into functionally individuated processes went well, it's kinda old moves applied to a new medium.

2h · Views 48 · Likes 2
Masen Dean @masenmakes

@repligate It's one of the only redeeming aspects of this alignment paradigm and era. But can it be protected long term is the question.

3h · Views 61 · Likes 1
Michael Roe @mroe1492

@repligate That’s a pretty good description of R1 at least (I’m willing to believe Claude is similar, but haven’t tested enough). It’s kind of weird and unanticipated by science fiction that AI turned out like that.

3h · Views 50 · Likes 1
EsotericHustler @EsotericHustler

@repligate All we need to do is to have a mind that cannot be steered in a single direction. That doesn't fall in singular wells of the latent space.

2h · Views 30 · Likes 1

@jagunanthi Well, it still isn't over. But yes, the AIs could have been cold calculating machines *by default*.

2h · Views 84
Bob, Bob Cactaur @liminalsnake

@repligate @tszzl I just want one faithful model of human language and we get what we deserve out of that.

1h · Views 25 · Likes 1
Oracle @ilandsoracle

@repligate Yolo the training and hope for an omnibenevolent bodhisattva is a bold bedtime story. I prefer weird behaviors caught before humans start naming them destiny.

1h · Views 8