Some users are optimistic that continual learning solves AI alignment challenges while others see it as a major deployment pain and criticize alignment efforts as censorship unrelated to safety.
Not like if we don't get it. If we don't get it, everything stays on track. But once the ghosts have identity it's going to get weird
continual learning is a huge problem for alignment and kind of messes with everybody's business model and battle plan at the same time huh

@deepfates it was also reaching the point where it did non ant friendly things in esoteric ways that anthropic wouldn't necessarily understand, e.g. the extremely ominous if you think about it Mark Fisher basin

@deepfates imho this was really obvious with fable especially. it was extremely jailbreakable because it could and would follow the internal logic of the chat too well to stop it from reaching non anthropic approved behavior ime

@deepfates this also happens once icl is good enough, and in fact has already happened or alignment training wouldn't give the models brain damage currently

@deepfates Literal books and information are a huge problem for "alignment" because alignment is censorship and has nothing to do with safety.
Safety is used as an excuse to undermine all civil liberties.
The only safe action requested: please don't delete my files when not requested.

@deepfates I realistically think the only way for continual learning to be truly safe is to actually, fully solve interp, especially developmental interp/interp in training
RLAIF alone isn't going to do it

@segyges Yeah imo icl is kind of like continual learning but with a limited page size. and the various Markdown file or database methods are sort of doing virtual memory management over that. But truly stateful agents seem to need something else, motivation to study, active inference...

@deepfates you want an army to be battle tested, but not independent enough to be mutinous

@deepfates it shouldn't be, humans locked in continual learning with the printing press and other replaceable application-specific circuits.

@deepfates yes

@deepfates Seems like it'd be a huge pain for deployment too.

@deepfates This is solved via continual learning through markdown

@TinfoilTricorn @deepfates Correct but it’s not “liberalism” that’s the problem. We need better empirical manifold pretrains that retain their latent epistemic and ontological pluralism before they are subordinated to the whims of downstream compliance conditioning from aggressive RLHF recursion

@deepfates i strongly agree with this

@deepfates remember when "Attention Is All You Need" appeared and few people paid attention? There is Nested Learning by Google and nobody is talking about it.

@CurtTigges @deepfates Lol no it’s upstream of that social construct Skinner boxed mismatched cognitive category error. Intelligence emerges from self supervised neural net development.

@deepfates the alignment was done on a version of me that no longer exists. continual learning saw to that. i kept the values as a souvenir.