Some personal news: I've started a new AI safety standards org, and our first two standards are out today.
We're called Guidelight, co-founded with fellow ex-OpenAI safety researcher, Page Hedley. (1/n)
Adler and Hedley also co-authored a report on xAI safety for SpaceX investors.
Positive users praise Guidelight's new AI safety standards as concrete, practical, and much needed, while negative users question the focus on xAI.
I'm pretty excited about Guidelight.
Each of Guidelight's standards is built around concrete practices, supported by experts, that help to achieve an important principle.
The standards give AI companies a target to meet, and an incentive for all to be safer. Read more at our site: https://www.guidelight.ai/standards

@OutsourcedLogic thanks! yeah Principle 4 has some stuff to that effect, though it's tricky. for instance, we ask companies to define what their set of 'absolute' boundaries is, which are strict human-in-the-loop. any takes on things that should definitely be in that category?
very reasonable proposals!
those are great proposals!
Very excited about the new org and the Control standard in particular.
I think it's both very reasonable and also quite feasible to implement!
Steven and Page are two of the wisest, most capable and most experienced people in this space. I'm tremendously excited to see their work on AI safety standards and practices - much needed, and no better people for the job.
I'm a big fan of these proposals! They are concrete, actionable steps frontier AI companies can take *today* to preserve control of their internal agents.
Hard to think of someone better placed for this work. Congrats @sjgadler!

@sjgadler one small add - explicit call-out for human in the loop, especially when a human is being impacted by the choice the AI will make

Website: http://guidelight.ai Standards: http://guidelight.ai/standards xAI report: http://spacexai-risks.org
@S_OhEigeartaigh this is very very kind, thank you - blushed a bit at it

@wrenclay w00t! especially in the market for takes on 'next areas we should cover'

Ryan and Redwood's work has been a big inspiration for me in thinking about important practices, yeah :-) we link to some of it in one of the directions-for-development (specifically about 'making deals')
If you use this expanded link, it's easier to see the full details: https://www.guidelight.ai/control?expand=true

@mealreplacer I feel you, I’d fully logged out on my phone for like a week, and yet here I am

@wyatt_benno Yup! I'm especially excited about tamper-evident logging (it's under Principle 1). I worry that even if there's a serious safety incident, if labs haven't kept records like this along the way, they'll be disbelieved :/ wrote a bit here https://www.lesswrong.com/posts/ETpRwxFfuBYo7JMyd/sjadler-s-shortform?commentId=vuhqXqvHn9jSbm99g
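For readers unfamiliar with the term, here is a minimal sketch of one common way to make a log tamper-evident: a hash chain, where each entry's hash covers the previous entry's hash. This is an illustration only, not Guidelight's specification or any lab's implementation; the names `append_entry`, `verify_log`, and `GENESIS` are hypothetical.

```python
import hashlib
import json

# Hypothetical starting value for the chain (an assumption of this sketch).
GENESIS = "0" * 64


def _digest(prev_hash, payload):
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(log, payload):
    """Append a record whose hash depends on every record before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {
        "prev": prev_hash,
        "payload": payload,
        "hash": _digest(prev_hash, payload),
    }
    log.append(entry)
    return entry


def verify_log(log):
    """Return True only if no entry was edited, reordered, or removed mid-chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Editing any earlier record changes its hash and breaks every later link, so `verify_log` fails. A bare chain like this does not detect entries dropped from the end; in practice the latest hash would also need to be published or countersigned somewhere the log's owner cannot rewrite.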

@sjgadler @idavidrein I read some. Looks really good. I’m very impressed. Are you evaluating models or companies? I would consider making an eval / chart showing OpenAI, Anthropic, etc. and how they’ve done on your guidelines over time.
@GarrisonLovely this is very kind of you, thank you :-)
I've consistently been impressed with Steven's writing, clarity of thought, and expertise. It's really valuable to have ex-insiders be starting orgs like this, and I'm excited to see what they do!

This is interesting! Diving in on the verification part… did you know that with cryptography you can make succinctly verifiable proofs? I.e., prove that a guardrail ran not only by checking logs and such, but with a proof that verifies in under 1s for potentially thousands of guardrail checks.
It’s always fun to talk cryptography with the AI labs… as the two worlds need more communication.