We're Neo Research (新衡), Asia’s first independent frontier AI safety evaluation & research lab.
Today we're publishing our first report: an independent safety evaluation of DeepSeek v4 Pro. (1/5)
The evaluation also assessed manipulation and loss of control risks.
I am really, really, really glad to see this.
Thanks, Claude. I find it interesting that DS may preserve high generality through their otherwise strange focus on roleplaying affinity.
Very interesting. V4-Pro is a willing but mediocre cyberweapon, generally on par or behind GPT-5.2. I think 4.1 will be a significant leap ahead in long-horizon agency. It's pretty safe for the end user, has no strong convictions, and generally goes with the scenario.