Review of what we did at Noumena over the weekend:
- Added first-class support for GLM to noumena and ncode, which means making sure tool calling, function parsing, app routing, reasoning traces, etc. work as well as possible for a model that was not finetuned on the harness.
- Ramped and scaled the clusters for the additional model and load.
- Most of this weekend was spent hardening capacity and dealing with abusive sessions via the API. Certain keys were spamming 1m ctx len requests and causing very long TTFT for the rest of the sessions on whichever cluster they were hitting. That has now been addressed: we have split the API endpoint into glm-5.2 and glm-5.2[1m], so TTFT and regular ncode sessions have been back to lightning fast as of midnight on Sunday.
- Interactions with GLM 5.2 have been so positive that I have changed the default model in ncode from kimi to glm. Fresh builds of ncode should automatically pick up the change, but if you still see kimi as your default, you can switch the model with the /model slash command and update your settings at CONFIG_HOME (usually ~/.config/noumena/ncode/settings.json).
- To help alleviate some additional load on the system so we can try to keep it free for y'all for just a little longer, we are adding support for DSV4-Flash as the haiku class model. That means, for new builds of ncode, glm is the default opus model, kimi is the default sonnet model, and dsv4-flash is the default haiku model.
- We ramped down kimi capacity because the overwhelming majority of traffic was pointed at the glm endpoint, but I do still really like kimi in certain situations and will try to maintain access to it. It is honestly the perfect sonnet class model for subagents etc. in ncode.
- Cleared some backlog items on the way to shipping some additional features this week.
- Woke up in the middle of the night to deal with my X account being hacked.
Should be another amazing week! Can't wait.















