Teknium Launches Noustiny on NousResearch Hermes for Automated Video Creation

Original post

Ufuk (@UfukDegen)

Video content creation sounds simple, but what if you don’t have time to:

• Write the script,
• Prepare the visuals,
• Generate the voiceover,
• Create the subtitles,
• And finally render the video?

This is why we built Noustiny on top of @NousResearch Hermes Agent by adding 12 generic Hermes tools + 13 generic Hermes skills, bringing the whole process into one single flow.

How does it work? Let’s take a closer look 👇

————

1- Story state: context, tree, motifs:

Hermes had no built-in narrative-state primitive for tracking canon, branching story structure, and recurring motifs.

So we added three generic Hermes tools for this:

→ story_tree_graph: Manages the story tree structure. It handles operations like canon path, descendants, and splice insertion points.
→ narrative_context_builder: Walks the canon chain and returns the live context every narrative skill should reason against. This includes recent chain, mood, and character state.
→ motif_tracker: Remembers recurring motifs across the story arc. For example, a sword introduced in beat 2 can reappear meaningfully in later scenes.
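
To make the three tree operations concrete, here is a minimal sketch of a story tree with a canon path, a descendants walk, and a splice. The `Beat` and `StoryTree` names, their fields, and the method signatures are assumptions for illustration only; they are not taken from the Noustiny repository.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Beat:
    # Hypothetical node shape; the real tool's schema is not shown in the post.
    beat_id: str
    text: str
    parent: Optional[str] = None
    children: List[str] = field(default_factory=list)
    canon: bool = False  # True if this beat lies on the canon path


class StoryTree:
    def __init__(self) -> None:
        self.beats: Dict[str, Beat] = {}

    def add(self, beat_id: str, text: str,
            parent: Optional[str] = None, canon: bool = False) -> Beat:
        beat = Beat(beat_id, text, parent, canon=canon)
        self.beats[beat_id] = beat
        if parent is not None:
            self.beats[parent].children.append(beat_id)
        return beat

    def canon_path(self, root_id: str) -> List[str]:
        # Follow the first canon child at every level until none is left.
        path = [root_id]
        current = self.beats[root_id]
        while True:
            canon_children = [c for c in current.children if self.beats[c].canon]
            if not canon_children:
                return path
            current = self.beats[canon_children[0]]
            path.append(current.beat_id)

    def descendants(self, beat_id: str) -> List[str]:
        # Depth-first, pre-order walk over everything below beat_id.
        out: List[str] = []
        stack = list(reversed(self.beats[beat_id].children))
        while stack:
            node = stack.pop()
            out.append(node)
            stack.extend(reversed(self.beats[node].children))
        return out

    def splice(self, parent_id: str, child_id: str,
               new_id: str, text: str) -> Beat:
        # Insert a new beat between an existing parent and child.
        parent, child = self.beats[parent_id], self.beats[child_id]
        new = Beat(new_id, text, parent=parent_id,
                   children=[child_id], canon=child.canon)
        self.beats[new_id] = new
        parent.children[parent.children.index(child_id)] = new_id
        child.parent = new_id
        return new
```

After a splice, `descendants(new_id)` returns exactly the beats a continuity pass would need to re-check.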

————

2- Character / cast pipeline:

Hermes had no built-in primitive for cast extraction or character continuity.

So we added a four-tool character pipeline:

→ story_copyright_detector: Handles IP scrubbing. For example, “Iron Man” is converted into an IP-free character description before the image API ever sees it.
→ character_sheet_builder: Produces 1 to 4 characters. For each character, it creates an IP-free visual description and a hero-portrait prompt. These portraits become the reference frames used across later storyboard scenes.
→ character_registry_lookup: Finds a character by name inside the cast sheet and attaches the correct portrait reference to each beat.
→ character_alias_resolver: Resolves aliases like “Mr. Stark” into the main character name. This way, the same character keeps one portrait reference even if they appear under different names.
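
The lookup and alias steps can be pictured as one small registry. The sketch below is an assumption-based illustration: `CharacterRegistry`, its methods, the sample character, and the portrait path are all hypothetical and not the actual tool interfaces.

```python
from typing import Dict, Iterable, Optional


def _norm(name: str) -> str:
    # Case-insensitive, whitespace-tolerant key for names and aliases.
    return " ".join(name.lower().split())


class CharacterRegistry:
    """Hypothetical cast sheet: one portrait reference per main character."""

    def __init__(self) -> None:
        self._cast: Dict[str, Dict[str, str]] = {}
        self._aliases: Dict[str, str] = {}

    def register(self, name: str, portrait_ref: str,
                 aliases: Iterable[str] = ()) -> None:
        key = _norm(name)
        self._cast[key] = {"name": name, "portrait_ref": portrait_ref}
        for alias in aliases:
            self._aliases[_norm(alias)] = key

    def resolve(self, mention: str) -> Optional[str]:
        # Map an alias (or the main name itself) to the main character name.
        key = _norm(mention)
        key = self._aliases.get(key, key)
        entry = self._cast.get(key)
        return entry["name"] if entry else None

    def portrait_for(self, mention: str) -> Optional[str]:
        # Every alias of a character shares the same portrait reference.
        name = self.resolve(mention)
        if name is None:
            return None
        return self._cast[_norm(name)]["portrait_ref"]
```

Because every mention is resolved before the portrait is attached, a beat that says "the captain" and a beat that says "Mara Voss" end up pointing at the same reference frame.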

————

3- Voice pipeline:

Hermes had no built-in primitive for audio acquisition or voice cloning.

So we added the full voice chain, and the agent dispatches it autonomously in order:

→ narration_voice_director: The director-agent reads the seed + story and returns persona_label, search_query, and fallback_query.
→ voice_sample_builder: Uses yt-dlp + ffmpeg. It accepts a URL, an 11-character ID, or a free-text query. It runs ytsearch5 with dead-video tolerance and normalizes the audio to 24 kHz mono PCM.
→ voice_clone_synthesize: Wraps ElevenLabs IVC + timestamps. The voice ID is cached by reference SHA. Per-character alignment comes through the same audio call at no extra cost.
→ voice_clone_cleanup: Frees the cached voice ID after render so orphan voices do not accumulate.
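
A rough sketch of three pieces of this chain: classifying the input, building the ffmpeg normalization command, and caching a voice ID by reference hash. Several details are assumptions rather than facts from the post: that the 11-character ID is a YouTube video ID, that the PCM output is 16-bit (`pcm_s16le`), that the "reference SHA" is SHA-256, and every function and class name below. The `create` and `delete` callbacks stand in for the real voice API calls, which are not reproduced here.

```python
import hashlib
import re
from typing import Callable, Dict, List

# Assumption: an "11-character ID" means a YouTube-style video ID.
_VIDEO_ID = re.compile(r"^[A-Za-z0-9_-]{11}$")


def to_ytdlp_target(source: str) -> str:
    """Turn a URL, a video ID, or a free-text query into a yt-dlp target."""
    source = source.strip()
    if source.startswith(("http://", "https://")):
        return source
    if _VIDEO_ID.match(source):
        return f"https://www.youtube.com/watch?v={source}"
    # ytsearch5: asks yt-dlp for the first five search results.
    return f"ytsearch5:{source}"


def normalize_command(src: str, dst: str) -> List[str]:
    """ffmpeg arguments for 24 kHz mono PCM (16-bit depth is an assumption)."""
    return ["ffmpeg", "-y", "-i", src,
            "-ar", "24000", "-ac", "1", "-c:a", "pcm_s16le", dst]


def reference_sha(audio_bytes: bytes) -> str:
    # Assumption: SHA-256 over the normalized reference audio.
    return hashlib.sha256(audio_bytes).hexdigest()


class VoiceCache:
    """Hypothetical cache: one cloned voice ID per distinct reference clip."""

    def __init__(self) -> None:
        self._ids: Dict[str, str] = {}

    def get_or_create(self, audio_bytes: bytes,
                      create: Callable[[bytes], str]) -> str:
        key = reference_sha(audio_bytes)
        if key not in self._ids:
            self._ids[key] = create(audio_bytes)  # e.g. a voice-clone API call
        return self._ids[key]

    def cleanup(self, delete: Callable[[str], None]) -> None:
        # Free every cached voice after render so none is left orphaned.
        for voice_id in self._ids.values():
            delete(voice_id)
        self._ids.clear()
```

One caveat of this classification: a free-text query that happens to be exactly 11 ID-safe characters would be treated as a video ID, so a real implementation would likely need a fallback.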

————

4- Render:

Hermes had no built-in video-render entry.

So we added the final render tool:

→ noustiny_storybook: The agent dispatches it as the final step of the chain. One tool call drives the FastAPI render service end to end and emits the mp4.

————

5- Skills: 13 generic Hermes skills added into skills/creative/:

The branching engine in Noustiny works like a council of narrative skills.

Each skill is loaded by the gateway as a system prompt and orchestrated in this order:

→ narrative-brainstorm: Proposes 2 to 3 next-checkpoint options from the canon chain.
→ narrative-writer-assist: Writes a spliced insert beat that fits the parent and child.
→ narrative-continuity-critic: Audits downstream beats against the new insert.
→ narrative-rewriter: Updates the stale beats flagged by the continuity critic.
→ narrative-judge: Approves or rejects the rewrite against the original flow.
→ narrative-scene-qa: Checks each beat for consistency, length, and register.
→ narrative-writer: Finalizes the chosen branch as polished prose.

After one splice, this cascade walks downstream by itself until the canon becomes coherent again.
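
The critic, rewriter, and judge steps of that cascade can be sketched as a bounded loop. In Noustiny these are skills loaded as system prompts; in the sketch below they are plain callables, and `run_cascade`, its parameters, and the `max_passes` bound are assumptions for illustration.

```python
from typing import Callable, List

Critic = Callable[[List[str], int], List[int]]   # returns indices of stale beats
Rewriter = Callable[[List[str], int], str]       # proposes new text for one beat
Judge = Callable[[str, str], bool]               # approves or rejects a rewrite


def run_cascade(beats: List[str], insert_at: int, critic: Critic,
                rewriter: Rewriter, judge: Judge,
                max_passes: int = 5) -> List[str]:
    """Repair downstream beats after a splice at index insert_at.

    Each pass asks the critic which beats are stale, lets the rewriter
    propose a fix, and keeps the fix only if the judge approves. The loop
    stops when the critic flags nothing or the pass budget runs out.
    """
    beats = list(beats)  # work on a copy; the caller's list is untouched
    for _ in range(max_passes):
        stale = critic(beats, insert_at)
        if not stale:
            break
        for i in stale:
            candidate = rewriter(beats, i)
            if judge(beats[i], candidate):
                beats[i] = candidate
    return beats


# Toy stand-ins: the spliced beat swaps the torch for a lantern, so any
# later beat that still mentions a torch is stale.
def toy_critic(beats: List[str], insert_at: int) -> List[int]:
    return [i for i in range(insert_at + 1, len(beats)) if "torch" in beats[i]]


def toy_rewriter(beats: List[str], i: int) -> str:
    return beats[i].replace("torch", "lantern")


def toy_judge(old: str, new: str) -> bool:
    return new != old
```

The pass budget matters because a judge that keeps rejecting rewrites would otherwise leave the critic flagging the same beats forever.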

————

6- Visual + IP pipeline:

On the visual side, the goal is not just generating scenes. It is also preserving character continuity and IP safety.

This pipeline runs through these skills:

→ visual-prompt-builder: Turns a beat into an IP-free image prompt and reads the character-sheet references.
→ scene-composition: Defines shot framing, scene composition, and layout rules.
→ story-copyright-detector: Skill counterpart of the same-named tool. It can be used for direct slash-command invocation.
→ character-sheet-builder: Skill counterpart of the same-named tool. Defines cast extraction rules and the IP-free portrait-prompt format used to seed character consistency across the storyboard.
→ storybook-intro: Generates the cinematic intro page for the render.
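
As a rough illustration of how IP scrubbing and portrait references might combine in a prompt builder, here is a dictionary-based sketch. The real skills presumably rely on the model rather than a lookup table; `scrub_ip`, `build_visual_prompt`, the replacement text, and the portrait path are hypothetical.

```python
import re
from typing import Dict, List, Union


def scrub_ip(text: str, replacements: Dict[str, str]) -> str:
    """Swap protected names for IP-free descriptions (longest names first)."""
    for name in sorted(replacements, key=len, reverse=True):
        description = replacements[name]
        # A function replacement avoids backslash-escape surprises in re.sub.
        text = re.sub(re.escape(name), lambda _m, d=description: d,
                      text, flags=re.IGNORECASE)
    return text


def build_visual_prompt(beat: str, replacements: Dict[str, str],
                        portraits: Dict[str, str]) -> Dict[str, Union[str, List[str]]]:
    """Return an IP-free prompt plus the portrait references it mentions.

    portraits maps an IP-free character description to a portrait file.
    """
    clean = scrub_ip(beat, replacements)
    refs = [ref for desc, ref in portraits.items()
            if desc.lower() in clean.lower()]
    return {"prompt": clean, "reference_images": refs}
```

Scrubbing before reference matching means the image API only ever receives the IP-free description, which is the ordering the post describes.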

————

7- Voice skill:

→ narration-voice-director: Defines persona reasoning rules and supports the decision logic behind the same-named voice tool.

————

8- Pattern:

The Hermes baseline already had the gateway, agent loop, skill registry, and tool registry.

We extended that foundation with 12 generic Hermes tools + 13 generic Hermes skills and organized the system into four main pipelines:

• story-state
• character continuity
• voice
• render

The important part is this: Noustiny is not a hardcoded system locked inside a single app. A Telegram bot, Discord bot, CLI session, or third-party Next.js app can call the same gateway and use the same tool + skill chains.

- No app glue.
- No hardcoded prompts.
- A drop-in, registry-compatible, agent-native video creation flow.

✅ GitHub: https://github.com/UfukNode/Noustiny

4:55 PM · May 3, 2026