Jun 10, 2026
Browser extension that captures M3U8 HLS live streams and writes them to local disk
It intercepts browser network requests to collect the video segments listed in live playlists and concatenates them into a single file on the user's machine. The technique mirrors long-standing download utilities and browser extensions and introduces no new primitives, so adoption will likely stay confined to occasional users who need to archive specific web streams.
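The capture-and-concatenate idea can be sketched in a few lines of Python. This is only an illustration, not the extension's code: the function names, the `fetch` callable, and the choice to de-duplicate by URL are assumptions. It also assumes MPEG-TS segments, which can be joined byte for byte; fragmented MP4 streams would additionally need their initialization segment.

```python
from urllib.parse import urljoin


def segment_urls(playlist_text, playlist_url):
    """Return absolute segment URLs from a media playlist, in playlist order."""
    urls = []
    for line in playlist_text.splitlines():
        line = line.strip()
        # Lines starting with '#' are tags or comments; other non-empty lines are URIs.
        if line and not line.startswith("#"):
            urls.append(urljoin(playlist_url, line))
    return urls


def append_new_segments(urls, seen, fetch, out):
    """Write segments not seen before to `out` and return how many were written.

    `fetch` is a hypothetical callable mapping a URL to its bytes; `seen` is a
    set of URLs already written, needed because a live playlist is a sliding
    window that repeats earlier segments on each refresh.
    """
    written = 0
    for url in urls:
        if url in seen:
            continue
        out.write(fetch(url))
        seen.add(url)
        written += 1
    return written
```

Calling `append_new_segments` once per playlist refresh with the same `seen` set and output file keeps the output free of repeated segments.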
Manage shared AI coding agent skills in Notion databases and sync them locally across tools
The CLI installs a Notion-backed store that maps database pages to SKILL.md files and symlinks them into agent-specific directories for Claude, Cursor and others. This centralizes collaboration on skills without git workflows, appealing mainly to teams already using multiple AI coding agents who want live editing and selective installs rather than broad developer adoption.
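The store-plus-symlink layout might look roughly like the sketch below. The directory layout, function names, and one-folder-per-skill convention are assumptions made for illustration; the Notion sync itself is left out, and the CLI's real paths may differ.

```python
from pathlib import Path


def write_skill(store, name, body):
    """Materialize one skill as <store>/<name>/SKILL.md (hypothetical layout)."""
    skill_dir = Path(store) / name
    skill_dir.mkdir(parents=True, exist_ok=True)
    path = skill_dir / "SKILL.md"
    path.write_text(body, encoding="utf-8")
    return path


def link_skill(store, name, agent_dir):
    """Symlink the stored skill folder into one agent's skills directory."""
    agent_dir = Path(agent_dir)
    agent_dir.mkdir(parents=True, exist_ok=True)
    link = agent_dir / name
    if link.is_symlink():
        link.unlink()  # replace a stale link from an earlier install
    link.symlink_to(Path(store) / name, target_is_directory=True)
    return link
```

Because every agent directory points at the same stored folder, rewriting the file in the store updates all linked agents at once, and installing selectively amounts to choosing which links to create.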
Downloader and interactive viewer for the Nymeria multimodal egocentric human motion dataset
The repository supplies Python CLIs that fetch the 80 TB synchronized multi-device recordings, MHR/SMPL meshes, 3D object boxes, and language annotations, then render them together in a real-time viewer. Its unprecedented scale and first-of-kind egocentric multimodal capture make it a foundational resource for embodied-AI and AR/VR research groups rather than a general-purpose developer tool.
Trajectory-refined distillation trainer for on-policy LLM distillation with teacher-guided rollout revision
The implementation builds on verl to prepare rollouts, apply KL losses at the trajectory level, and produce refined targets y_r for both OPD and OPSD pipelines across math and code tasks. The method introduces a new output-refinement stage rather than further token-level loss tweaks, so it will be adopted mainly by research teams already running large-scale distillation experiments.
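One possible reading of a trajectory-level KL term is sketched below in plain Python. The summary does not state the KL direction or how tokens are aggregated, so both are assumptions here: the sketch sums KL(student || teacher) over the token positions of a single rollout, and it assumes the teacher assigns non-zero probability wherever the student does. The repository's actual loss may differ.

```python
import math


def token_kl(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def trajectory_kl(student_dists, teacher_dists):
    """Sum the per-token KL(student || teacher) over one rollout.

    Each argument is a list with one next-token distribution per position,
    so the result is a single scalar for the whole trajectory.
    """
    return sum(token_kl(s, t) for s, t in zip(student_dists, teacher_dists))
```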
FUSE filesystem that infers file contents from LLM latent space using only filenames
It mounts via FUSE so the source tree supplies only names while an LLM backend such as Gemini or Claude is prompted on every read to emit the corresponding bytes, with an in-memory LRU cache and base64 handling for binaries. The approach humorously extends the πfs concept into parametric memory, attracting attention from developers who enjoy infrastructure parodies but offering little path to broad production use.
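The read path described above (prompt the backend on a miss, serve repeats from an in-memory LRU cache, base64-decode binaries) can be sketched as follows. The class name, the `backend` callable, and the `binary` flag are hypothetical; the FUSE wiring and the real LLM call are omitted.

```python
import base64
from collections import OrderedDict


class ReadCache:
    """LRU cache in front of a backend that maps a filename to text."""

    def __init__(self, backend, capacity=128):
        self.backend = backend      # hypothetical callable: filename -> str
        self.capacity = capacity
        self._entries = OrderedDict()

    def read(self, name, binary=False):
        key = (name, binary)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        text = self.backend(name)
        # Binary files are assumed to come back base64-encoded.
        data = base64.b64decode(text) if binary else text.encode("utf-8")
        self._entries[key] = data
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used
        return data
```

A FUSE `read` handler would call `ReadCache.read` with the path being opened, so repeated reads of the same file do not trigger another model request until the entry is evicted.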
Personal GitHub profile README for NVIDIA research scientist in generative 3D and world models
The repository holds a static profile page listing academic credentials at the Spatial Intelligence Lab along with selected publications on LLaMA-Mesh, amortized text-to-3D, and graph metanetworks. Standard personal academic pages serve only narrow professional networking within specialized AI research circles and introduce no reusable technique or broadly applicable artifact.
PyTorch framework for large-scale training of distilled diffusion models across image and video tasks
It implements modular training pipelines for consistency models, distribution matching distillation, self-forcing and related methods with DDP/FSDP2 support on architectures such as EDM, SDXL, Flux and CogVideoX. The codebase aggregates established acceleration techniques into a single extensible toolkit that primarily serves diffusion researchers rather than general developers.
OmniDreams autoregressively generates real-time multi-camera photorealistic video from single RGB frames and HD-map conditioning
It ingests an initial frame plus per-chunk text prompts, coarse HD maps, and trajectory poses to produce video chunks that are fed back as input for long rollouts. The approach refines existing world-model techniques with driving-specific distillation for simulation use cases, limiting adoption to autonomous-vehicle teams rather than general video or robotics workflows.
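The chunked feedback loop can be illustrated with a short sketch. Nothing here is OmniDreams code: `generate_chunk` is a hypothetical stand-in for the model, each condition is assumed to be a dict bundling the prompt, HD map, and poses for one chunk, and feeding back the entire previous chunk as context is an assumption.

```python
def rollout(first_frame, conditions, generate_chunk):
    """Generate a long video by feeding each chunk back as the next input.

    `conditions` holds one entry per chunk (for example a dict with the
    text prompt, HD map, and trajectory poses). `generate_chunk` is a
    hypothetical callable (context_frames, condition) -> list of frames.
    """
    frames = [first_frame]
    context = [first_frame]
    for condition in conditions:
        chunk = generate_chunk(context, condition)
        frames.extend(chunk)
        context = chunk  # the newest chunk conditions the next step
    return frames
```

The rollout length is set only by the number of per-chunk conditions, which is what allows long sequences to grow from a single starting frame.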
High-performance inference and serving library for autoregressive video and world models
It supplies specialized runners, multi-GPU pipelines, and configuration tooling that target models such as Wan2.1 variants, Cosmos-Predict, and OmniDreams on H100-class GPUs. The narrow hardware footprint and domain focus on real-time closed-loop simulation limit adoption to specialized teams in robotics and autonomous-vehicle research rather than broad generative-AI use.
Agent skill that turns raw datasets into verifiable multimodal stories with evidence tracing
The skill orchestrates a fixed seven-role pipeline of specialized agents that sequentially research context, profile data, craft narrative, generate visuals, emit tagged HTML, audit layout, and build an interactive evidence viewer. This structured multi-agent workflow is novel for data journalism yet targets a narrow audience of analysts and journalists who already work inside coding agents, limiting mass adoption.
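A fixed sequential pipeline of this kind can be sketched as a loop over an ordered list of roles sharing one state dict. The role identifiers below are labels derived from the seven steps listed above, and the handler interface is an assumption; the skill itself drives agents through prompts rather than Python functions.

```python
# Role labels taken from the seven steps in the description (names are illustrative).
ROLES = [
    "research_context",
    "profile_data",
    "craft_narrative",
    "generate_visuals",
    "emit_html",
    "audit_layout",
    "build_evidence_viewer",
]


def run_pipeline(dataset, handlers):
    """Run every role in fixed order, passing accumulated state forward.

    `handlers` maps each role name to a hypothetical callable that receives
    the shared state dict and returns that role's output.
    """
    state = {"dataset": dataset, "trace": []}
    for role in ROLES:
        state[role] = handlers[role](state)
        state["trace"].append(role)  # records which step produced what
    return state
```

Keeping each role's output under its own key lets later steps, such as the evidence viewer, look up exactly what an earlier step produced.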
Automated pipeline constructing 879 lifecycle-aware skill poisoning attacks across 12 risk types for LLM agents
Coding agents guided by natural-language harnesses execute target selection, iterative payload design with safety-gated refinement, and reviewer-based quality filtering to produce fixed-payload and self-mutating attack samples that persist across agent sessions. The cross-session mutation technique extends prior poisoning concepts into agent workflows and is therefore most relevant to AI safety and red-teaming teams.
Terminal AI coding agent with multi-provider LLM support and built-in agentic tools
It runs as a Node.js CLI that connects to OpenAI-compatible, Anthropic, or Gemini endpoints through a configurable settings file and exposes Skills plus SubAgents for repository-scale tasks. The approach follows the established pattern of terminal-first LLM coding tools rather than introducing new primitives, limiting its audience to developers already embedded in command-line workflows.
Workflow layer that adds agent teams, structured prompts, and durable state to OpenAI Codex CLI
OMX installs as a global npm package and wraps Codex CLI sessions with predefined commands such as $deep-interview, $ralplan, and $ultragoal that write plans, logs, and artifacts into a .omx directory while optionally launching tmux-based team runtimes. The approach remains a conventional prompt-and-script orchestration layer whose audience stays limited to existing Codex CLI users rather than becoming a general AI-.
Framework for multitask data attribution on multilingual instruction tuning and math reasoning using SFT and GRPO
It implements gradient, kernel mean matching, task vector, and compressed-sensing datamodel attribution methods inside a configurable trainer that runs on the Aya and translated GSM8K datasets. The approach extends existing valuation techniques to post-training regimes rather than pre-training, so adoption is likely limited to research groups already working on data attribution for instruction-tuned models.