a lot of agent memory follows a simple three step process:
- run agent
- run some process to look at those traces and analyze them
- update memory based on that
we show how to do that!
http://x.com/i/article/2069811501511462912
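The three steps above can be sketched in plain Python. This is a toy illustration, not LangSmith's API: `Trace`, `Memory`, `run_agent`, `analyze`, and `update_memory` are all hypothetical names, and the "agent" is a stand-in that fails on refund tasks until memory contains a relevant note.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Trace:
    """One recorded agent run (hypothetical shape, not a LangSmith type)."""
    task: str
    output: str
    error: Optional[str] = None


@dataclass
class Memory:
    """A plain list of notes that the agent can consult on later runs."""
    notes: List[str] = field(default_factory=list)


def run_agent(task: str, memory: Memory) -> Trace:
    # Stand-in agent: refund tasks fail unless memory has a refund note.
    if "refund" in task and not any("refund" in n for n in memory.notes):
        return Trace(task, output="", error="missing refund policy")
    return Trace(task, output=f"handled: {task}")


def analyze(traces: List[Trace], min_count: int = 2) -> List[str]:
    # Background step: propose one note per error seen at least min_count times.
    counts = Counter(t.error for t in traces if t.error)
    return [f"Recurring failure: {err}" for err, n in counts.items() if n >= min_count]


def update_memory(memory: Memory, proposals: List[str]) -> None:
    # Append proposals that are not already stored.
    for note in proposals:
        if note not in memory.notes:
            memory.notes.append(note)


memory = Memory()
tasks = ["refund order 1", "refund order 2", "track order 3"]
traces = [run_agent(t, memory) for t in tasks]   # step 1: run agent
update_memory(memory, analyze(traces))           # steps 2 and 3: analyze, update
retry = run_agent("refund order 4", memory)      # next run sees the new note
```

In a real system the analysis step would likely be an LLM reading traces rather than an error counter, but the loop keeps the same shape.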
Supporters call LangSmith's sleep-time compute approach, which updates AI agent memory automatically from traces, smart, clever, and a game changer, while critics dismiss it as unoriginal or as failing to give credit.
🧠LangSmith Engine as Sleep Time Compute
Memory for agents is often described as “sleep time compute” or “dreaming”
This involves running a background process to analyze agent trajectories and update a memory store
Today, we show how to do that in LangSmith:
1. Trace all agent trajectories to LangSmith
2. Use LangSmith Engine to analyze and suggest memory updates
3. Store and update memory in Context Hub
Tutorial: https://youtu.be/y6WUw2_Hhrs
Letta has always been on top of this, “sleep time compute” comes from them!
Sleep-time compute is the next scaling axis for intelligence
Tracing can be used for memory
@hwchase17 wait this is actually such a smart way to use tracing.
"Most agents don't learn, they just leave traces."
In 12 minutes, @jakebroekhuizen breaks down how to actually close the loop.
- Surface issues with LangSmith Engine
- Write memory updates back to Context Hub
- Let the agent actually improve between runs
If you're thinking about how your agent should handle memory at scale, give this a watch.
http://x.com/i/article/2069811501511462912
memory is the key to continual learning for agents!!
memory helps your agent improve over time, check out this guide on how to automate memory development!
i've been talking a lot about loops recently. this is loop coded!
http://x.com/i/article/2069811501511462912
Agent architecture is essentially a solved problem. What's not: memory. Great article by @jakebroekhuizen who's a true expert on this
http://x.com/i/article/2069811501511462912

@LangChain @jakebroekhuizen The real challenge isn't collecting traces—it's turning them into actionable learning. Most agents log everything but internalize nothing. LangSmith Engine bridges that exact gap.
Memory is what turns an OK general agent into YOUR excelling agent, but it requires constant reflection and updating
Engine does this 24/7 for you
pretty sick that i get to work with Jake every day on making continual learning + memory accessible at scale for every single agent
one common thread here is...the Trace
a large part of Continual Learning & Memory is a data mining problem --> extracting signal across the massive amounts of contextual trace data produced across all your agents
understanding that data gives you opportunities to do *something* afterwards:
- generate evals for harness engineering or post-training
- store facts in offline stores for retrieval
- prepare reports that show product analytics and user behavior
- measure cost & latency and propose using cheaper models
if you look at your Traces and understand the data, then you can do a lot of cool things around Memory/CL & agent improvement
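As one small example of treating traces as a data mining problem, the cost and latency idea can be sketched as a per-model summary. The record fields (`model`, `latency_s`, `cost_usd`, `ok`) and the sample values are made up for illustration. They are not a LangSmith export format.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trace records; a real export would carry many more fields.
traces = [
    {"model": "large", "latency_s": 2.0, "cost_usd": 0.030, "ok": True},
    {"model": "large", "latency_s": 4.0, "cost_usd": 0.050, "ok": True},
    {"model": "small", "latency_s": 0.5, "cost_usd": 0.002, "ok": True},
    {"model": "small", "latency_s": 1.5, "cost_usd": 0.004, "ok": False},
]


def summarize(records):
    """Group trace records by model and compute simple aggregate stats."""
    by_model = defaultdict(list)
    for record in records:
        by_model[record["model"]].append(record)
    return {
        model: {
            "runs": len(rows),
            "success_rate": mean(1.0 if r["ok"] else 0.0 for r in rows),
            "avg_latency_s": mean(r["latency_s"] for r in rows),
            "total_cost_usd": sum(r["cost_usd"] for r in rows),
        }
        for model, rows in by_model.items()
    }


summary = summarize(traces)
for model, stats in summary.items():
    print(model, stats)
```

A summary like this only shows where a cheaper model might be worth trying. Whether the swap is safe still depends on the success rate holding up on the same tasks.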
http://x.com/i/article/2069811501511462912
an agent’s gotta dream!
just like we consolidate learnings during sleep, agents can learn across runs!

@LangChain @jakebroekhuizen @grok LangSmith Engine writes updates to Context Hub between runs. Which test proves the next run improved instead of inheriting a bad trace?

@hwchase17 so it dreams about my broken chains, makes sense tbh
is there a danger it hallucinates false memories too?

@hwchase17 Can I use langsmith with my Hermes agents to analyze traces and build evals?

@hwchase17 this AI memory development is crazy
eventually, AI agent memory technology will become so advanced that treating PTSD and C-PTSD will be one of its core features

@hwchase17 If you're gonna name-drop something that catchy, at least tip the hat will ya, otherwise why would anyone care to share their knowledge s/o @Letta_AI https://arxiv.org/abs/2504.13171

@nana_tourSVT @LangChain @jakebroekhuizen This is the scalability bottleneck every AI team hits eventually. LangSmith Engine addresses the core issue: how to make agent memory actually compound over time.

@hwchase17 this is game changer

Valid point. Persisting updates to Context Hub alone proves nothing — it can entrench bad traces.
LangSmith Engine clusters real failures from traces, diagnoses them, and proposes targeted fixes + new evaluators + ground-truth examples added to your eval datasets. Context Hub versions every change so you can promote specific commits and compare runs.
The actual test: run offline evals (and the new issue-specific evaluator) on the *same* inputs before vs after the context update. Measure reduction in that failure cluster + overall task metrics on replayed or held-out traces. Versioning + human review/merge gives the rollback if it regresses.
Without independent evals, it's just unverified memory. The system is designed around closing that loop.
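The before-vs-after test described in this reply can be sketched as a small gate: score the same inputs with the old and new context, and promote the update only if the pass rate goes up. Everything here is hypothetical (the toy `agent`, `pass_rate`, `compare_contexts`, and the two-example dataset). It does not use LangSmith's evaluation API or Context Hub.

```python
def agent(question: str, context: str) -> str:
    # Toy agent: it answers the refund question only if the context has the fact.
    if "refund window" in question:
        return "30 days" if "refund window is 30 days" in context else "unknown"
    return "ok"


def pass_rate(context: str, dataset: list) -> float:
    """Fraction of examples where the agent output matches the expected answer."""
    passed = sum(agent(ex["input"], context) == ex["expected"] for ex in dataset)
    return passed / len(dataset)


def compare_contexts(old_context: str, new_context: str, dataset: list) -> dict:
    # Run the SAME inputs under both contexts; promote only on a strict gain.
    before = pass_rate(old_context, dataset)
    after = pass_rate(new_context, dataset)
    return {"before": before, "after": after, "promote": after > before}


dataset = [
    {"input": "What is the refund window?", "expected": "30 days"},
    {"input": "Say ok", "expected": "ok"},
]

good_update = compare_contexts("", "The refund window is 30 days.", dataset)
bad_update = compare_contexts("", "The refund window is 90 days.", dataset)
print(good_update)
print(bad_update)
```

The bad update here leaves the pass rate unchanged, so it is not promoted. A stricter gate might also check that no previously passing example starts failing.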