The continuity layer for intelligence
Your memories don't just sit there.
They get smarter overnight.
The Dream Engine transforms raw memories into structured knowledge. 9 consolidation strategies. Tournament refinement that kills noise. Runs on cron, on events, or one API call. What every memory system is missing.
The 9 strategies. Named. Explained.
Every other memory system stops at retrieve. The Dream Engine runs nine autonomous consolidation strategies against your namespace — each with a defined purpose, output artifact, and run cadence. Names are stable API identifiers.
strategy: "reflect". The engine compares stored beliefs against downstream outcomes. Output: "Agent believes customers want feature X; actual churn exits cite onboarding friction, not missing features." Gap surfaced explicitly.What the Dream Engine does while you sleep.
At 23:00 local time, the scheduler wakes. The first move is a namespace snapshot — an immutable point-in-time copy of every memory, edge, and confidence score. The snapshot is what the cycle reads from. Your live namespace continues to accept writes without contention. If the cycle fails, the snapshot is discarded and nothing is touched. If it succeeds, the diff lands atomically.
Next, all nine strategies fan out in parallel against the snapshot. Synthesize clusters semantic neighbors. Pattern scans for recurring structures. Contradict cross-references conflicting claims. Reflect compares beliefs against outcomes. Associate links across namespaces. Compress reduces verbose entries. Validate rescores confidence and recency. Evolve feeds superseded hypotheses into tournament rounds. Forecast projects forward from what already exists. Each strategy is a pure function against the snapshot — no cross-talk, no ordering bugs.
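In code terms, the contract looks roughly like this. A conceptual TypeScript sketch, not the SDK: the type names are illustrative, but the shape is the point: every strategy reads the same immutable snapshot and returns candidate insights, so the nine can fan out concurrently.

```typescript
// Illustrative types only — not the REM Labs SDK.
type StrategyName =
  | "synthesize" | "pattern" | "contradict" | "reflect" | "associate"
  | "compress" | "validate" | "evolve" | "forecast";

interface Snapshot {
  memories: { id: string; text: string; confidence: number }[];
}

interface CandidateInsight {
  strategy: StrategyName;
  claim: string;
  sourceIds: string[];   // lineage back to the memories it was derived from
  confidence: number;
}

// Each strategy is a pure function over the snapshot: same input, same output.
type Strategy = (snapshot: Snapshot) => Promise<CandidateInsight[]>;

// Fan all nine out in parallel: no cross-talk, no ordering dependencies.
async function runCycle(
  snapshot: Snapshot,
  strategies: Record<StrategyName, Strategy>
): Promise<CandidateInsight[]> {
  const results = await Promise.all(
    Object.values(strategies).map((run) => run(snapshot))
  );
  return results.flat(); // raw and noisy on purpose — refinement happens next
}
```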
The raw outputs come back noisy. That is expected. A dedupe and conflict-resolve pass collapses near-identical insights, runs pairwise contradictions through the A/B/AB tournament with a blind Borda judge, and discards anything that fails the 0.6 novelty threshold. "No change" is a legal outcome. Slop is caught before it ever touches your namespace.
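A toy sketch of that refinement pass, assuming a simple Borda count over three blind ballots; the real judge, ballot format, and threshold handling are internal and will differ:

```typescript
type Candidate = "A" | "B" | "AB"; // incumbent, challenger, merged

// Each blind judge returns a ranking, best first. Judges see content only,
// never which label is the incumbent.
type Ballot = Candidate[];

function bordaWinner(ballots: Ballot[]): Candidate {
  const scores: Record<Candidate, number> = { A: 0, B: 0, AB: 0 };
  for (const ballot of ballots) {
    // Borda count: first place gets 2 points, second 1, last 0.
    ballot.forEach((c, rank) => (scores[c] += ballot.length - 1 - rank));
  }
  return (Object.entries(scores) as [Candidate, number][])
    .sort((x, y) => y[1] - x[1])[0][0];
}

// "No change" is a legal outcome: if the incumbent wins, or the challenger
// misses the novelty bar, the namespace is left untouched.
function resolve(ballots: Ballot[], novelty: number): Candidate | "no-change" {
  const winner = bordaWinner(ballots);
  if (winner === "A") return "no-change";
  if (novelty < 0.6) return "no-change"; // hypothetical novelty check
  return winner;
}

// Two judges prefer the merged candidate, one prefers the incumbent.
console.log(resolve([["AB", "A", "B"], ["AB", "B", "A"], ["A", "AB", "B"]], 0.82)); // "AB"
```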
Surviving insights are written back with confidence scores, versioned, and linked to the source memories they were derived from. Every insight has a lineage graph back to source memories — you can always trace a claim to the exact entries that produced it, diff any two cycles, and roll back a bad dream without losing good ones. Knowledge is inspectable, editable, reversible.
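Conceptually, each written-back insight carries something like this record. Field names here are illustrative, not the wire format:

```typescript
// Illustrative record shape, not the actual storage schema.
interface StoredInsight {
  id: string;
  version: number;        // bumped each cycle the insight is refined
  cycleId: string;        // which dream cycle produced this version
  claim: string;
  confidence: number;
  derivedFrom: string[];  // lineage: ids of the source memories
  supersedes?: string;    // previous version, kept for diffing and rollback
}

// Tracing a claim is a walk over derivedFrom; rolling back a bad dream
// is restoring the supersedes chain.
function lineage(insight: StoredInsight, memories: Map<string, string>): string[] {
  return insight.derivedFrom.map((id) => memories.get(id) ?? `<missing ${id}>`);
}
```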
The cycle closes by emitting a dream.completed webhook with a structured diff: what was added, merged, refined, flagged, and archived. Downstream automations pick up the signal — Slack posts the morning brief, Notion updates the decision log, the next agent in the pipeline wakes up with a smarter context. You sleep. The namespace gets sharper.
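A minimal receiver for that webhook, assuming an Express endpoint and a diff payload keyed by those five buckets (the exact payload schema here is an assumption):

```typescript
import express from "express";

// Assumed payload shape for dream.completed.
interface DreamCompleted {
  event: "dream.completed";
  namespace: string;
  diff: {
    added: string[];
    merged: string[];
    refined: string[];
    flagged: string[];
    archived: string[];
  };
}

const app = express();
app.use(express.json());

app.post("/dream-results", (req, res) => {
  const payload = req.body as DreamCompleted;
  const { added, flagged } = payload.diff;

  // Fan the structured diff out to downstream automations:
  // morning brief to Slack, decision log to Notion, next agent's context.
  console.log(`${payload.namespace}: ${added.length} new insights, ${flagged.length} flagged`);

  res.sendStatus(200); // acknowledge so the engine doesn't retry
});

app.listen(3000);
```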
Why this is the moat.
Three structural reasons this is hard to replicate — and why it compounds instead of plateauing.
Dream Engine nightly → measurable accuracy gain.
Retrieval accuracy on LongMemEval-style queries, measured against the same namespace at set intervals. The gains compound with every run. This is why nightly dreaming matters.
| Day | Retrieval accuracy | Notes |
|---|---|---|
| Day 1 | 68% | Baseline — raw memories, zero dream cycles |
| Day 7 | 78% | After 7 Synthesize runs — redundancy collapsed, confidence calibrated |
| Day 30 | 94.6% | LongMemEval-comparable — full 9-strategy nightly pipeline active |
| Day 90 | 96.1% | Extrapolated — Evolve + Forecast compound across 90 cycles |
The Dream Engine consolidation methodology.
Day 1 = stock vector search on raw memories. No Dream Engine runs yet. The retrieval stack is cosine similarity over embedded text — the same primitive every memory library ships. Measured accuracy: 68% recall on the 500 questions in the LongMemEval public set.
Every night, the Dream Engine runs 9 strategies on the full memory corpus. Synthesize consolidates semantic neighbors. Pattern extracts recurring signals. Contradict flags conflicts. Compress reduces verbose entries with zero-loss back-translation. Validate rescores confidence and recency. Evolve runs superseded hypotheses into tournament rounds. Associate links across namespaces. Reflect diffs belief against outcome. Forecast projects forward. The namespace gets denser, not larger.
After 30 nights, the same corpus scores 94.6% (473/500) under a byte-exact upstream GPT-4o judge — the same judge configuration the LongMemEval public leaderboard uses. This is the number we publish. No retrieval model was swapped, no questions were re-selected, no answers were post-edited. The only change is what lives in the memory store.
We publish the full test harness, the 500 questions, and the judge configuration at /benchmarks. Reproducible end-to-end — clone the repo, point it at your namespace, run the scorer, get back a number that matches ours to the question.
847 memories in. 23 insights out.
Raw data goes in. Structured knowledge comes out. Here's what a single cycle produces from a real knowledge base.
Five depth levels. Each run advances.
Consecutive runs go deeper, not wider. The engine tracks what it already processed and advances to the next level automatically.
Set a persona or let it auto-detect.
The persona parameter tunes which patterns the engine prioritizes. Auto-detected from content if omitted.
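For example, passing it explicitly (assuming persona rides on the /v1/dream/run body shown later; the value here is illustrative, not a documented preset):

```typescript
// Assumes `persona` is accepted on the /v1/dream/run body; the value
// "support-engineer" is illustrative, not a documented preset.
const res = await fetch("https://remlabs.ai/v1/dream/run", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.REM_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    strategy: "synthesize",
    persona: "support-engineer", // omit to let the engine auto-detect
  }),
});
console.log(await res.json());
```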
Five trigger modes.
Cron is the default. But every mode composes with webhooks and automations.
9 strategies. Chain them or run individually.
Each strategy is a discrete operation. Pass strategies: ["synthesize", "validate", "compress"] to chain them in sequence. Each output feeds the next. Or use full_cycle to run all 9.
Out: "Auth failures cluster around token refresh in multi-tab sessions"
Out: "3 of 4 churned accounts mentioned pricing in week 2"
Out: "Senior hires reduced P1 bugs by 40% within 90 days"
Out: 612 memories (28% reduction, zero information loss)
Out: 18 new cross-links to performance, auth, and DX memories
Out: Contradiction flagged: p99 latency up 3x since January
Out: Confidence raised to 0.91 after 4 corroborating data points
Out: "At current rate, v2.0 ships March 18 +/- 12 days"
Out: "No data on competitor pricing since Q3. 4 assumptions unvalidated."
Every output survives an adversarial tournament.
Three candidates compete. A blind judge picks the winner. Knowledge only changes when the replacement is measurably better. "No change" is a valid outcome.
Sample verdict. Reason: AB captures domain-specific nuance that neither A nor B provides alone.
How consolidation approaches differ.
A snapshot of how AI memory tools handle knowledge consolidation today.
| Provider | Consolidation | Strategies | Tournament Refinement |
|---|---|---|---|
| Mem0 | None — memories are static after storage | — | — |
| Zep | Temporal knowledge graphs — structure, no synthesis | — | — |
| Membase | Knowledge graph — no consolidation | — | — |
| Thoth | 4-phase dream cycle: dedup → enrichment → inference → decay | 4 (phases) | — |
| OpenClaw | 3-phase dream: light sleep → REM → deep sleep | 3 (phases) | — |
| Hindsight | Observation consolidation — basic automatic synthesis | 1 (consolidate) | — |
| REM Labs Dream Engine | 9 strategies, 5 depth levels, autonomous scheduling | 9 | A/B/AB tournament with blind Borda judging |
Writes back to the memory store. No fine-tuning.
Refined knowledge is written back to the memory store directly. Next cycle inherits the improvement. No GPUs, no retraining, no deployment pipeline.
A scheduled cycle, start to finish.
847 stored memories. Total wall time: 15 minutes. Zero human intervention. Results delivered via webhook.
LongMemEval retrieval accuracy.
We measure against the hardest public benchmark for long-term memory systems.
Tournament refinement: A/B/AB + blind Borda judge
Methodology: byte-exact upstream GPT-4o judge — competitive with the public leaderboard, reproducible
Tournament refinement: None
Backed by: Nous Research partnership
Built-in slop detection.
Diminishing returns detection, similarity thresholds, and rate limits prevent over-processing. The engine stops when there's nothing left to improve.
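One way to picture the diminishing-returns check, with invented thresholds for illustration:

```typescript
// Illustrative stopping rule only: the real thresholds and windowing are internal.
function shouldStop(noveltyPerPass: number[], minNovelty = 0.6, window = 3): boolean {
  // Stop when the last few passes all fall below the novelty bar:
  // there is nothing left to improve, so the engine stands down.
  const recent = noveltyPerPass.slice(-window);
  return recent.length === window && recent.every((n) => n < minNovelty);
}

console.log(shouldStop([0.9, 0.7, 0.4, 0.3, 0.2])); // true — converged
```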
Two ways to run the Dream Engine.
Let it run automatically every night, or aim it at a specific question and have all 9 strategies work with that goal in mind.
```bash
# Directed dream — run the full 9-strategy cycle with a specific task
curl -X POST https://remlabs.ai/v1/memory/dream/start \
  -H "Authorization: Bearer $REM_API_KEY" \
  -d '{"task":"Why is churn up?"}'
```
Three patterns: single, pipeline, cron.
Run one strategy, chain multiple, or schedule recurring cycles. All return the same result shape.
```bash
# Single strategy
curl -X POST https://remlabs.ai/v1/dream/run \
  -H "Authorization: Bearer $REM_API_KEY" \
  -d '{ "strategy": "synthesize" }'

# Strategy pipeline — chain in sequence, each output feeds the next
curl -X POST https://remlabs.ai/v1/dream/run \
  -H "Authorization: Bearer $REM_API_KEY" \
  -d '{ "strategies": ["synthesize", "validate", "compress"], "namespace": "support-team" }'

# Scheduled — runs nightly, results in your webhook
curl -X POST https://remlabs.ai/v1/dream/subscribe \
  -H "Authorization: Bearer $REM_API_KEY" \
  -d '{ "schedule": "0 2 * * *", "strategies": ["full_cycle"], "webhook": "https://your-app.com/dream-results" }'
```
Dream is one primitive in a pipeline.
Every REM primitive feeds into every other. Dream doesn't exist in isolation — it's the processing stage between ingestion and action.
```bash
# The full pipeline: dump → dream → recall → act

# 1. Data comes in
npx @remlabs/cli dump chatgpt ~/export.zip
npx @remlabs/cli dump slack --channel general

# 2. Dream Engine processes it
npx @remlabs/cli dream --strategies synthesize,validate,compress

# 3. Refined knowledge is now searchable
npx @remlabs/cli recall "what patterns exist in customer churn?"

# 4. Webhook fires on dream.completed → triggers automation
#    → Slack post with new insights
#    → Notion page updated
#    → Next agent picks up refined context
```
Why this can't be bolted on later.
Every competitor stores and retrieves. Adding consolidation after the fact means retrofitting the entire data model. Here's why Dream Engine is a structural advantage, not a feature.
The dream.completed webhook fires with results. Use it to trigger a Slack post, update Notion, start another dream cycle on a different namespace, or feed results to another agent.
Your AI catches itself disagreeing.
The moment two stored memories conflict, the Contradict strategy fires. Nothing is silently overwritten. You get a flagged pair and a resolution suggestion — every time.
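What a flagged pair could look like when it reaches you. This is an illustrative shape: field names and the first claim are assumptions, the second claim echoes the example output above.

```typescript
// Illustrative shape of a Contradict flag; field names are assumptions.
interface ContradictionFlag {
  memoryA: { id: string; claim: string };
  memoryB: { id: string; claim: string };
  detectedAt: string;
  suggestion: string; // proposed resolution; nothing is overwritten automatically
}

const example: ContradictionFlag = {
  memoryA: { id: "mem_0142", claim: "p99 latency is stable at 180ms" },
  memoryB: { id: "mem_0791", claim: "p99 latency up 3x since January" },
  detectedAt: new Date().toISOString(),
  suggestion: "Keep both, mark mem_0142 as stale, re-validate after the next latency report",
};
```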
Don't want to send memories to our servers?
Self-host the full stack. Same 9 strategies, same tournament refinement, same benchmark numbers — on your own infrastructure.
One API call to start a cycle.
Free tier. No credit card. Import your data, run a dream, see what comes out.
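The curl version of that call is shown above; here is the same directed dream as a TypeScript fetch:

```typescript
// Mirrors the curl example above: start a directed dream cycle with one call.
const res = await fetch("https://remlabs.ai/v1/memory/dream/start", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.REM_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ task: "Why is churn up?" }),
});
console.log(await res.json()); // results land on the dream.completed webhook
```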