Most "memory for agents" packages stuff context into the next prompt. @remlabs/claude-code-hooks does something different: it captures errors at the moment they happen, summarizes them, and surfaces matching ones before the next attempt. Between-pass consolidation, not pre-attempt stuffing.
Every coding agent already emits the signals you need: tool calls, exit codes, stderr, file paths. The hook layer catches them at the right moment, condenses them into a single error signature, and queries the prior error memory before the agent retries. That's it.
Builds a scope-tight retrieval key from {repo, tool, file_path}. Memory only matches inside the same repo and tool surface — cross-project leakage is what makes "memory for agents" lose more than it wins.
Captures stderr + exit code on PostToolUseFailure. No stack-trace heuristics, no LLM-judges — just the raw failure bytes, deterministic.
Condenses to { error_signature, fix_applied }. The signature is the deduplication key: the same error and fix are stored only once, regardless of how many tasks hit them.
Surfaces matching prior errors before the next attempt — not stuffed into context, but injected as a one-line "you've seen this before:" hint. The agent uses it as priors, not history.
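The four steps above can be sketched in a few lines of TypeScript. This is an illustration under stated assumptions, not the package's implementation: `Scope`, `Failure`, `ErrorMemory`, `scopeKey`, and `errorSignature` are hypothetical names, the in-memory `Map` stands in for whatever store REM actually uses, and hashing the raw stderr plus exit code is only one possible way to get a deterministic signature.

```typescript
import { createHash } from "node:crypto";

// All names here are hypothetical; the package's real API is not shown in this post.
interface Scope { repo: string; tool: string; filePath: string; }
interface Failure { scope: Scope; stderr: string; exitCode: number; }
interface MemoryEntry { errorSignature: string; fixApplied: string; }

// Scope-tight retrieval key: entries match only within the same repo, tool, and file.
function scopeKey(s: Scope): string {
  return [s.repo, s.tool, s.filePath].join("\u0000");
}

// Deterministic signature (assumption): a hash over the exit code and raw stderr bytes.
function errorSignature(f: Failure): string {
  return createHash("sha256")
    .update(`${f.exitCode}\n${f.stderr}`)
    .digest("hex")
    .slice(0, 16);
}

class ErrorMemory {
  private byScope = new Map<string, Map<string, MemoryEntry>>();

  // The signature is the dedup key: recording the same failure twice stores one entry.
  record(f: Failure, fixApplied: string): void {
    const key = scopeKey(f.scope);
    const bucket = this.byScope.get(key) ?? new Map<string, MemoryEntry>();
    const sig = errorSignature(f);
    if (!bucket.has(sig)) {
      bucket.set(sig, { errorSignature: sig, fixApplied });
    }
    this.byScope.set(key, bucket);
  }

  // One-line hints for the next attempt, limited to the exact scope.
  hints(scope: Scope): string[] {
    const bucket = this.byScope.get(scopeKey(scope));
    if (!bucket) return [];
    return Array.from(bucket.values()).map(
      (e) => `you've seen this before: ${e.fixApplied}`
    );
  }
}
```

Keeping the scope key separate from the signature is what makes the lookup cheap to reason about: a failure recorded under one repo can never surface as a hint in another, whatever its signature.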
PreToolUse hook, with prior-match
The package is in private beta, and the npm registry is gated until GA. The settings.json shape below is what the public package will install.
# Private beta -- request access in Discord first
# https://discord.gg/ux8NYVfK2
$ npm install @remlabs/claude-code-hooks   # private beta
$ npx @remlabs/claude-code-hooks init      # writes settings.json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash|Edit|Write",
        "hooks": [{
          "type": "command",
          "command": "npx @remlabs/claude-code-hooks pre"
        }]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Bash|Edit|Write",
        "hooks": [{
          "type": "command",
          "command": "npx @remlabs/claude-code-hooks post"
        }]
      }
    ]
  },
  "env": {
    "REM_API_KEY": "bl_live_...",
    "REM_PROJECT_TAG": "auto"
  }
}
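For readers wondering what a hook command like `pre` or `post` has to work with, here is a rough sketch. It assumes the hook receives a JSON payload on stdin with fields such as `hook_event_name`, `tool_name`, `cwd`, and `tool_input.file_path`, and that the `matcher` string is treated as a regular expression over the tool name. Both points should be checked against the Claude Code hooks reference. `parseHookInput` and `matchesTool` are hypothetical helpers, not functions exported by the package.

```typescript
// Hypothetical helpers; field names are assumptions to verify against the hooks docs.
interface HookEvent {
  event: string;      // e.g. "PreToolUse"
  tool: string;       // e.g. "Edit"
  cwd: string;        // working directory, usable to derive the repo part of the scope
  filePath?: string;  // present for file-oriented tools, absent for others
}

// Parse the raw JSON payload a hook command would read from stdin.
function parseHookInput(raw: string): HookEvent {
  const data = JSON.parse(raw) as Record<string, unknown>;
  const toolInput = (data.tool_input ?? {}) as Record<string, unknown>;
  return {
    event: String(data.hook_event_name ?? ""),
    tool: String(data.tool_name ?? ""),
    cwd: String(data.cwd ?? ""),
    filePath:
      typeof toolInput.file_path === "string" ? toolInput.file_path : undefined,
  };
}

// Treat the matcher (e.g. "Bash|Edit|Write") as an anchored alternation over tool names.
function matchesTool(matcher: string, tool: string): boolean {
  return new RegExp(`^(?:${matcher})$`).test(tool);
}
```

Under these assumptions, the parsed `cwd`, `tool`, and `filePath` are the raw material for a {repo, tool, file_path} retrieval key.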
Real SWE-bench example. Pass 1 fails on the apply step — the patch hunk targets a line that's already changed. Pass 2 sees the prior error pattern and applies the fix correctly the first time. This is the +15.33pp mechanism in a single task.
Task: django__django-13315 — ORM filter with OuterRef raises TypeError.
Agent's patch: targets line 482 of django/db/models/sql/query.py.
Result: git apply fails — the line numbers in the unified diff don't match the actual file. Apply error. Task fails.
[failure stored to REM with signature: django.orm.outerref.apply_offset_drift]
PreToolUse fires. deriveProjectTag = {django, Edit, query.py}. Rerank finds 1 prior match: apply_offset_drift.
Hint surfaced: "you've seen this before — previous patch failed on offset drift; emit unified diff with full context lines, no hunk headers."
Agent's second patch: emits --unified=5 diff. git apply succeeds. Tests pass.
[recovered — one of 26 in the n=150 run]
+15.33pp strict on SWE-bench Lite n=150, 95% CI [+9.33, +22.00], p<0.05. Cold Opus-4.7 was 30.0%; Opus-4.7 + REM hooks was 45.3%.
Mechanism is between-pass error consolidation. The lift comes from 26 recovered tasks (+) and 3 regressions (−), netting +23 strict. Apply-errors dropped 48% Pass 1 → Pass 2.
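The headline number follows from these counts by simple arithmetic. The per-pass task counts (45 and 68) are not quoted in the text; they are derived here from the reported 30.0% baseline and n=150.

```latex
\begin{align*}
\text{net recovered} &= 26 - 3 = 23 \\
\text{lift} &= \frac{23}{150} \approx 15.33\ \text{pp} \\
\text{Pass 1 solved} &= 0.300 \times 150 = 45 \\
\text{Pass 2 solved} &= \frac{45 + 23}{150} = \frac{68}{150} \approx 45.3\%
\end{align*}
```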
This is reproducible. Methodology, eval logs, and per-task diffs live at /benchmarks. If you want to verify, the run script and Docker image are in the repo — that's the entire point. We retired the older +16pp n=50 number when it failed to reproduce; we'll retire this one if it does the same.