GyriQAI Research Note · Findings of ACL 2026
SYNAPSE: Episodic-Semantic Memory for Long-Horizon LLM Agents
Your RAG agent may not be forgetting. It may simply fail to connect the right memory to the current question.
This note summarizes SYNAPSE, our ACL 2026 Findings paper on long-term memory for LLM agents. The central claim is modest but important: memory-augmented agents should retrieve not only semantically similar text, but also structurally related evidence across time, entities, and events.
SYNAPSE has already been used in EZCollegeApp, GyriQAI’s product for long-horizon U.S. undergraduate application planning, where an AI counselor needs to preserve evolving student context, document status, deadlines, and the reasoning behind earlier recommendations.
Lexical and semantic anchors inject signal; activation then propagates through the memory graph before final context selection.
Best weighted average F1 on LoCoMo among compared memory systems (40.5).
Improvement over A-Mem on the same GPT-4o-mini setting.
Compared with full-context methods on the LoCoMo evaluation.
Core thesis
Long-term agents need memory that preserves relationships, not just similar text.
Vector search asks: “What past text looks like this question?” SYNAPSE asks a complementary question: “What past evidence is connected to this situation?”
Similarity-based retrieval
- Embed the user query.
- Retrieve nearby chunks.
- Stuff them into context.
- Rely on the model to infer the missing bridge.
This is effective for direct lookup, but weaker when the answer depends on causal, temporal, or transitive links.
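To make the baseline concrete, here is a minimal sketch of similarity-only retrieval. The three-dimensional vectors are hand-made stand-ins for real embeddings, and `top_k`, `MEMORY`, and `QUERY` are illustrative names, not part of any SYNAPSE code.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(query_vec, memory, k=2):
    """Return the k memory texts whose vectors are closest to the query."""
    ranked = sorted(memory, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Toy memory: (text, hypothetical embedding). Values are invented for illustration.
MEMORY = [
    ("I slept badly and feel stressed", [0.9, 0.1, 0.0]),
    ("Work pressure is getting to me", [0.8, 0.2, 0.1]),
    ("My exam now overlaps with the team offsite", [0.1, 0.9, 0.3]),
]
QUERY = [1.0, 0.0, 0.0]  # stands in for a question about feeling anxious

retrieved = top_k(QUERY, MEMORY, k=2)
print(retrieved)
```

In this toy setup the scheduling note never reaches the context window, because nothing but vector distance decides what is retrieved.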
SYNAPSE retrieval
- Build an episodic-semantic graph.
- Inject activation from lexical and semantic anchors.
- Let energy spread through temporal, abstraction, and association edges.
- Retrieve the subgraph that is structurally relevant.
The memory still uses semantic similarity, but it also lets graph structure contribute to relevance.
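The steps above can be sketched as a small spreading-activation loop. This is an illustration under stated assumptions, not the paper's update rule: the `decay` factor, the number of `steps`, the edge weights, and the node names are all hypothetical.

```python
from collections import defaultdict

def spread_activation(edges, anchors, decay=0.5, steps=2):
    """Propagate activation from anchor nodes along weighted edges.

    edges:   dict mapping node -> list of (neighbor, weight)
    anchors: dict mapping node -> initial activation
    decay and steps are assumed hyperparameters for this sketch.
    """
    activation = defaultdict(float, anchors)
    frontier = dict(anchors)
    for _ in range(steps):
        next_frontier = defaultdict(float)
        for node, energy in frontier.items():
            for neighbor, weight in edges.get(node, []):
                # Each hop passes on a damped share of the energy.
                next_frontier[neighbor] += energy * weight * decay
        for node, energy in next_frontier.items():
            activation[node] += energy
        frontier = next_frontier
    return dict(activation)

# Toy graph; in a fuller version the weight could depend on the edge type
# (temporal, abstraction, association).
EDGES = {
    "turn:anxious_today": [("concept:stress", 0.8)],
    "concept:stress": [("event:schedule_conflict", 0.9)],
    "event:schedule_conflict": [("turn:three_weeks_ago", 1.0)],
}
scores = spread_activation(EDGES, {"turn:anxious_today": 1.0}, steps=3)
```

Nodes that the query never matched directly still receive some activation when they sit a few hops from an anchor, which is the property the retrieval step relies on.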
The problem
A common failure mode: contextual isolation
Imagine an assistant that has worked with a user for weeks. The user asks:
Why am I feeling anxious today?
A vector memory will retrieve messages close to “anxious”: stress, sleep, pressure, maybe recent complaints. But suppose the real cause was a scheduling conflict mentioned three weeks ago. That note never used the word “anxiety.” It is not close in embedding space. The agent misses it and gives a plausible, incomplete answer.
That is contextual isolation: the memory exists, but the retrieval system cannot connect it to the moment where it matters. SYNAPSE is designed for this class of failure.
Technical innovation
Three mechanisms behind SYNAPSE
Unified graph
Dialogue turns become episodic nodes; extracted entities, goals, events, and preferences become semantic nodes. Temporal and semantic structure are represented in one graph.
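One way to picture the unified graph is a single container that holds both node kinds and typed edges. The class below is a hypothetical sketch; `MemoryGraph`, its field names, and the node ids are our illustration rather than the paper's schema.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryGraph:
    nodes: dict = field(default_factory=dict)  # node_id -> {"kind", "text"}
    edges: list = field(default_factory=list)  # (src, dst, edge_type)

    def add_node(self, node_id, kind, text):
        """kind is 'episodic' (a dialogue turn) or 'semantic' (entity, goal, event, preference)."""
        self.nodes[node_id] = {"kind": kind, "text": text}

    def add_edge(self, src, dst, edge_type):
        """edge_type is 'temporal', 'abstraction', or 'association'."""
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must exist before linking")
        self.edges.append((src, dst, edge_type))

    def neighbors(self, node_id):
        """Outgoing neighbors of node_id with their edge types."""
        return [(dst, kind) for src, dst, kind in self.edges if src == node_id]

graph = MemoryGraph()
graph.add_node("turn:1", "episodic", "We went skiing with Mark last weekend.")
graph.add_node("turn:2", "episodic", "Dinner with Mark went well.")
graph.add_node("entity:mark", "semantic", "Mark")
graph.add_edge("turn:1", "turn:2", "temporal")          # turn order
graph.add_edge("turn:1", "entity:mark", "abstraction")  # turn mentions entity
graph.add_edge("turn:2", "entity:mark", "abstraction")
```

Keeping turns and extracted concepts in one structure means a single traversal can move from a conversation to an entity and on to another conversation.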
Spreading activation
A query anchors the graph through BM25 and dense retrieval. Activation then propagates along temporal, abstraction, and association edges.
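The anchoring step can be sketched as a fusion of a lexical score and a dense score. The BM25 function below is the standard Okapi formulation with whitespace tokenization; the max-normalization, the mixing weight `alpha`, and the precomputed `dense_scores` are assumptions for illustration, since this note does not specify how SYNAPSE combines the two signals.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each document for the query (whitespace tokens)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def normalize(scores):
    """Scale scores so the best one is 1.0 (all zeros if nothing matched)."""
    top = max(scores)
    return [s / top if top > 0 else 0.0 for s in scores]

def anchor_activations(query, docs, dense_scores, alpha=0.5, top_n=2):
    """Fuse lexical and dense scores; return {doc_index: initial activation}."""
    lexical = normalize(bm25_scores(query, docs))
    dense = normalize(dense_scores)
    fused = [alpha * lex + (1 - alpha) * den for lex, den in zip(lexical, dense)]
    ranked = sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)
    return {i: fused[i] for i in ranked[:top_n]}

DOCS = [
    "mark joined the ski trip in january",
    "we talked about essay deadlines",
    "dinner with mark went well",
]
DENSE = [0.7, 0.1, 0.3]  # hypothetical cosine similarities from an embedding model
anchors = anchor_activations("the guy from the ski trip", DOCS, DENSE)
```

The returned activations would then seed the propagation step; nodes outside the top anchors start at zero and can only gain energy through edges.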
Uncertainty gating
If activation is too weak, the system can reject the memory claim before generation. This helps reduce memory hallucination.
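A gate of this kind can be as simple as a threshold check before the model is called. The sketch below is an assumption about the shape of such a check, not the paper's criterion: `threshold`, `top_n`, the abstention message, and the `generate` callback are hypothetical.

```python
def gate(activations, threshold=0.3, top_n=3):
    """Return the ids of sufficiently activated nodes, or None if evidence is too weak."""
    strong = sorted(
        ((node, score) for node, score in activations.items() if score >= threshold),
        key=lambda pair: pair[1],
        reverse=True,
    )
    if not strong:
        return None
    return [node for node, _ in strong[:top_n]]

def answer_or_abstain(question, activations, generate):
    """Call the generator only when the gate finds supporting memory."""
    evidence = gate(activations)
    if evidence is None:
        return "I don't have a memory that supports an answer to that."
    return generate(question, evidence)

# Usage with a stub generator standing in for the LLM call.
stub = lambda question, evidence: f"answer grounded in {evidence}"
print(answer_or_abstain("What did I say about my visa?", {"turn:4": 0.05}, stub))
print(answer_or_abstain("Who came on the ski trip?", {"turn:1": 0.9, "turn:7": 0.1}, stub))
```

The design point is that abstention happens before generation, so the model is never asked to elaborate on evidence that was not retrieved.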
The bridge-node effect
If “Mark” appears in both a ski-trip conversation and a later dating conversation, Mark becomes the bridge. A question about “the guy from the ski trip” can activate the trip, then Mark, then the later relationship outcome. Pure vector search often misses that path.
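The bridge can be shown with a plain breadth-first search over a toy adjacency map. The node names are hypothetical, and the search stands in for activation flow only to make the path visible; it is not how SYNAPSE scores relevance.

```python
from collections import deque

def find_path(adjacency, start, goal):
    """Shortest path from start to goal by breadth-first search, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None

# Undirected toy graph: two conversations share the entity node for Mark.
WITH_BRIDGE = {
    "turn:ski_trip": ["entity:mark"],
    "entity:mark": ["turn:ski_trip", "turn:dating"],
    "turn:dating": ["entity:mark", "event:relationship_outcome"],
    "event:relationship_outcome": ["turn:dating"],
}
# Same memories without the shared entity node.
WITHOUT_BRIDGE = {
    "turn:ski_trip": [],
    "turn:dating": ["event:relationship_outcome"],
    "event:relationship_outcome": ["turn:dating"],
}

path = find_path(WITH_BRIDGE, "turn:ski_trip", "event:relationship_outcome")
no_path = find_path(WITHOUT_BRIDGE, "turn:ski_trip", "event:relationship_outcome")
```

With the entity node in place the two conversations are three hops apart; without it they are disconnected, which is the situation a similarity-only index is effectively in.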
Evidence
Results on long-horizon conversational memory
| Method | Multi-Hop | Temporal | Open-Domain | Single-Hop | Weighted F1 |
|---|---|---|---|---|---|
| A-Mem | 27.0 | 45.9 | 12.1 | 44.7 | 33.3 |
| AriGraph | 28.5 | 43.2 | 14.5 | 45.1 | 33.7 |
| MemoryOS | 35.3 | 41.2 | 20.0 | 48.6 | 38.0 |
| Zep | 35.5 | 48.5 | 23.1 | 48.0 | 39.7 |
| SYNAPSE | 35.7 | 50.1 | 25.9 | 48.9 | 40.5 |
Low-similarity subset: when evidence is deliberately far from the question in embedding space, A-Mem drops by more than 50%; SYNAPSE drops by less than 8%. This supports the hypothesis that graph structure provides a complementary retrieval signal.
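For readers who want the gaps in the table as numbers, the short script below recomputes absolute and relative differences in weighted F1 directly from the rows above. It adds no new results; `gains_over` is a helper name introduced here.

```python
# Weighted F1 values copied from the results table.
WEIGHTED_F1 = {
    "A-Mem": 33.3,
    "AriGraph": 33.7,
    "MemoryOS": 38.0,
    "Zep": 39.7,
    "SYNAPSE": 40.5,
}

def gains_over(baseline, target="SYNAPSE"):
    """Return (absolute gain in F1 points, relative gain in percent)."""
    base, new = WEIGHTED_F1[baseline], WEIGHTED_F1[target]
    absolute = round(new - base, 1)
    relative = round(100 * (new - base) / base, 1)
    return absolute, relative

for name in ("A-Mem", "AriGraph", "MemoryOS", "Zep"):
    points, percent = gains_over(name)
    print(f"SYNAPSE vs {name}: +{points} points ({percent}% relative)")
```

Against A-Mem this works out to 7.2 points of weighted F1, or about 21.6% relative; against Zep, the closest system in the table, the margin is 0.8 points.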
- 96.6 adversarial F1. The gating layer helps the system reject questions about absent memories.
- 1.9s average latency on GPT-4o-mini in the reported efficiency evaluation.
- $0.24 estimated cost per 1K queries under the paper’s GPT-4o-mini cost profile.
From paper to product
Why this matters for GyriQAI
GyriQAI is building AI products for long-horizon decision support. Our first product direction, EZCollegeApp, focuses on U.S. undergraduate application planning for international students.
Admissions guidance is a memory-heavy workflow. A useful AI counselor must track a student’s academic profile, school list, document status, essay revisions, deadlines, constraints, and the reasoning behind prior recommendations. It also needs restraint: it should not fabricate experiences, pretend forecasts are guarantees, or replace official admissions requirements.
This is where SYNAPSE informs our product thinking. We are interested in memory-native AI products: systems that preserve evolving user context, retrieve grounded evidence at the right time, and know when the evidence is not available.
Product principles
Design implications
References