ICML26 PlugMem

Haibin
Academic

Long-term memory is essential for large language model (LLM) agents operating in complex environments, yet existing memory designs are either task-specific and non-transferable, or task-agnostic but less effective due to low task-relevance and context explosion from raw memory retrieval. We propose PLUGMEM, a task-agnostic plugin memory module that can be attached to arbitrary LLM agents without task-specific redesign. Motivated by the fact that decision-relevant information is concentrated as abstract knowledge rather than raw experience, we draw on cognitive science to structure episodic memories into a compact, extensible knowledge-centric memory graph that explicitly represents propositional and prescriptive knowledge. This representation enables efficient memory retrieval and reasoning over task-relevant knowledge, rather than verbose raw trajectories, and departs from other graph-based methods like GraphRAG by treating knowledge as the unit of memory access and organization instead of entities or text chunks. We evaluate PLUGMEM unchanged across three heterogeneous benchmarks (long-horizon conversational question answering, multi-hop knowledge retrieval, and web agent tasks). The results show that PLUGMEM consistently outperforms task-agnostic baselines and exceeds task-specific memory designs, while also achieving the highest information density under a unified information-theoretic analysis. Code and data are available at https://github.com/TIMANgroup/PlugMem.

PlugMem: A Task-Agnostic Plugin Memory Module for LLM Agents

https://arxiv.org/pdf/2603.03296

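The method builds a three-layer graph.

It then argues that episodic, semantic, and procedural tasks each need a different kind of memory.

These three concepts come from cognitive science.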
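To make the three memory types concrete, here is a minimal Python sketch of a graph with one layer per type. Everything in it is an assumption made for illustration: the names `MemoryKind`, `MemoryNode`, and `MemoryGraph` are hypothetical, the idea that each of the three layers corresponds to one memory type is a guess, and so is the mapping of semantic memory to propositional knowledge and procedural memory to prescriptive knowledge. The paper's actual schema may differ.

```python
from dataclasses import dataclass, field
from enum import Enum


class MemoryKind(Enum):
    EPISODIC = "episodic"      # raw experience: what happened
    SEMANTIC = "semantic"      # assumed to hold propositional knowledge (facts)
    PROCEDURAL = "procedural"  # assumed to hold prescriptive knowledge (how to act)


@dataclass
class MemoryNode:
    node_id: int
    kind: MemoryKind
    text: str


@dataclass
class MemoryGraph:
    """Hypothetical layered graph; not the paper's data model."""
    nodes: dict = field(default_factory=dict)   # node_id -> MemoryNode
    edges: dict = field(default_factory=dict)   # node_id -> set of linked node_ids

    def add_node(self, node: MemoryNode) -> None:
        self.nodes[node.node_id] = node
        self.edges.setdefault(node.node_id, set())

    def link(self, src: int, dst: int) -> None:
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must exist before linking")
        self.edges[src].add(dst)

    def layer(self, kind: MemoryKind) -> list:
        """All nodes belonging to one layer."""
        return [n for n in self.nodes.values() if n.kind is kind]

    def episodic_sources(self, node_id: int) -> list:
        """Episodic nodes that a knowledge node links back to."""
        linked = (self.nodes[d] for d in sorted(self.edges[node_id]))
        return [n for n in linked if n.kind is MemoryKind.EPISODIC]


# Toy data, invented for this example.
graph = MemoryGraph()
graph.add_node(MemoryNode(1, MemoryKind.EPISODIC, "User described their wedding plans."))
graph.add_node(MemoryNode(2, MemoryKind.SEMANTIC, "The wedding is in June."))
graph.add_node(MemoryNode(3, MemoryKind.PROCEDURAL, "Confirm the date before booking."))
graph.link(2, 1)  # the fact was distilled from episode 1
graph.link(3, 1)  # the rule was distilled from episode 1

semantic_layer = graph.layer(MemoryKind.SEMANTIC)
sources = graph.episodic_sources(2)
```

Keeping the knowledge nodes linked back to their source episodes is what lets a retriever start from compact knowledge and still recover the original context when needed.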

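“They really did get to this early.”

“Are there scenarios where these task types are mixed together?”

“AMA-bench mixes them together, but this method was evaluated on three separate benchmarks, some recall-oriented and some web-based. AMA-bench had not been released at the time. I think the way it uses memory is well designed: it aggregates the raw text and removes redundancy, then feeds the result into the prompt for generation.”

“They chose benchmarks from 2024, and Qwen2.5.”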
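The aggregation and redundancy-removal step mentioned in the discussion above can be illustrated with a small Python sketch. It is only a stand-in: the function names `normalize`, `dedupe`, and `build_prompt` are hypothetical, the prompt layout is invented, and exact-match deduplication after normalization is a deliberately simple substitute for whatever the paper actually does.

```python
def normalize(snippet: str) -> str:
    """Lowercase, collapse whitespace, and drop a trailing period."""
    return " ".join(snippet.lower().split()).rstrip(".")


def dedupe(snippets: list) -> list:
    """Keep the first occurrence of each snippet, comparing normalized forms."""
    seen = set()
    kept = []
    for snippet in snippets:
        key = normalize(snippet)
        if key and key not in seen:
            seen.add(key)
            kept.append(snippet.strip())
    return kept


def build_prompt(question: str, snippets: list) -> str:
    """Aggregate deduplicated memory snippets into one prompt (assumed layout)."""
    lines = ["Relevant memory:"]
    lines += [f"- {fact}" for fact in dedupe(snippets)]
    lines += ["", f"Question: {question}", "Answer:"]
    return "\n".join(lines)


# Toy data, invented for this example.
raw = [
    "The wedding was in June.",
    "the wedding  was in June",   # redundant restatement
    "The venue was a garden.",
]
facts = dedupe(raw)
prompt = build_prompt("When was the wedding?", raw)
```

The point of the sketch is the ordering: redundancy is removed before the memory reaches the prompt, so the context grows with the number of distinct facts instead of the number of raw mentions.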

Appendix

"Specifically, the retrieval module first identifies semantic memories relevant to the query. In the example, semantic memory 669, semantic memory 141, and semantic memory 146 are all related to wedding, which is the topic of the question. It then locates the source episodic memory corresponding to each retrieved semantic memory and selects those containing a sufficient number of semantic memories retrieved."
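The two-stage retrieval described in this excerpt can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: relevance is scored by plain keyword overlap as a stand-in for whatever retriever is actually used, the parameters `top_k` and `min_hits` are hypothetical, and apart from the three semantic memory IDs taken from the excerpt, all texts and episode IDs are invented.

```python
from collections import Counter

# Tiny stopword list, chosen only for this toy example.
STOPWORDS = {"the", "a", "was", "where", "in", "at"}


def tokenize(text: str) -> set:
    """Lowercased content words with surrounding punctuation stripped."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}


def retrieve_semantic(query: str, semantic: dict, top_k: int = 3) -> list:
    """Stage 1: rank semantic memories by keyword overlap with the query."""
    q = tokenize(query)
    scored = [(len(q & tokenize(text)), sid) for sid, text in semantic.items()]
    scored = [(score, sid) for score, sid in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first, then by ID
    return [sid for _, sid in scored[:top_k]]


def select_episodes(retrieved: list, source_of: dict, min_hits: int = 2) -> list:
    """Stage 2: keep episodes that are the source of at least min_hits retrieved memories."""
    hits = Counter(source_of[sid] for sid in retrieved)
    return sorted(ep for ep, count in hits.items() if count >= min_hits)


# Semantic memory IDs follow the excerpt; texts and episode IDs are made up.
semantic = {
    669: "The wedding took place in June.",
    141: "The wedding venue was a garden.",
    146: "Guests at the wedding danced all night.",
    200: "The cat likes fish.",
}
source_of = {669: 12, 141: 12, 146: 30, 200: 45}  # semantic ID -> source episode ID

retrieved = retrieve_semantic("Where was the wedding held?", semantic)
episodes = select_episodes(retrieved, source_of)
```

In this toy run the three wedding-related memories are retrieved, and only episode 12 is kept because it is the source of two of them, while episode 30 contributes a single hit and falls below the threshold.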