Authors: Charles Packer, Sarah Wooders, Kevin Lin, et al.
MemGPT introduces an operating-system-inspired approach to managing context in Large Language Models. By treating the LLM as a processor and its limited context window as RAM, MemGPT implements virtual context management: a memory hierarchy with intelligent paging between fast (in-context) and slow (external) storage.
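The RAM analogy can be made concrete: when in-context memory exceeds its token budget, the lowest-value items are evicted ("paged out") toward slow storage. A minimal sketch of that eviction step; the function, field names, and token-counting heuristic are all illustrative assumptions, not MemGPT's or Awareness's actual API:

```typescript
// Hypothetical page-out step: evict lowest-importance items when over budget.
interface ContextItem { content: string; importance: number; }

function pageOut(
  context: ContextItem[],
  budgetTokens: number,
  countTokens: (s: string) => number
): { kept: ContextItem[]; evicted: ContextItem[] } {
  // Keep the highest-importance items that still fit within the budget;
  // everything else is a candidate for slow (external) storage.
  const sorted = [...context].sort((a, b) => b.importance - a.importance);
  const kept: ContextItem[] = [];
  const evicted: ContextItem[] = [];
  let used = 0;
  for (const item of sorted) {
    const cost = countTokens(item.content);
    if (used + cost <= budgetTokens) { kept.push(item); used += cost; }
    else evicted.push(item); // would be written to slow storage
  }
  return { kept, evicted };
}
```

Paging back in is the mirror image: a retrieval query pulls evicted items out of slow storage when they become relevant again.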
Awareness implements MemGPT's core concepts through a 4-tier memory system:

**Tier 1: Core memory**
- Equivalent to MemGPT's "main context"
- Always loaded; contains the persona and critical info
- Limited size, high-value content only

**Tier 2: Working memory**
- Active scratchpad for the current task
- Cleared after each turn
- Similar to MemGPT's short-term working space

**Tier 3: Recent memory**
- Recent conversation turns with importance scoring
- Paged in/out based on relevance and recency
- Equivalent to MemGPT's "recent context"

**Tier 4: Archival memory**
- Permanent storage with vector search
- Equivalent to MemGPT's "external storage"
- Retrieved on demand via semantic similarity
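The four tiers above can be sketched as a single data model. Everything below is a hypothetical illustration; the interfaces and field names are assumptions, not Awareness's actual types:

```typescript
// Hypothetical sketch of the 4-tier memory model (all names are assumptions).
interface MemoryItem {
  content: string;
  importance: number;   // scored when the memory is written
  lastAccessed: number; // unix ms; used for recency-based paging
}

interface MemoryTiers {
  core: string[];       // always in context: persona + critical facts
  working: string[];    // per-turn scratchpad, cleared each turn
  recent: MemoryItem[]; // recent turns, paged by relevance and recency
  // The archival tier lives outside the context window (vector store),
  // so it is not held in this in-memory structure.
}

// Clearing the working tier at the end of a turn, per the description above.
function endTurn(tiers: MemoryTiers): void {
  tiers.working = [];
}
```

Note the design split: only the first three tiers ever occupy context-window tokens; archival content enters the window solely through retrieval.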
```typescript
// Memory paging in Awareness (inspired by MemGPT)
const relevantMemories = await memoryController.archivalMemorySearch({
  query: userMessage,
  limit: 5,
  threshold: 0.7
});

// Page the retrieved memories into the working context
workingMemory.activeContext = relevantMemories
  .map(m => m.content)
  .join('\n');
```

Related work that shaped Awareness's memory design:
- Memory architecture foundation for Awareness's 4-tier system
- Cognitive architecture framework with 4 memory types (working, episodic, semantic, procedural)
- Importance scoring algorithm for memory retrieval; reflection mechanism
- Two-phase memory pipeline; conflict resolution (ADD/UPDATE/DELETE/NOOP); graph-enhanced variant
- Three-tier consolidation; dialogue distillation; topic-aware consolidation; 117x token reduction
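The "paged in/out based on relevance and recency" behavior, combined with importance scoring, is commonly implemented as a single weighted retrieval score. A minimal sketch; the weights and the exponential recency decay are illustrative assumptions, not values from Awareness:

```typescript
// Hypothetical retrieval score combining relevance, recency, and importance.
// The weights (0.5 / 0.3 / 0.2) and the decay half-life are assumptions.
function retrievalScore(
  relevance: number,  // e.g. cosine similarity in [0, 1] from vector search
  importance: number, // model-assigned importance in [0, 1]
  ageMs: number,      // time since the memory was last accessed
  halfLifeMs: number = 60 * 60 * 1000 // recency halves every hour
): number {
  const recency = Math.pow(0.5, ageMs / halfLifeMs);
  return 0.5 * relevance + 0.3 * recency + 0.2 * importance;
}
```

Under a scheme like this, memories scoring below a threshold are paged out of the recent tier, while high scorers from archival search are paged back in.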