Authors: Charles Packer, Sarah Wooders, Kevin Lin, et al.
MemGPT introduces an operating-system-inspired approach to managing context in Large Language Models. By treating the LLM as a processor and its limited context window as RAM, MemGPT implements virtual context management: a memory hierarchy with intelligent paging between fast (in-context) and slow (external) storage.
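The RAM analogy can be made concrete: when in-context memory exceeds its token budget, the lowest-value items are evicted ("paged out") toward slow storage. A minimal sketch of that eviction step; the function, field names, and token-counting heuristic are all illustrative assumptions, not MemGPT's or Awareness's actual API:

```typescript
// Hypothetical page-out step: evict lowest-importance items when over budget.
interface ContextItem { content: string; importance: number; }

function pageOut(
  context: ContextItem[],
  budgetTokens: number,
  countTokens: (s: string) => number
): { kept: ContextItem[]; evicted: ContextItem[] } {
  // Keep the highest-importance items that still fit within the budget;
  // everything else is a candidate for slow (external) storage.
  const sorted = [...context].sort((a, b) => b.importance - a.importance);
  const kept: ContextItem[] = [];
  const evicted: ContextItem[] = [];
  let used = 0;
  for (const item of sorted) {
    const cost = countTokens(item.content);
    if (used + cost <= budgetTokens) { kept.push(item); used += cost; }
    else evicted.push(item); // would be written to slow storage
  }
  return { kept, evicted };
}
```

Paging back in is the mirror image: a retrieval query pulls evicted items out of slow storage when they become relevant again.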
Awareness implements MemGPT's core concepts through a 4-tier memory system:

**Tier 1: Core memory**
- Equivalent to MemGPT's "main context"
- Always loaded; contains the persona and critical info
- Limited size, high-value content only

**Tier 2: Working memory**
- Active scratchpad for the current task
- Cleared after each turn
- Similar to MemGPT's short-term working space

**Tier 3: Recent memory**
- Recent conversation turns with importance scoring
- Paged in/out based on relevance and recency
- Equivalent to MemGPT's "recent context"

**Tier 4: Archival memory**
- Permanent storage with vector search
- Equivalent to MemGPT's "external storage"
- Retrieved on demand via semantic similarity
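The four tiers above can be sketched as a single data model. Everything below is a hypothetical illustration; the interfaces and field names are assumptions, not Awareness's actual types:

```typescript
// Hypothetical sketch of the 4-tier memory model (all names are assumptions).
interface MemoryItem {
  content: string;
  importance: number;   // scored when the memory is written
  lastAccessed: number; // unix ms; used for recency-based paging
}

interface MemoryTiers {
  core: string[];       // always in context: persona + critical facts
  working: string[];    // per-turn scratchpad, cleared each turn
  recent: MemoryItem[]; // recent turns, paged by relevance and recency
  // The archival tier lives outside the context window (vector store),
  // so it is not held in this in-memory structure.
}

// Clearing the working tier at the end of a turn, per the description above.
function endTurn(tiers: MemoryTiers): void {
  tiers.working = [];
}
```

Note the design split: only the first three tiers ever occupy context-window tokens; archival content enters the window solely through retrieval.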
```typescript
// Memory paging in Awareness (inspired by MemGPT)
const relevantMemories = await memoryController.archivalMemorySearch({
  query: userMessage,
  limit: 5,
  threshold: 0.7
});

// Page the retrieved memories into the working context
workingMemory.activeContext = relevantMemories
  .map(m => m.content)
  .join('\n');
```

Related work that shaped Awareness's memory design:
- Memory architecture foundation for Awareness's 4-tier system
- Cognitive architecture framework with 4 memory types (working, episodic, semantic, procedural)
- Importance scoring algorithm for memory retrieval; reflection mechanism
- Two-phase memory pipeline; conflict resolution (ADD/UPDATE/DELETE/NOOP); graph-enhanced variant
- Three-tier consolidation; dialogue distillation; topic-aware consolidation; 117x token reduction
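The "paged in/out based on relevance and recency" behavior, combined with importance scoring, is commonly implemented as a single weighted retrieval score. A minimal sketch; the weights and the exponential recency decay are illustrative assumptions, not values from Awareness:

```typescript
// Hypothetical retrieval score combining relevance, recency, and importance.
// The weights (0.5 / 0.3 / 0.2) and the decay half-life are assumptions.
function retrievalScore(
  relevance: number,  // e.g. cosine similarity in [0, 1] from vector search
  importance: number, // model-assigned importance in [0, 1]
  ageMs: number,      // time since the memory was last accessed
  halfLifeMs: number = 60 * 60 * 1000 // recency halves every hour
): number {
  const recency = Math.pow(0.5, ageMs / halfLifeMs);
  return 0.5 * relevance + 0.3 * recency + 0.2 * importance;
}
```

Under a scheme like this, memories scoring below a threshold are paged out of the recent tier, while high scorers from archival search are paged back in.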