AI Development · 3 min read

AI Memory with Mem0: Give Your Chatbot Long-Term Memory Across Sessions

LLMs forget everything when the context window closes. Mem0 gives AI applications persistent, intelligent memory — automatically extracting facts from conversations, storing them semantically, and injecting relevant context into future sessions.

Gurpreet Singh
March 26, 2026

The Stateless Problem

Every LLM conversation starts from zero. The model has no memory of what you discussed yesterday, last week, or six months ago. For a personal AI assistant or a business chatbot that serves returning customers, this statelessness is a fundamental UX failure. Users have to re-explain their context every session. Chatbots ask the same onboarding questions repeatedly. AI agents lose their understanding of ongoing projects the moment the context window closes.

The naive solution — appending the entire conversation history to every prompt — fails at scale. A user with 100 sessions generates hundreds of thousands of tokens of history. Injecting all of it into every new conversation is prohibitively expensive and fills the context window with irrelevant historical detail.
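A quick back-of-envelope sketch makes the scale problem concrete. All numbers below are illustrative assumptions, not measured figures:

```python
# Cost of replaying full history vs. injecting only top-K extracted memories.
# Token counts and price are illustrative assumptions.
TOKENS_PER_SESSION = 3_000       # assumed average conversation length
SESSIONS = 100                   # returning user with 100 past sessions
PRICE_PER_1K_INPUT = 0.0025      # assumed $ per 1K input tokens

full_history_tokens = TOKENS_PER_SESSION * SESSIONS
full_history_cost = full_history_tokens / 1000 * PRICE_PER_1K_INPUT

TOP_K_FACTS = 5                  # facts retrieved per new session
TOKENS_PER_FACT = 30             # assumed size of one extracted fact
memory_tokens = TOP_K_FACTS * TOKENS_PER_FACT

print(f"Full history: {full_history_tokens:,} tokens (${full_history_cost:.2f} per request)")
print(f"Top-K memories: {memory_tokens} tokens")
```

Under these assumptions, replaying history injects 300,000 tokens per request while retrieved memories need about 150 — a difference of three orders of magnitude that also grows with every session.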

What's needed is intelligent memory: a system that automatically extracts the important facts from conversations, stores them in a searchable memory store, and retrieves only the relevant memories for each new session. This is what Mem0 provides.

How Mem0 Works

Mem0 (pronounced "mem-zero") is an open-source memory layer for AI applications. It sits between your application and the LLM, automatically managing what to remember and what to recall.

The core mechanism has three phases:

  • Memory Extraction: After each conversation turn, Mem0 passes the exchange to an LLM with a prompt designed to extract meaningful facts. From "I'm building a SaaS product in Laravel for the healthcare industry, targeting small clinics", Mem0 extracts: [{"fact": "building SaaS product", "category": "project"}, {"fact": "using Laravel", "category": "tech_stack"}, {"fact": "healthcare industry", "category": "domain"}, {"fact": "targeting small clinics", "category": "customer_segment"}]
  • Memory Storage: Extracted facts are embedded (using OpenAI or a local embedding model) and stored in a vector database (Qdrant, pgvector, or Mem0's managed cloud). Facts are associated with a user_id, agent_id, and timestamp.
  • Memory Retrieval: At the start of each new session, Mem0 embeds the incoming query and retrieves the most semantically relevant memories. Only the top-K relevant facts (not all historical facts) are injected into the system prompt.
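The retrieval phase can be sketched with toy vectors. The 3-dimensional embeddings and fact strings below are illustrative only; Mem0 delegates this work to a real embedding model and vector store:

```python
import math

# Toy fact store: (fact, embedding) pairs. Real embeddings would come from a
# model like text-embedding-3-small; these 3-d vectors are made up.
FACTS = [
    ("building SaaS product", [0.9, 0.1, 0.0]),
    ("using Laravel",         [0.8, 0.2, 0.1]),
    ("prefers dark mode",     [0.0, 0.1, 0.9]),
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def top_k(query_vec, k=2):
    # Rank stored facts by cosine similarity to the query embedding,
    # return only the k best matches (not the full history).
    ranked = sorted(FACTS, key=lambda f: cosine(query_vec, f[1]), reverse=True)
    return [fact for fact, _ in ranked[:k]]

print(top_k([0.85, 0.15, 0.05]))  # the two project-related facts rank first
```

This is the essential trick: each new query only pulls in the handful of facts that are semantically close to it, regardless of how large the memory store grows.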

Installation and Basic Setup

pip install mem0ai

from mem0 import Memory
from openai import OpenAI

# Configure Mem0 with your vector store and LLM
config = {
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4o-mini",  # Use a cheap model for memory extraction
            "temperature": 0,
        }
    },
    "embedder": {
        "provider": "openai",
        "config": {"model": "text-embedding-3-small"}
    },
    "vector_store": {
        "provider": "qdrant",
        "config": {
            "collection_name": "ai_memories",
            "host": "localhost",
            "port": 6333,
        }
    }
}

memory = Memory.from_config(config)
openai_client = OpenAI()

Adding Memory to a Chatbot

def chat_with_memory(user_message: str, user_id: str) -> str:
    # Step 1: Retrieve relevant memories for this user + query
    relevant_memories = memory.search(
        query=user_message,
        user_id=user_id,
        limit=5  # Top 5 most relevant facts
    )

    # Step 2: Format memories as context
    memory_context = ""
    if relevant_memories["results"]:
        facts = [m["memory"] for m in relevant_memories["results"]]
        memory_context = "What you remember about this user:\n" + "\n".join(f"- {f}" for f in facts)

    # Step 3: Build prompt with memory context
    messages = [
        {
            "role": "system",
            "content": f"""You are a helpful AI assistant.
{memory_context}

Use the above memories to personalise your response.
Do not ask for information you already know."""
        },
        {"role": "user", "content": user_message}
    ]

    # Step 4: Get LLM response
    response = openai_client.chat.completions.create(
        model="gpt-4o",
        messages=messages
    )
    assistant_message = response.choices[0].message.content

    # Step 5: Store the conversation in memory (async in production)
    memory.add(
        messages=[
            {"role": "user", "content": user_message},
            {"role": "assistant", "content": assistant_message}
        ],
        user_id=user_id
    )

    return assistant_message
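Step 5 notes the memory write should be asynchronous in production, so the user isn't kept waiting on the extraction LLM call. A minimal background-thread sketch (the queue, worker, and stubbed store are illustrative, not part of Mem0's API):

```python
import queue
import threading

# Fire-and-forget writer: the request thread enqueues the exchange and returns
# immediately; a worker drains the queue. memory.add is stubbed with a list
# here so the sketch is self-contained.
write_queue = queue.Queue()
stored = []

def memory_worker():
    while True:
        job = write_queue.get()
        if job is None:            # sentinel: shut the worker down
            break
        stored.append(job)         # in production: memory.add(**job)
        write_queue.task_done()

worker = threading.Thread(target=memory_worker, daemon=True)
worker.start()

# A chat handler would enqueue and return the reply without waiting:
write_queue.put({
    "messages": [{"role": "user", "content": "hi"}],
    "user_id": "user_123",
})

write_queue.join()                 # demo only; a real handler would not block
write_queue.put(None)
worker.join()
print(len(stored))  # 1
```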

Memory Types: User, Agent, and Session

Mem0 supports three memory scopes:

  • User memory: Facts about a specific user — preferences, context, history. Persists across all sessions. user_id="user_123"
  • Agent memory: Facts the agent has learned globally — product knowledge, common patterns, learned preferences. Shared across all users. agent_id="support_bot"
  • Session memory: Facts relevant only to the current conversation. Scoped to a run_id and discarded when the session ends. Effectively a smarter context window that can semantically search its own history.
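A toy in-memory store makes the scoping rules concrete. The dict-based store below is illustrative only; Mem0 implements this via the user_id, agent_id, and run_id arguments to add and search:

```python
from collections import defaultdict

# Illustrative scoped store: each fact lives under exactly one scope key.
store = defaultdict(list)

def add_fact(fact, *, user_id=None, agent_id=None, run_id=None):
    # Mirror Mem0's scoping: user memory persists across sessions,
    # agent memory is shared across users, session memory dies with the run.
    if user_id:
        scope = ("user", user_id)
    elif agent_id:
        scope = ("agent", agent_id)
    else:
        scope = ("run", run_id)
    store[scope].append(fact)

add_fact("targets small clinics",   user_id="user_123")      # user memory
add_fact("refunds take five days",  agent_id="support_bot")  # agent memory
add_fact("discussing an invoice",   run_id="run_7")          # session memory

def end_session(run_id):
    store.pop(("run", run_id), None)   # session facts are discarded

end_session("run_7")
print(sorted(k[0] for k in store))  # ['agent', 'user']
```

User and agent facts survive the end of the session; only the run-scoped fact is gone.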

Integrating with Laravel and a Chatbot API

// In your Laravel chatbot controller
class ChatController extends Controller
{
    public function message(Request $request)
    {
        $userId = auth()->id();
        $message = $request->input("message");

        // Call your Python FastAPI memory service
        $response = Http::post("http://memory-service:8000/chat", [
            "user_id" => (string) $userId,
            "message" => $message,
        ]);

        return response()->json([
            "reply" => $response->json("reply"),
            "memories_used" => $response->json("memories_used"),
        ]);
    }
}
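On the Python side, the /chat endpoint this controller calls would wrap chat_with_memory. Here is a framework-agnostic sketch of the handler contract; the field names match the Laravel call above, but the handler itself and its echo default are assumptions for illustration:

```python
import json

def handle_chat(raw_body: str, chat_fn=lambda msg, uid: f"echo: {msg}") -> str:
    """Validate the JSON body sent by the Laravel controller and return the
    JSON reply it expects. In production, chat_fn would be chat_with_memory;
    the echo default keeps this sketch self-contained."""
    payload = json.loads(raw_body)
    user_id = payload["user_id"]
    message = payload["message"]
    reply = chat_fn(message, user_id)
    # memories_used is left empty here; a real service would surface the
    # facts retrieved in Step 1 of chat_with_memory.
    return json.dumps({"reply": reply, "memories_used": []})

body = json.dumps({"user_id": "42", "message": "hello"})
print(handle_chat(body))
```

In a FastAPI service this function body would sit inside a POST route; the important part is the contract: the service accepts user_id and message, and returns reply plus the memories it used, which the controller passes straight through.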

Memory Management: Update, Delete, Inspect

# Reusing the `memory` instance configured earlier

# View all memories for a user
memories = memory.get_all(user_id="user_123")
for m in memories["results"]:
    print(f"[{m['id']}] {m['memory']} (created: {m['created_at']})")

# Update a specific memory (e.g., user changed their stack)
memory.update(memory_id="mem_abc", data="Now using Next.js instead of Vue.js")

# Delete a specific memory
memory.delete(memory_id="mem_abc")

# Delete all memories for a user (GDPR compliance)
memory.delete_all(user_id="user_123")

The delete_all capability is important for GDPR compliance — "right to be forgotten" requests can be fulfilled with a single API call that removes all stored facts about a user.

Real-World Impact: Before and After

Without memory (standard chatbot):
Session 5, user returns: "Can you help me with my Laravel project?"
Bot: "Sure! What are you building? What's your tech stack? What's your experience level?"

With Mem0 (memory-enabled):
Session 5, user returns: "Can you help me with my Laravel project?"
Bot: "Of course! How's the SaaS platform for healthcare clinics coming along? Last time we were working on the multi-tenant architecture. What do you need help with today?"

The difference in user experience is dramatic. The chatbot feels like a knowledgeable colleague, not a stranger. In the production CRM chatbots I've worked on, memory-enabled agents saw roughly 40% higher session engagement and significantly lower re-explanation overhead: measurable business value from a relatively simple architectural addition.

#Mem0 #AI Memory #Chatbot #RAG #Vector Database #LangChain #OpenAI #Persistent Memory

Senior Full Stack Developer — Laravel, Vue.js, Nuxt.js & AI. Available for freelance projects.
