[AI Mem0] 源码解读,带你了解 Mem0 的实现

简介: [AI Mem0] 源码解读,带你了解 Mem0 的实现

Mem0 的 CRUD 到底是如何实现的?我们来看下源码。

使用

先来看下,如何使用 Mem0

import os
os.environ["OPENAI_API_KEY"] = "sk-xxx"

from mem0 import Memory

m = Memory()

# 1. Add: Store a memory from any unstructured text
result = m.add("I am working on improving my tennis skills. Suggest some online courses.", user_id="alice", metadata={
   "category": "hobbies"})

# Created memory --> 'Improving her tennis skills.' and 'Looking for online suggestions.'

# 2. Update: update the memory
result = m.update(memory_id=<memory_id_1>, data="Likes to play tennis on weekends")

# Updated memory --> 'Likes to play tennis on weekends.' and 'Looking for online suggestions.'

# 3. Search: search related memories
related_memories = m.search(query="What are Alice's hobbies?", user_id="alice")

# Retrieved memory --> 'Likes to play tennis on weekends'

# 4. Get all memories
all_memories = m.get_all()
memory_id = all_memories[0]["id"] # get a memory_id

# All memory items --> 'Likes to play tennis on weekends.' and 'Looking for online suggestions.'

# 5. Get memory history for a particular memory_id
history = m.history(memory_id=<memory_id_1>)

# Logs corresponding to memory_id_1 --> {'prev_value': 'Working on improving tennis skills and interested in online courses for tennis.', 'new_value': 'Likes to play tennis on weekends' }

MemoryBase

MemoryBase 是一个抽象类,定义了一些接口方法

  • get
  • get_all
  • update
  • delete
  • history
class MemoryBase(ABC):
    @abstractmethod
    def get(self, memory_id):
        """
        Retrieve a memory by ID.

        Args:
            memory_id (str): ID of the memory to retrieve.

        Returns:
            dict: Retrieved memory.
        """
        pass

    @abstractmethod
    def get_all(self):
        """
        List all memories.

        Returns:
            list: List of all memories.
        """
        pass

    @abstractmethod
    def update(self, memory_id, data):
        """
        Update a memory by ID.

        Args:
            memory_id (str): ID of the memory to update.
            data (dict): Data to update the memory with.

        Returns:
            dict: Updated memory.
        """
        pass

    @abstractmethod
    def delete(self, memory_id):
        """
        Delete a memory by ID.

        Args:
            memory_id (str): ID of the memory to delete.
        """
        pass

    @abstractmethod
    def history(self, memory_id):
        """
        Get the history of changes for a memory by ID.

        Args:
            memory_id (str): ID of the memory to get history for.

        Returns:
            list: List of changes for the memory.
        """
        pass

Memory

Memory 实现 MemoryBase 接口

class Memory(MemoryBase):

init

    def __init__(self, config: MemoryConfig = MemoryConfig()):
        self.config = config
        self.embedding_model = EmbedderFactory.create(self.config.embedder.provider)
        # Initialize the appropriate vector store based on the configuration
        vector_store_config = self.config.vector_store.config
        if self.config.vector_store.provider == "qdrant":
            self.vector_store = Qdrant(
                host=vector_store_config.host,
                port=vector_store_config.port,
                path=vector_store_config.path,
                url=vector_store_config.url,
                api_key=vector_store_config.api_key,
            )
        else:
            raise ValueError(
                f"Unsupported vector store type: {self.config.vector_store_type}"
            )

        self.llm = LlmFactory.create(self.config.llm.provider, self.config.llm.config)
        self.db = SQLiteManager(self.config.history_db_path)
        self.collection_name = self.config.collection_name
        self.vector_store.create_col(
            name=self.collection_name, vector_size=self.embedding_model.dims
        )
        self.vector_store.create_col(
            name=self.collection_name, vector_size=self.embedding_model.dims
        )
        capture_event("mem0.init", self)

初始化 embedding_model, vector_store(这里只能是 Qdrant), llm, db, collection_name

add

    def add(
        self,
        data,
        user_id=None,
        agent_id=None,
        run_id=None,
        metadata=None,
        filters=None,
        prompt=None,
    ):
        """
        Create a new memory.

        Args:
            data (str): Data to store in the memory.
            user_id (str, optional): ID of the user creating the memory. Defaults to None.
            agent_id (str, optional): ID of the agent creating the memory. Defaults to None.
            run_id (str, optional): ID of the run creating the memory. Defaults to None.
            metadata (dict, optional): Metadata to store with the memory. Defaults to None.
            filters (dict, optional): Filters to apply to the search. Defaults to None.

        Returns:
            str: ID of the created memory.
        """
  • 将用户 data 发给 llm ,得到 extracted_memories
  • 将用户 data 转成 embeddings
  • vector_store 根据 embeddings search 得到 existing_memories
  • 将新,老 memory 发给 llm 来 merge
  • 调用函数 _create_memory_tool 进行实际操作
    • vector_store insert
    • db add_history

get

    def get(self, memory_id):
        """
        Retrieve a memory by ID.

        Args:
            memory_id (str): ID of the memory to retrieve.

        Returns:
            dict: Retrieved memory.
        """
  • vector_store 根据 memory_id 去 get

get_all

    def get_all(self, user_id=None, agent_id=None, run_id=None, limit=100):
        """
        List all memories.

        Returns:
            list: List of all memories.
        """
  • vector_store 根据 collection_name, filters, limit 调用 list 接口

search

    def search(
        self, query, user_id=None, agent_id=None, run_id=None, limit=100, filters=None
    ):
        """
        Search for memories.

        Args:
            query (str): Query to search for.
            user_id (str, optional): ID of the user to search for. Defaults to None.
            agent_id (str, optional): ID of the agent to search for. Defaults to None.
            run_id (str, optional): ID of the run to search for. Defaults to None.
            limit (int, optional): Limit the number of results. Defaults to 100.
            filters (dict, optional): Filters to apply to the search. Defaults to None.

        Returns:
            list: List of search results.
        """
  • embedding_model 将 query 转 embeddings
  • vector_store 根据 embeddings search

update

    def update(self, memory_id, data):
        """
        Update a memory by ID.

        Args:
            memory_id (str): ID of the memory to update.
            data (dict): Data to update the memory with.

        Returns:
            dict: Updated memory.
        """
  • 调用 _update_memory_tool
    • existing_memory = self.vector_store.get
    • embeddings = self.embedding_model.embed(data)
    • self.vector_store.update
    • self.db.add_history

delete

    def delete(self, memory_id):
        """
        Delete a memory by ID.

        Args:
            memory_id (str): ID of the memory to delete.
        """
  • 调用 _delete_memory_tool
    • existing_memory = self.vector_store.get
    • self.vector_store.delete
    • self.db.add_history

delete_all

    def delete_all(self, user_id=None, agent_id=None, run_id=None):
        """
        Delete all memories.

        Args:
            user_id (str, optional): ID of the user to delete memories for. Defaults to None.
            agent_id (str, optional): ID of the agent to delete memories for. Defaults to None.
            run_id (str, optional): ID of the run to delete memories for. Defaults to None.
        """
  • memories = self.vector_store.list
  • foreach memories
    • _delete_memory_tool

history

    def history(self, memory_id):
        """
        Get the history of changes for a memory by ID.

        Args:
            memory_id (str): ID of the memory to get history for.

        Returns:
            list: List of changes for the memory.
        """
  • self.db.get_history

reset

    def reset(self):
        """
        Reset the memory store.
        """
  • self.vector_store.delete_col
  • self.db.reset()

AnonymousTelemetry

SQLiteManager

  • db 用的是 sqlite3
  • 一个记录历史的表
CREATE TABLE IF NOT EXISTS history (
    id TEXT PRIMARY KEY,
    memory_id TEXT,
    prev_value TEXT,
    new_value TEXT,
    event TEXT,
    timestamp DATETIME,
    is_deleted INTEGER
)

MemoryClient

class MemoryClient:
    """Client for interacting with the Mem0 API.

    This class provides methods to create, retrieve, search, and delete memories
    using the Mem0 API.

    Attributes:
        api_key (str): The API key for authenticating with the Mem0 API.
        host (str): The base URL for the Mem0 API.
        client (httpx.Client): The HTTP client used for making API requests.
    """

Embedding

class EmbeddingBase(ABC):
    @abstractmethod
    def embed(self, text):
        """
        Get the embedding for the given text.

        Args:
            text (str): The text to embed.

        Returns:
            list: The embedding vector.
        """
        pass
  • HuggingFaceEmbedding(model_name="multi-qa-MiniLM-L6-cos-v1")
  • Ollama(model="nomic-embed-text")
  • OpenAI(model="text-embedding-3-small")

LLM

class LLMBase(ABC):
    def __init__(self, config: Optional[BaseLlmConfig] = None):
        """Initialize a base LLM class

        :param config: LLM configuration option class, defaults to None
        :type config: Optional[BaseLlmConfig], optional
        """
        if config is None:
            self.config = BaseLlmConfig()
        else:
            self.config = config

    @abstractmethod
    def generate_response(self, messages):
        """
        Generate a response based on the given messages.

        Args:
            messages (list): List of message dicts containing 'role' and 'content'.

        Returns:
            str: The generated response.
        """
        pass
  • AWSBedrockLLM(anthropic.claude-3-5-sonnet-20240620-v1:0)
  • GroqLLM(llama3-70b-8192)
  • LiteLLM(gpt-4o)
  • OllamaLLM(llama3)
  • OpenAILLM(gpt-4o)
  • TogetherLLM(mistralai/Mixtral-8x7B-Instruct-v0.1)

VectorStore

class VectorStoreBase(ABC):
    @abstractmethod
    def create_col(self, name, vector_size, distance):
        """Create a new collection."""
        pass

    @abstractmethod
    def insert(self, name, vectors, payloads=None, ids=None):
        """Insert vectors into a collection."""
        pass

    @abstractmethod
    def search(self, name, query, limit=5, filters=None):
        """Search for similar vectors."""
        pass

    @abstractmethod
    def delete(self, name, vector_id):
        """Delete a vector by ID."""
        pass

    @abstractmethod
    def update(self, name, vector_id, vector=None, payload=None):
        """Update a vector and its payload."""
        pass

    @abstractmethod
    def get(self, name, vector_id):
        """Retrieve a vector by ID."""
        pass

    @abstractmethod
    def list_cols(self):
        """List all collections."""
        pass

    @abstractmethod
    def delete_col(self, name):
        """Delete a collection."""
        pass

    @abstractmethod
    def col_info(self, name):
        """Get information about a collection."""
        pass
  • 只有 Qdrant 一个实现

总结

  • 核心就是 Memory 类,实现了 MemoryBase 接口
  • 通过 embedding_model 来处理文本
  • 通过 vector_store 存储 embedding
  • 通过 llm 处理数据
  • 通过 db 记录 Memory 的历史

相关文章
|
5月前
|
人工智能 数据安全/隐私保护 异构计算
桌面版exe安装和Python命令行安装2种方法详细讲解图片去水印AI源码私有化部署Lama-Cleaner安装使用方法-优雅草卓伊凡
桌面版exe安装和Python命令行安装2种方法详细讲解图片去水印AI源码私有化部署Lama-Cleaner安装使用方法-优雅草卓伊凡
652 8
桌面版exe安装和Python命令行安装2种方法详细讲解图片去水印AI源码私有化部署Lama-Cleaner安装使用方法-优雅草卓伊凡
|
7月前
|
人工智能 自然语言处理 搜索推荐
从输入指令到代码落地:Cline AI 源码浅析
文章揭示了Cline如何将简单的自然语言指令转化为具体的编程任务,并执行相应的代码修改或生成操作。
910 18
从输入指令到代码落地:Cline AI 源码浅析
|
7月前
|
机器学习/深度学习 人工智能 数据可视化
基于YOLOv8的AI虫子种类识别项目|完整源码数据集+PyQt5界面+完整训练流程+开箱即用!
本项目基于YOLOv8与PyQt5开发,实现虫子种类识别,支持图片、视频、摄像头等多种输入方式,具备完整训练与部署流程,开箱即用,附带数据集与源码,适合快速搭建高精度昆虫识别系统。
基于YOLOv8的AI虫子种类识别项目|完整源码数据集+PyQt5界面+完整训练流程+开箱即用!
|
人工智能
AI对话网站一键生成系统源码
可以添加进自己的工具箱,也可以嵌入自己博客的页面中,引流效果杠杠的,新拟态设计风格,有能力的大佬可以进行二开,仅提供学习,用户可输入网站名称、AI默认的开场白、AI头像昵称、AI网站中引流的你的网站等等内容,所有生成的网页全部保存到你的服务器上
294 27
AI对话网站一键生成系统源码
|
人工智能
[AI Mem0] 快速开始:智能记忆管理,让你的数据活起来!
[AI Mem0] 快速开始:智能记忆管理,让你的数据活起来!
|
人工智能 自然语言处理 搜索推荐
[AI Mem0 Platform] 快速开始,为您的AI应用注入长期记忆和个性化能力!
[AI Mem0 Platform] 快速开始,为您的AI应用注入长期记忆和个性化能力!
1236 0
|
12月前
|
人工智能 算法 搜索推荐
AI大模型发展对语音直播交友系统源码开发搭建的影响
近年来,AI大模型技术的迅猛发展深刻影响了语音直播交友系统的开发与应用。本文探讨了AI大模型如何提升语音交互的自然流畅性、内容审核的精准度、个性化推荐的智能性以及虚拟主播的表现力,并分析其对开发流程和用户体验的变革。同时,展望了多模态交互、情感陪伴及元宇宙社交等未来发展方向,指出在把握机遇的同时需应对数据安全、算法偏见等挑战,以实现更智能、安全、有趣的语音直播交友平台。
|
12月前
|
机器学习/深度学习 自然语言处理 算法
生成式 AI 大语言模型(LLMs)核心算法及源码解析:预训练篇
生成式 AI 大语言模型(LLMs)核心算法及源码解析:预训练篇
3271 1
|
存储 人工智能 SEO
全开源免费AI网址导航网站源码
Aigotools 可以帮助用户快速创建和管理导航站点,内置站点管理和自动收录功能,同时提供国际化、SEO、多种图片存储方案。让用户可以快速部署上线自己的导航站。
1542 1