### HTTP Service - Starting the HTTP server

Source: https://context7.com/modelscope/memoryscope/llms.txt

Instructions on how to start the MemoryScope HTTP server with specified configurations.

````APIDOC
## HTTP Service

### Starting the HTTP server

Starts the MemoryScope HTTP server with custom configurations for backend, port, LLM, embedding model, and vector store.

### Example
```bash
reme \
  backend=http \
  http.port=8002 \
  llm.default.model_name=qwen3-30b-a3b-thinking-2507 \
  embedding_model.default.model_name=text-embedding-v4 \
  vector_store.default.backend=local
```
````

--------------------------------

### Read File Example

Source: https://github.com/modelscope/memoryscope/blob/main/docs/cli/quick_start_en.md

Example of using the `read` tool to access a specific daily log file.

```text
Read `memory/2025-02-13.md`
```

--------------------------------

### Complete Configuration Example (Embedding & Vector Store)

Source: https://github.com/modelscope/memoryscope/blob/main/docs/vector_store_api_guide.md

A comprehensive example combining embedding model and vector store configurations. This example uses an OpenAI-compatible embedding model and an Elasticsearch vector store.

```yaml
# Embedding model configuration
embedding_model:
  default:
    backend: openai_compatible
    model_name: text-embedding-v4
    params:
      dimensions: 1024

# Vector store configuration
vector_store:
  default:
    backend: elasticsearch
    embedding_model: default
    params:
      hosts: "http://localhost:9200"
```

```shell
# Embedding model configuration
embedding_model.default.backend=openai_compatible
embedding_model.default.model_name=text-embedding-v4
embedding_model.default.params.dimensions=1024

# Vector store configuration
vector_store.default.backend=elasticsearch
vector_store.default.params.hosts=http://localhost:9200
```
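
The shell form above is the YAML tree written as dotted `key=value` pairs. A minimal sketch of that correspondence follows; `flatten_config` is a hypothetical helper written for illustration, not part of ReMe. Note that it also emits `vector_store.default.embedding_model=default`, which the shell example above leaves out.

```python
from typing import Any, Dict, List


def flatten_config(config: Dict[str, Any], prefix: str = "") -> List[str]:
    """Flatten a nested config dict into dotted key=value overrides."""
    overrides: List[str] = []
    for key, value in config.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested sections, extending the dotted path.
            overrides.extend(flatten_config(value, dotted))
        else:
            overrides.append(f"{dotted}={value}")
    return overrides


# The same settings as the YAML example, written as a Python dict.
config = {
    "embedding_model": {
        "default": {
            "backend": "openai_compatible",
            "model_name": "text-embedding-v4",
            "params": {"dimensions": 1024},
        }
    },
    "vector_store": {
        "default": {
            "backend": "elasticsearch",
            "embedding_model": "default",
            "params": {"hosts": "http://localhost:9200"},
        }
    },
}

for override in flatten_config(config):
    print(override)
```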

--------------------------------

### Install ReMeLight from Source

Source: https://github.com/modelscope/memoryscope/blob/main/README.md

Clone the repository and install ReMeLight with its light dependencies. Ensure you are in the ReMe directory after cloning.

```bash
git clone https://github.com/agentscope-ai/ReMe.git
cd ReMe
pip install -e ".[light]"
```

--------------------------------

### Install BFCL Package

Source: https://github.com/modelscope/memoryscope/blob/main/benchmark/bfcl/quickstart.md

Navigate to the BFCL directory and install the package in editable mode. Then, install requirements and copy the dataset.

```bash
cd berkeley-function-call-leaderboard
pip install -e .
cd ../..
pip install -r requirements.txt
cp -r gorilla/berkeley-function-call-leaderboard/bfcl_eval/data ./
```

--------------------------------

### Install Benchmark Dependencies

Source: https://github.com/modelscope/memoryscope/blob/main/docs/tool_memory/tool_bench.md

Installs the necessary Python packages for running the benchmark. Ensure you have pip available.

```bash
pip install requests python-dotenv loguru tabulate
```

--------------------------------

### Install AppWorld Dependencies

Source: https://github.com/modelscope/memoryscope/blob/main/benchmark/appworld/quickstart.md

Install the required Python packages for AppWorld from `requirements.txt`. Then install AppWorld itself and download the necessary dataset.

```bash
pip install -r requirements.txt
pip install appworld
appworld install
appworld download data
```

--------------------------------

### Install ReMe CLI via PyPI

Source: https://github.com/modelscope/memoryscope/blob/main/docs/cli/quick_start_en.md

Install the ReMe AI package using pip. This is the recommended installation method.

```bash
pip install reme-ai==0.3.0.0b1
```

--------------------------------

### Install Gymnasium

Source: https://github.com/modelscope/memoryscope/blob/main/docs/cookbook/frozenlake/quickstart.md

Install the Gymnasium library, which is required for the FrozenLake environment.

```bash
pip install gymnasium
```

--------------------------------

### Configure Environment Variables

Source: https://github.com/modelscope/memoryscope/blob/main/docs/installation.md

Set up your API keys and base URLs for LLM and embedding services by copying the example environment file and modifying its contents.

```bash
FLOW_LLM_API_KEY=sk-xxxx
FLOW_LLM_BASE_URL=https://xxxx/v1
FLOW_EMBEDDING_API_KEY=sk-xxxx
FLOW_EMBEDDING_BASE_URL=https://xxxx/v1
```
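
Before starting a service, it can help to confirm that all four variables are actually set. The check below is a small sketch using only the standard library; `missing_env_vars` is a hypothetical helper, and the variable names are the ones listed above.

```python
import os
from typing import List, Mapping

# Variable names taken from the example environment file above.
REQUIRED_VARS = (
    "FLOW_LLM_API_KEY",
    "FLOW_LLM_BASE_URL",
    "FLOW_EMBEDDING_API_KEY",
    "FLOW_EMBEDDING_BASE_URL",
)


def missing_env_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```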

--------------------------------

### Example Usage: GrepOp and ReadFileOp Combination

Source: https://github.com/modelscope/memoryscope/blob/main/docs/work_memory/message_reload_ops.md

This example demonstrates how to integrate GrepOp and ReadFileOp within an agentic retrieval workflow. It highlights system prompt configuration and practical usage scenarios for message offload and reload.

```python
from modelscope.agent.tools.agentic_retrieve_op import AgenticRetrieveOp

# Refer to test_agentic_retrieve_op.py for a complete working example
# This test file demonstrates:
# - How to configure the system prompt to guide AI in using Grep and ReadFile operations
# - Real-world usage scenarios with message offload and reload
# - Proper parameter settings for AgenticRetrieveOp with working memory
# - Best practices for combining these operations in a retrieval workflow

# Example placeholder (actual implementation is in the referenced test file):
# retrieve_op = AgenticRetrieveOp()
# results = retrieve_op.call(query="find relevant information about X")

```

--------------------------------

### Install ReMe AI Package

Source: https://context7.com/modelscope/memoryscope/llms.txt

Install the full ReMe AI package from PyPI. For source installation with ReMeLight dependencies, clone the repository and install with specific extras.

```bash
pip install reme-ai
```

```bash
git clone https://github.com/agentscope-ai/ReMe.git
cd ReMe
pip install -e ".[light]"
```

--------------------------------

### Python MCP Client Setup and Connection

Source: https://github.com/modelscope/memoryscope/blob/main/docs/mcp_quick_start.md

Install the `fastmcp` and `python-dotenv` packages, then use this Python code to set up and connect to an MCP server using SSE transport. Ensure your environment variables are loaded.

```python
import asyncio
from fastmcp import Client
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# MCP server URL (for SSE transport)
MCP_URL = "http://0.0.0.0:8002/sse/"
WORKSPACE_ID = "my_workspace"


async def main():
    async with Client(MCP_URL) as client:
        # Your MCP operations will go here
        pass


if __name__ == "__main__":
    asyncio.run(main())
```

--------------------------------

### Install ReMe AI from Source

Source: https://github.com/modelscope/memoryscope/blob/main/docs/installation.md

Clone the repository and install ReMe AI locally. This is useful for development or when you need the latest unreleased features.

```bash
git clone https://github.com/agentscope-ai/ReMe.git
cd ReMe
pip install .
```

--------------------------------

### Setup API Client

Source: https://github.com/modelscope/memoryscope/blob/main/docs/personal_memory/personal_memory.md

Imports the asynchronous HTTP client dependencies and sets the base URL and workspace ID used for API interactions. Ensure the API is running before executing.

```python
import asyncio
import json
import aiohttp

# API base URL (default is http://0.0.0.0:8002)
base_url = "http://0.0.0.0:8002"
workspace_id = "personal_memory_demo"
```

--------------------------------

### Complete Task Memory Workflow Example

Source: https://github.com/modelscope/memoryscope/blob/main/docs/task_memory/task_memory.md

Demonstrates a full cycle of generating and retrieving task memories to augment agent responses.

```python
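import requests

# Assumed values; adjust to your deployment. BASE_URL needs the trailing slash
# because the endpoint names below are appended to it directly.
BASE_URL = "http://0.0.0.0:8002/"
WORKSPACE_ID = "task_memory_demo"  # hypothetical workspace name


def run_agent_with_memory(query_first, query_second):
    # run_agent() is assumed to be defined elsewhere and to return the message list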
    # Run agent with second query to build initial memories
    messages = run_agent(query=query_second)

    # Summarize conversation to create memories
    requests.post(
        url=f"{BASE_URL}summary_task_memory",
        json={
            "workspace_id": WORKSPACE_ID,
            "trajectories": [
                {"messages": messages, "score": 1.0}
            ]
        }
    )

    # Retrieve relevant memories for the first query
    response = requests.post(
        url=f"{BASE_URL}retrieve_task_memory",
        json={
            "workspace_id": WORKSPACE_ID,
            "query": query_first
        }
    )
    retrieved_memory = response.json().get("answer", "")

    # Run agent with first query augmented with retrieved memories
    augmented_query = f"{retrieved_memory}\n\nUser Question:\n{query_first}"
    return run_agent(query=augmented_query)
```

--------------------------------

### Tool Memory Lifecycle Example

Source: https://github.com/modelscope/memoryscope/blob/main/docs/tool_memory/tool_memory.md

Demonstrates the typical workflow for managing tool memory, including adding results, summarizing, retrieving guidelines, and using them for improved execution.

```python
# Step 1: Add tool call results (accumulate history)
add_tool_call_results([
    {"tool_name": "web_search", "input": {...}, "output": "...", "success": True},
    {"tool_name": "web_search", "input": {...}, "output": "...", "success": False},
    # ... more results
])

# Step 2: Generate usage guidelines (periodic)
summarize_tool_memory("web_search")

# Step 3: Retrieve guidelines before next use
memory = retrieve_tool_memory("web_search")
# Returns:
# "For web_search tool:
#  - Use max_results=5-20 for optimal performance
#  - Avoid generic queries, be specific
#  - Language parameter 'en' has 95% success rate
#  Statistics: 83% success, avg 2.3s, avg 150 tokens"

# Step 4: Agent uses guidelines for better execution
execute_with_recommended_parameters()
```
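
The statistics line in the returned guidelines (for example, the success rate) is the kind of figure that can be derived from the accumulated call history. The sketch below shows one way to compute it; `summarize_results` is a hypothetical helper for illustration and says nothing about how ReMe computes its own statistics.

```python
from typing import Dict, List


def summarize_results(results: List[Dict]) -> Dict[str, Dict[str, float]]:
    """Group call results by tool name and compute call count and success rate."""
    stats: Dict[str, Dict[str, float]] = {}
    for result in results:
        entry = stats.setdefault(result["tool_name"], {"calls": 0, "successes": 0})
        entry["calls"] += 1
        if result["success"]:
            entry["successes"] += 1
    for entry in stats.values():
        entry["success_rate"] = entry["successes"] / entry["calls"]
    return stats


# Illustrative history; only the fields used by the helper are included.
history = [
    {"tool_name": "web_search", "success": True},
    {"tool_name": "web_search", "success": False},
    {"tool_name": "web_search", "success": True},
    {"tool_name": "read_webpage", "success": True},
]

for tool_name, entry in summarize_results(history).items():
    print(f"{tool_name}: {entry['calls']} calls, {entry['success_rate']:.0%} success")
```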

--------------------------------

### Install ReMe CLI from Source

Source: https://github.com/modelscope/memoryscope/blob/main/docs/cli/quick_start_en.md

Clone the ReMe repository and install it locally using pip. This method is useful for development or if you need the latest unreleased features.

```bash
git clone https://github.com/agentscope-ai/ReMe.git
cd ReMe
pip install -e .
```

--------------------------------

### Set up ReMe Environment

Source: https://github.com/modelscope/memoryscope/blob/main/benchmark/longmemeval/quickstart.md

Installs the ReMe environment using conda and pip. Ensure you have conda installed and activated.

```bash
conda create -p ./reme-env python==3.12
conda activate ./reme-env

pip install .
```

--------------------------------

### Memory Search Example

Source: https://github.com/modelscope/memoryscope/blob/main/docs/cli/quick_start_en.md

Example of using `memory_search` for a semantic search query.

```text
"previous discussion about deployment"
```

--------------------------------

### Start ReMe API Server

Source: https://github.com/modelscope/memoryscope/blob/main/docs/tool_memory/tool_bench.md

Starts the ReMe API server on a specified port. This is required before executing the benchmark.

```bash
# Start ReMe API server
python reme_ai/app.py --port 8002
```
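
Because the benchmark depends on the server being up, a script can wait for the port to accept connections before continuing. The following is a minimal standard-library sketch; `wait_for_port` is a hypothetical helper, and it only checks that something is listening on the TCP port, not that the service is healthy.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means a process is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

For the command above, a call such as `wait_for_port("127.0.0.1", 8002)` would return `True` once the server is listening and `False` if it has not started within 30 seconds.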

--------------------------------

### Start ReMe Service

Source: https://github.com/modelscope/memoryscope/blob/main/benchmark/bfcl/quickstart.md

Launch the ReMe service with specified backend, port, model names, and vector store configuration.

```bash
reme2 \
  backend=http \
  http.port=8002 \
  llms.default.model_name=qwen3-8b \
  embedding_models.default.model_name=text-embedding-v4 \
  vector_stores.default.backend=local \
  vector_stores.default.collection_name=bfcl
```

--------------------------------

### Install ReMe and Dependencies

Source: https://github.com/modelscope/memoryscope/blob/main/docs/cookbook/frozenlake/quickstart.md

Install ReMe and its dependencies, optionally within a new virtual environment, to enable memory library functionality.

```bash
cd ../..

conda create -p ./reme-env python==3.10
conda activate ./reme-env

pip install .
```

--------------------------------

### Configure Vector Store via Command Line

Source: https://github.com/modelscope/memoryscope/blob/main/docs/vector_store_api_guide.md

Example of setting vector store configuration parameters using command-line arguments. This is useful for overriding default settings or providing backend-specific parameters.

```shell
vector_store.default.backend=<backend_name>
vector_store.default.params.<param_name>=<param_value>
```
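
To make the dotted syntax concrete, the sketch below parses such arguments into a nested dictionary. `parse_overrides` is a hypothetical helper for illustration; it keeps every value as a string and does not reproduce ReMe's own argument handling.

```python
from typing import Any, Dict, Iterable


def parse_overrides(args: Iterable[str]) -> Dict[str, Any]:
    """Turn dotted key=value arguments into a nested dict (values stay strings)."""
    config: Dict[str, Any] = {}
    for arg in args:
        # Split on the first "=" only, so values may themselves contain "=".
        dotted_key, separator, value = arg.partition("=")
        if not separator or not dotted_key:
            raise ValueError(f"expected key=value, got {arg!r}")
        *parents, leaf = dotted_key.split(".")
        node = config
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return config


parsed = parse_overrides([
    "vector_store.default.backend=qdrant",
    "vector_store.default.params.port=6333",
])
print(parsed)
```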

--------------------------------

### Start HTTP Service

Source: https://github.com/modelscope/memoryscope/blob/main/docs/quick_start.md

Starts the MemoryScope service with an HTTP backend. Configure the port, LLM, embedding model, and vector store.

```bash
reme \
  backend=http \
  http.port=8002 \
  llm.default.model_name=qwen3-30b-a3b-thinking-2507 \
  embedding_model.default.model_name=text-embedding-v4 \
  vector_store.default.backend=local
```

--------------------------------

### Initialize ReMeLight File-Based Memory

Source: https://context7.com/modelscope/memoryscope/llms.txt

Initialize a ReMeLight instance, setting up the working directory and configuring default models and search parameters. Await `start()` before use and `close()` when done.

```python
import asyncio
from reme.reme_light import ReMeLight

async def main():
    reme = ReMeLight(
        working_dir=".reme",                          # root directory for all data
        default_as_llm_config={"model_name": "qwen3.5-35b-a3b"},
        default_embedding_model_config={"model_name": "text-embedding-v4"},
        default_file_store_config={"fts_enabled": True, "vector_enabled": True},
        vector_weight=0.7,          # weight for vector search in hybrid retrieval
        candidate_multiplier=3.0,   # retrieve 3x candidates before re-ranking
        enable_load_env=True,       # load .env automatically
    )
    await reme.start()
    # ... use reme ...
    await reme.close()

asyncio.run(main())
```

--------------------------------

### Install Pre-commit Hooks

Source: https://github.com/modelscope/memoryscope/blob/main/docs/contribution.md

Install pre-commit to automatically check and format your code before committing. This ensures code quality and consistency.

```bash
pip install pre-commit
```

```bash
pre-commit install
```

--------------------------------

### Set Up ReMe Environment

Source: https://github.com/modelscope/memoryscope/blob/main/benchmark/appworld/quickstart.md

Navigate to the project root, create a new conda environment for ReMe, activate it, and install ReMe. This environment is necessary for memory library functionality.

```bash
# Go back to the project root
cd ../..

# Create ReMe environment
conda create -p ./reme-env python==3.12
conda activate ./reme-env

# Install ReMe
pip install .
```

--------------------------------

### Install ReMe AI from PyPI

Source: https://github.com/modelscope/memoryscope/blob/main/docs/installation.md

Use this command to install the latest stable version of ReMe AI from the Python Package Index.

```bash
pip install reme-ai
```

--------------------------------

### Get tool usage guidelines before invocation

Source: https://context7.com/modelscope/memoryscope/llms.txt

Retrieve pre-generated usage guidelines for a tool before invoking it. This can help in optimizing tool calls and avoiding common pitfalls.

```bash
curl -X POST http://localhost:8002/retrieve_tool_memory \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "tool_workspace",
    "tool_names": "web_search"
  }'
```
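
The same request can be issued from Python with only the standard library. This is a sketch under stated assumptions: `join_url`, `post_json`, and `retrieve_tool_guidelines` are hypothetical helpers, the server is assumed to be reachable at the URL used in the curl command, and the response is assumed to carry the guidelines in an `answer` field.

```python
import json
import urllib.request
from typing import Any, Dict


def join_url(base_url: str, endpoint: str) -> str:
    """Join base URL and endpoint with exactly one slash between them."""
    return base_url.rstrip("/") + "/" + endpoint.lstrip("/")


def post_json(base_url: str, endpoint: str, payload: Dict[str, Any],
              timeout: float = 30.0) -> Dict[str, Any]:
    """POST a JSON payload and decode the JSON response body."""
    request = urllib.request.Request(
        join_url(base_url, endpoint),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def retrieve_tool_guidelines(base_url: str, workspace_id: str, tool_name: str) -> str:
    """Fetch guidelines for one tool; returns an empty string if none are present."""
    response = post_json(
        base_url,
        "retrieve_tool_memory",
        {"workspace_id": workspace_id, "tool_names": tool_name},
    )
    return response.get("answer", "")
```

With the server running, `retrieve_tool_guidelines("http://localhost:8002", "tool_workspace", "web_search")` mirrors the curl call above.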

--------------------------------

### Start MCP server with STDIO transport

Source: https://context7.com/modelscope/memoryscope/llms.txt

Configure and start the MCP server using STDIO transport, suitable for local execution and most MCP clients. Ensure the backend, transport, and model names are correctly specified.

```bash
# STDIO transport (for Claude Desktop and most MCP clients)
reme \
  backend=mcp \
  mcp.transport=stdio \
  llm.default.model_name=qwen3-30b-a3b-thinking-2507 \
  embedding_model.default.model_name=text-embedding-v4 \
  vector_store.default.backend=local
```

--------------------------------

### Example Tool Descriptions

Source: https://github.com/modelscope/memoryscope/blob/main/docs/tool_memory/tool_memory.md

Illustrates ambiguous tool descriptions that highlight the need for richer context beyond static documentation.

```text
Tool A: "Search the web for information"
Tool B: "Perform web searches with customizable parameters"
Tool C: "Query search engines and return results"
```

--------------------------------

### Start MCP server with SSE transport

Source: https://context7.com/modelscope/memoryscope/llms.txt

Configure and start the MCP server using SSE transport for HTTP-based clients. Specify the port, backend, transport, and model names.

```bash
# SSE transport (for HTTP-based MCP clients)
reme \
  backend=mcp \
  mcp.transport=sse \
  http_service.port=8001 \
  llm.default.model_name=qwen3-30b-a3b-thinking-2507 \
  embedding_model.default.model_name=text-embedding-v4 \
  vector_store.default.backend=local
```

--------------------------------

### MCP Server Support

Source: https://github.com/modelscope/memoryscope/blob/main/docs/quick_start.md

This command starts the MemoryScope MCP server with specified configurations for transport, LLM model, embedding model, and vector store.

````APIDOC
## MCP Server Support

```bash
reme \
  backend=mcp \
  mcp.transport=stdio \
  llm.default.model_name=qwen3-30b-a3b-thinking-2507 \
  embedding_model.default.model_name=text-embedding-v4 \
  vector_store.default.backend=local
```
````

--------------------------------

### Initialize and Use ReMe for Memory Management in Python

Source: https://github.com/modelscope/memoryscope/blob/main/README.md

Demonstrates how to initialize the ReMe client with various configurations for LLM, embedding models, and vector stores. It covers common memory operations like summarizing, retrieving, adding, getting, updating, listing, and deleting memories for a user. Ensure proper async handling and resource closure.

```python
import asyncio

from reme import ReMe


async def main():
    # Initialize ReMe
    reme = ReMe(
        working_dir=".reme",
        default_llm_config={
            "backend": "openai",
            "model_name": "qwen3.5-plus",
        },
        default_embedding_model_config={
            "backend": "openai",
            "model_name": "text-embedding-v4",
            "dimensions": 1024,
        },
        default_vector_store_config={
            "backend": "local",  # Supports local/chroma/qdrant/elasticsearch/obvec/zvec/hologres
        },
    )
    await reme.start()

    messages = [
        {"role": "user", "content": "Help me write a Python script", "time_created": "2026-02-28 10:00:00"},
        {"role": "assistant", "content": "Sure, I'll help you with that.", "time_created": "2026-02-28 10:00:05"},
    ]

    # 1. Summarize memories from conversation (automatically extract user preferences, task experience, etc.)
    result = await reme.summarize_memory(
        messages=messages,
        user_name="alice",  # Personal memory
        # task_name="code_writing",  # Procedural memory
    )
    print(f"Summary result: {result}")

    # 2. Retrieve related memories
    memories = await reme.retrieve_memory(
        query="Python programming",
        user_name="alice",
        # task_name="code_writing",
    )
    print(f"Retrieved memories: {memories}")

    # 3. Manually add a memory
    memory_node = await reme.add_memory(
        memory_content="The user prefers concise code style.",
        user_name="alice",
    )
    print(f"Added memory: {memory_node}")
    memory_id = memory_node.memory_id

    # 4. Get a single memory by ID
    fetched_memory = await reme.get_memory(memory_id=memory_id)
    print(f"Fetched memory: {fetched_memory}")

    # 5. Update memory content
    updated_memory = await reme.update_memory(
        memory_id=memory_id,
        user_name="alice",
        memory_content="The user prefers concise code with comments.",
    )
    print(f"Updated memory: {updated_memory}")

    # 6. List all memories for the user (supports filtering and sorting)
    all_memories = await reme.list_memory(
        user_name="alice",
        limit=10,
        sort_key="time_created",
        reverse=True,
    )
    print(f"User memory list: {all_memories}")

    # 7. Delete a specific memory
    await reme.delete_memory(memory_id=memory_id)
    print(f"Deleted memory: {memory_id}")

    # 8. Delete all memories (use with care)
    # await reme.delete_all()

    await reme.close()


if __name__ == "__main__":
    asyncio.run(main())

```

--------------------------------

### Python Model Configuration Example

Source: https://github.com/modelscope/memoryscope/blob/main/docs/cookbook/working/quick_start.md

Configure an OpenAI-compatible LLM by setting environment variables for API keys and base URLs. The model name is specified directly in the Python script.

```python
model_name = "qwen3-coder-30b-a3b-instruct"
agent = ReactAgent(model_name=model_name, max_steps=50)
```

--------------------------------

### Python ReMeLight Usage Example

Source: https://github.com/modelscope/memoryscope/blob/main/README.md

Demonstrates the initialization and various memory management operations of ReMeLight, including context checking, memory compaction, tool result compaction, pre-reasoning hooks, memory summarization, semantic search, and in-session memory management. Replace the `[...]` placeholder with your actual conversation messages.

```python
import asyncio

from reme.reme_light import ReMeLight


async def main():
    # Initialize ReMeLight
    reme = ReMeLight(
        default_as_llm_config={"model_name": "qwen3.5-35b-a3b"},
        # default_embedding_model_config={"model_name": "text-embedding-v4"},
        default_file_store_config={"fts_enabled": True, "vector_enabled": False},
        enable_load_env=True,
    )
    await reme.start()

    messages = [...]  # List of conversation messages

    # 1. Check context size (token counting, determine if compaction is needed)
    messages_to_compact, messages_to_keep, is_valid = await reme.check_context(
        messages=messages,
        memory_compact_threshold=90000,  # Threshold to trigger compaction (tokens)
        memory_compact_reserve=10000,  # Token count to reserve for recent messages
    )

    # 2. Compact conversation history into a structured summary
    summary = await reme.compact_memory(
        messages=messages,
        previous_summary="",
        max_input_length=128000,  # Model context window (tokens)
        compact_ratio=0.7,  # Trigger compaction when exceeding max_input_length * 0.7
        language="zh",  # Summary language (e.g., "zh" / "")
    )

    # 3. Compact long tool outputs (prevent tool results from blowing up context)
    messages = await reme.compact_tool_result(messages)

    # 4. Pre-reasoning hook (auto compact tool results + check context + generate summaries)
    processed_messages, compressed_summary = await reme.pre_reasoning_hook(
        messages=messages,
        system_prompt="You are a helpful AI assistant.",
        compressed_summary="",
        max_input_length=128000,
        compact_ratio=0.7,
        memory_compact_reserve=10000,
        enable_tool_result_compact=True,
        tool_result_compact_keep_n=3,
    )

    # 5. Persist important memory to files (writes to memory/YYYY-MM-DD.md)
    summary_result = await reme.summary_memory(
        messages=messages,
        language="zh",
    )

    # 6. Semantic memory search (vector + BM25 hybrid retrieval)
    result = await reme.memory_search(query="Python version preference", max_results=5)

    # 7. Create in-session memory instance (manages context for one conversation)
    memory = reme.get_in_memory_memory()  # Auto-configures dialog_path
    for msg in messages:
        await memory.add(msg)
    token_stats = await memory.estimate_tokens(max_input_length=128000)
    print(f"Current context usage: {token_stats['context_usage_ratio']:.1f}%")
    print(f"Message token count: {token_stats['messages_tokens']}")
    print(f"Estimated total tokens: {token_stats['estimated_tokens']}")

    # 8. Mark messages as compressed (auto-persists to dialog/YYYY-MM-DD.jsonl)
    # await memory.mark_messages_compressed(messages_to_compact)

    # Shutdown ReMeLight
    await reme.close()


if __name__ == "__main__":
    asyncio.run(main())

```
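
The comments on `memory_compact_threshold` and `memory_compact_reserve` suggest a split between older messages to compact and recent messages to keep. The sketch below illustrates that idea only: `estimate_tokens` (a crude four-characters-per-token guess) and `split_for_compaction` are hypothetical, and the actual behaviour of `check_context` may differ.

```python
from typing import Dict, List, Tuple


def estimate_tokens(message: Dict[str, str]) -> int:
    """Very rough token estimate (assumption: about 4 characters per token)."""
    return max(1, len(message.get("content", "")) // 4)


def split_for_compaction(
    messages: List[Dict[str, str]], threshold: int, reserve: int
) -> Tuple[List[Dict[str, str]], List[Dict[str, str]]]:
    """Return (messages_to_compact, messages_to_keep)."""
    total = sum(estimate_tokens(message) for message in messages)
    if total <= threshold:
        # Under the threshold: nothing needs compaction.
        return [], list(messages)
    kept_tokens = 0
    split_index = len(messages)
    # Walk backwards, keeping recent messages while they fit in the reserve.
    for index in range(len(messages) - 1, -1, -1):
        cost = estimate_tokens(messages[index])
        if kept_tokens + cost > reserve:
            break
        kept_tokens += cost
        split_index = index
    return list(messages[:split_index]), list(messages[split_index:])


# Five messages of roughly 100 estimated tokens each.
demo_messages = [{"role": "user", "content": "x" * 400} for _ in range(5)]
to_compact, to_keep = split_for_compaction(demo_messages, threshold=300, reserve=250)
print(len(to_compact), len(to_keep))
```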

--------------------------------

### Run Local seekdb Docker Container

Source: https://github.com/modelscope/memoryscope/blob/main/docs/vector_store_api_guide.md

Use this command to start a local seekdb instance using Docker. Ensure you replace `<your_root_password>` with your desired password.

```bash
docker run -d --name reme_seekdb -p 2881:2881 -e ROOT_PASSWORD=<your_root_password> quay.io/oceanbase/seekdb:latest
```

--------------------------------

### Configure Vector Store in YAML

Source: https://github.com/modelscope/memoryscope/blob/main/docs/vector_store_api_guide.md

Example of how to configure a vector store backend in a YAML configuration file. Specify the backend type and the embedding model to use.

```yaml
vector_store:
  default:
    backend: <backend_name>        # Required: vector store backend type
    embedding_model: default       # Required: name of embedding model config
    params:                        # Optional: backend-specific parameters
      # Backend-specific parameters
```
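
The required and optional fields noted in the comments can be checked before a configuration is used. The sketch below assumes the YAML has already been loaded into a Python dict; `validate_vector_store` is a hypothetical helper, not part of ReMe.

```python
from typing import Any, Dict, List


def validate_vector_store(entry: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one vector store config entry."""
    problems: List[str] = []
    for field in ("backend", "embedding_model"):
        value = entry.get(field)
        if not isinstance(value, str) or not value:
            problems.append(f"missing required field: {field}")
    # params is optional; an empty YAML mapping may load as None.
    params = entry.get("params")
    if params is not None and not isinstance(params, dict):
        problems.append("params must be a mapping")
    return problems


print(validate_vector_store({"backend": "local", "embedding_model": "default"}))
print(validate_vector_store({"backend": "local"}))
```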

--------------------------------

### Advanced ReMe MCP Server Configuration

Source: https://github.com/modelscope/memoryscope/blob/main/docs/mcp_quick_start.md

An example of a full configuration for the ReMe MCP server, demonstrating advanced options like binding to a specific host and port, and using different backends for vector storage.

```bash
# Full configuration example
reme \
  backend=mcp \
  mcp.transport=stdio \
  http_service.host=0.0.0.0 \
  http_service.port=8002 \
  llm.default.model_name=qwen3-30b-a3b-thinking-2507 \
  embedding_model.default.model_name=text-embedding-v4 \
  vector_store.default.backend=elasticsearch
```

--------------------------------

### Web Research Agent Token Accumulation Example

Source: https://github.com/modelscope/memoryscope/blob/main/docs/work_memory/message_offload.md

Illustrates how token count rapidly increases with each tool call in a web research agent, leading to potential context window exhaustion.

```text
Iteration 1: web_search("AI context management") → 3,500 tokens
Iteration 2: read_webpage(url_1) → 8,200 tokens
Iteration 3: web_search("context compression techniques") → 4,100 tokens
Iteration 4: read_webpage(url_2) → 7,800 tokens
...
Iteration 15: summarize_findings() → Total context: 95,000 tokens
```
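
The accumulation is plain running-sum arithmetic. The sketch below reproduces it for the first four iterations listed above and reports when a budget is first exceeded; the helper names and the 15,000-token budget are chosen for illustration only.

```python
from itertools import accumulate
from typing import List, Optional, Tuple

# Per-call token counts for the first four iterations listed above.
tool_calls: List[Tuple[str, int]] = [
    ("web_search", 3_500),
    ("read_webpage", 8_200),
    ("web_search", 4_100),
    ("read_webpage", 7_800),
]


def running_totals(calls: List[Tuple[str, int]]) -> List[int]:
    """Cumulative context size after each tool call."""
    return list(accumulate(tokens for _, tokens in calls))


def first_iteration_over(calls: List[Tuple[str, int]], budget: int) -> Optional[int]:
    """1-based iteration at which the running total first exceeds the budget."""
    for iteration, total in enumerate(running_totals(calls), start=1):
        if total > budget:
            return iteration
    return None


print(running_totals(tool_calls))
print(first_iteration_over(tool_calls, budget=15_000))
```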

--------------------------------

### Configure ReMe Vector Store Backends

Source: https://context7.com/modelscope/memoryscope/llms.txt

Shows how to initialize the ReMe client with different vector store backends for development and production. Supports local file-based, Qdrant, Elasticsearch, and ChromaDB.

```python
from reme import ReMe

# Local file-based (development)
reme = ReMe(default_vector_store_config={"backend": "local"})

# Qdrant (production)
reme = ReMe(default_vector_store_config={
    "backend": "qdrant",
    "params": {"host": "localhost", "port": 6333, "distance": "COSINE"},
})

# Elasticsearch (hybrid search)
reme = ReMe(default_vector_store_config={
    "backend": "elasticsearch",
    "params": {"hosts": "http://localhost:9200"},
})

# ChromaDB
reme = ReMe(default_vector_store_config={
    "backend": "chroma",
    "params": {"store_dir": "./chroma_store"},
})

```

--------------------------------

### Start HTTP Service

Source: https://github.com/modelscope/memoryscope/blob/main/docs/cookbook/working/quick_start.md

Starts the HTTP service that exposes the flow execution endpoint for memory operations.

```bash
reme backend=http http.port=8003
```

--------------------------------

### Configure QdrantVectorStore (Local)

Source: https://github.com/modelscope/memoryscope/blob/main/docs/vector_store_api_guide.md

Configuration for a local Qdrant instance. Specify `host`, `port`, and `distance` metric.

```yaml
vector_store:
  default:
    backend: qdrant
    embedding_model: default
    params:
      host: "localhost"      # Qdrant server host (optional; default: localhost)
      port: 6333             # Qdrant server port (optional; default: 6333)
      distance: "COSINE"     # Distance metric (optional; default: COSINE; options: COSINE, EUCLIDEAN, DOT)
```

```shell
vector_store.default.backend=qdrant
vector_store.default.params.host=localhost
vector_store.default.params.port=6333
vector_store.default.params.distance=COSINE
```
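
When these parameters are assembled in Python, the documented defaults and the allowed distance metrics can be enforced up front. This is a sketch only; `qdrant_params` is a hypothetical helper, and the defaults and options are the ones stated in the YAML comments above.

```python
from typing import Any, Dict

# Distance metrics listed in the configuration comments above.
ALLOWED_DISTANCES = ("COSINE", "EUCLIDEAN", "DOT")


def qdrant_params(host: str = "localhost", port: int = 6333,
                  distance: str = "COSINE") -> Dict[str, Any]:
    """Build the params mapping for a local Qdrant backend, validating inputs."""
    normalized = distance.upper()
    if normalized not in ALLOWED_DISTANCES:
        raise ValueError(f"distance must be one of {ALLOWED_DISTANCES}, got {distance!r}")
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return {"host": host, "port": port, "distance": normalized}


print(qdrant_params())
print(qdrant_params(distance="dot"))
```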

--------------------------------

### Execute Full Benchmark

Source: https://github.com/modelscope/memoryscope/blob/main/docs/tool_memory/tool_bench.md

Runs the complete benchmark process: 3 epochs with 60+60 queries per epoch.

```bash
# Full benchmark (3 epochs, 60+60 queries per epoch)
python cookbook/tool_memory/run_reme_tool_bench.py
```

--------------------------------

### Run Working Memory Demo Script

Source: https://github.com/modelscope/memoryscope/blob/main/docs/cookbook/working/quick_start.md

Navigate to the demo directory and execute the Python script to run the working memory demonstration.

```bash
cd cookbook/working_memory
python work_memory_demo.py
```

--------------------------------

### Start MCP Service

Source: https://github.com/modelscope/memoryscope/blob/main/docs/cookbook/working/quick_start.md

Starts the Model Context Protocol (MCP) service, which provides tools for working memory management like searching and reading content.

```bash
reme backend=mcp mcp.port=8002
```

--------------------------------

### All-in-one pre-reasoning context manager

Source: https://context7.com/modelscope/memoryscope/llms.txt

Employ `pre_reasoning_hook` as the entry point before each agent reasoning step. It combines tool-result compaction, context checking, synchronous `compact_memory`, and asynchronous `summary_memory`.

```python
processed_messages, compressed_summary = await reme.pre_reasoning_hook(
    messages=messages,
    system_prompt="You are a helpful AI assistant.",
    compressed_summary="",          # pass previous summary for continuity
    max_input_length=128_000,
    compact_ratio=0.7,
    memory_compact_reserve=10_000,
    enable_tool_result_compact=True,
    tool_result_compact_keep_n=3,   # do not compact the 3 most recent messages
    language="zh",
)

print(f"Messages reduced: {len(messages)} → {len(processed_messages)}")
print(f"Summary length:   {len(compressed_summary)} chars")
# Messages reduced: 22 → 4
# Summary length:   1842 chars

```
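
The relationship between `max_input_length` and `compact_ratio` is simple arithmetic: compaction becomes relevant once the context exceeds their product. The sketch below assumes that reading of the two parameters; `needs_compaction` is a hypothetical helper, not the hook's internal logic.

```python
def needs_compaction(current_tokens: int, max_input_length: int = 128_000,
                     compact_ratio: float = 0.7) -> bool:
    """True when the context exceeds max_input_length * compact_ratio."""
    return current_tokens > max_input_length * compact_ratio


# With the values used above, the trigger sits at 70% of the context window.
for tokens in (50_000, 95_000):
    print(tokens, needs_compaction(tokens))
```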

--------------------------------

### Start ReMe MCP Server with SSE Transport

Source: https://github.com/modelscope/memoryscope/blob/main/docs/mcp_quick_start.md

Start the ReMe MCP server using the Server-Sent Events (SSE) transport protocol. This option requires specifying an HTTP port.

```bash
reme \
  backend=mcp \
  mcp.transport=sse \
  http_service.port=8001 \
  llm.default.model_name=qwen3-30b-a3b-thinking-2507 \
  embedding_model.default.model_name=text-embedding-v4 \
  vector_store.default.backend=local
```

--------------------------------

### System Commands for Conversation Control

Source: https://github.com/modelscope/memoryscope/blob/main/docs/cli/quick_start_zh.md

Commands beginning with `/` control the conversation state: compact the context manually, start a new conversation, clear or view the history, get help, and exit.

```bash
/compact
```

```bash
/new
```

```bash
/clear
```

```bash
/history
```

```bash
/help
```

```bash
/exit
```

--------------------------------

### Generate usage guidelines from tool history

Source: https://context7.com/modelscope/memoryscope/llms.txt

Create usage guidelines for specific tools based on their historical execution data. This helps in understanding optimal tool usage patterns.

```bash
curl -X POST http://localhost:8002/summary_tool_memory \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "tool_workspace",
    "tool_names": "web_search"
  }'
```
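
The same request can be built from Python. The sketch below uses only the standard library and mirrors the curl call above. The helper name `build_summary_request` is illustrative, not part of ReMe, and the default host and port are assumed to match the curl example.

```python
import json
import urllib.request


def build_summary_request(workspace_id: str, tool_names: str,
                          base_url: str = "http://localhost:8002") -> urllib.request.Request:
    """Build (but do not send) the POST request for /summary_tool_memory."""
    payload = {"workspace_id": workspace_id, "tool_names": tool_names}
    return urllib.request.Request(
        url=f"{base_url}/summary_tool_memory",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_summary_request("tool_workspace", "web_search")
print(request.get_method(), request.full_url)

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes the payload easy to inspect or log before any network call is made.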

--------------------------------

### POST /retrieve_tool_memory

Source: https://context7.com/modelscope/memoryscope/llms.txt

Get tool usage guidelines before invocation.

```APIDOC
## POST /retrieve_tool_memory

### Description
Retrieves pre-generated usage guidelines for a specific tool before it is invoked.

### Method
POST

### Endpoint
/retrieve_tool_memory

### Request Body
- **workspace_id** (string) - Required - Identifier for the workspace.
- **tool_names** (string) - Required - The name of the tool for which to retrieve guidelines.

### Request Example
```json
{
  "workspace_id": "tool_workspace",
  "tool_names": "web_search"
}
```

### Response
#### Success Response (200)
- **answer** (string) - The usage guidelines for the specified tool.
```
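
One way a caller might use the response is to pull out the `answer` field and prepend it to the prompt sent to the agent. The sketch below is illustrative only: the helper names and the sample response body are hypothetical, and only the `answer` field is taken from the documentation above.

```python
import json


def extract_guidelines(response_body: str) -> str:
    """Return the `answer` field of a /retrieve_tool_memory response, or "" if absent."""
    data = json.loads(response_body)
    return data.get("answer", "")


def augment_prompt(user_query: str, guidelines: str) -> str:
    """Prepend tool guidelines to the query when any were retrieved."""
    if not guidelines:
        return user_query
    return f"Tool usage guidelines:\n{guidelines}\n\nUser question:\n{user_query}"


# Hypothetical response body, shaped like the documented success response.
sample_body = json.dumps({"answer": "Prefer short, specific queries for web_search."})
prompt = augment_prompt("Find recent papers on agent memory", extract_guidelines(sample_body))
print(prompt)
```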

--------------------------------

### Preprocess BFCL Data

Source: https://github.com/modelscope/memoryscope/blob/main/benchmark/bfcl/quickstart.md

Preprocess the dataset into the format required for the BFCL experiments.

```bash
python preprocess.py
```

--------------------------------

### Clone BFCL Repository

Source: https://github.com/modelscope/memoryscope/blob/main/benchmark/bfcl/quickstart.md

Clone the BFCL repository and check out the specific commit used for this setup.

```bash
cd ReMe/benchmark/bfcl
git clone https://github.com/ShishirPatil/gorilla.git
cd gorilla
git checkout ea13468
```

--------------------------------

### Clone ReMe Repository

Source: https://github.com/modelscope/memoryscope/blob/main/docs/cookbook/frozenlake/quickstart.md

Clone the ReMe repository to access the FrozenLake experiment code and follow setup instructions.

```bash
git clone https://github.com/agentscope-ai/ReMe.git
cd ReMe/cookbook/frozenlake
```

--------------------------------

### Agent Tool Selection Logic

Source: https://github.com/modelscope/memoryscope/blob/main/docs/tool_memory/tool_bench.md

Demonstrates how an agent uses an LLM to select the most appropriate tool from a list of available tools based on a given query.

```python
# Agent uses LLM to select appropriate tool
tool_call = await self.select_tool(query, [SearchToolA(), SearchToolB(), SearchToolC()])

# Execute selected tool and record results
result = ToolCallResult(
    create_time=timestamp,
    tool_name=tool_call.name,
    input={"query": query},
    output=content,
    token_cost=token_cost,
    success=success,
    time_cost=time_cost
)
```
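
The fragment above relies on names defined elsewhere in the benchmark (`self.select_tool`, `timestamp`, `content`, and so on). As a self-contained illustration of the recording step, the sketch below defines a stand-in `ToolCallResult` dataclass with the same field names and a hypothetical `search_tool_a` function. The field types, the word-count token proxy, and the helper `run_and_record` are assumptions, not the ReMe implementation.

```python
import time
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ToolCallResult:
    """Illustrative stand-in with the fields used in the snippet above."""
    create_time: str
    tool_name: str
    input: dict
    output: str
    token_cost: int
    success: bool
    time_cost: float


def search_tool_a(query: str) -> str:
    """Hypothetical tool: returns a canned result for the query."""
    return f"results for {query!r}"


def run_and_record(tool, query: str) -> ToolCallResult:
    """Run a tool once and capture the outcome as a ToolCallResult."""
    start = time.perf_counter()
    try:
        content, success = tool(query), True
    except Exception as exc:  # record failures instead of raising
        content, success = str(exc), False
    return ToolCallResult(
        create_time=datetime.now().isoformat(timespec="seconds"),
        tool_name=tool.__name__,
        input={"query": query},
        output=content,
        token_cost=len(content.split()),  # crude word count as a token proxy
        success=success,
        time_cost=time.perf_counter() - start,
    )


record = run_and_record(search_tool_a, "agent memory")
print(record.tool_name, record.success, record.output)
```

Recording failures as data rather than raising keeps unsuccessful calls in the history, which is what later summarization would need in order to learn from them.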

--------------------------------

### Initialize API Configuration

Source: https://github.com/modelscope/memoryscope/blob/main/docs/task_memory/task_memory.md

Sets up the base URL and workspace ID for interacting with the ReMe API.

```python
import requests

# API configuration
BASE_URL = "http://0.0.0.0:8002/"
WORKSPACE_ID = "your_workspace_id"
```
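
Because `BASE_URL` ends with a slash, endpoint names can be joined onto it with `urllib.parse.urljoin`. The sketch below shows one possible pair of helpers. `endpoint_url` and `with_workspace` are hypothetical names, and treating `retrieve_task_memory` as an HTTP endpoint is an assumption made for illustration.

```python
from urllib.parse import urljoin

BASE_URL = "http://0.0.0.0:8002/"
WORKSPACE_ID = "your_workspace_id"


def endpoint_url(name: str) -> str:
    """Join BASE_URL and an endpoint name into a full URL."""
    return urljoin(BASE_URL, name.lstrip("/"))


def with_workspace(payload: dict) -> dict:
    """Return a copy of the payload with workspace_id filled in."""
    return {"workspace_id": WORKSPACE_ID, **payload}


url = endpoint_url("retrieve_task_memory")
body = with_workspace({"query": "Analyze Xiaomi's business model"})
print(url, body)

# Sending the request would then be: requests.post(url, json=body)
```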

--------------------------------

### ReMeLight.__init__

Source: https://context7.com/modelscope/memoryscope/llms.txt

Initializes the ReMeLight file-based memory application, setting up the working directory structure and configuring default settings for LLM, embedding models, and file storage.

```APIDOC
## ReMeLight.__init__ — Initialize file-based memory application

Creates a `ReMeLight` instance and sets up the working directory structure (`memory/`, `tool_result/`, `dialog/`). All paths are created automatically.

```python
import asyncio
from reme.reme_light import ReMeLight

async def main():
    reme = ReMeLight(
        working_dir=".reme",                          # root directory for all data
        default_as_llm_config={"model_name": "qwen3.5-35b-a3b"},
        default_embedding_model_config={"model_name": "text-embedding-v4"},
        default_file_store_config={"fts_enabled": True, "vector_enabled": True},
        vector_weight=0.7,          # weight for vector search in hybrid retrieval
        candidate_multiplier=3.0,   # retrieve 3x candidates before re-ranking
        enable_load_env=True,       # load .env automatically
    )
    await reme.start()
    # ... use reme ...
    await reme.close()

asyncio.run(main())
```
```
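
To make `vector_weight` and `candidate_multiplier` concrete, the sketch below shows one common way such parameters are used in hybrid retrieval: each retriever contributes an enlarged candidate pool, and the scores are blended linearly before the final cut. This is an assumption about the general technique, not a description of ReMeLight's internals. The function `hybrid_rank` and the sample scores are hypothetical.

```python
def hybrid_rank(vector_scores: dict, fts_scores: dict,
                vector_weight: float = 0.7, top_k: int = 2,
                candidate_multiplier: float = 3.0) -> list:
    """Blend per-document scores and return the top_k (doc_id, score) pairs."""
    pool_size = int(top_k * candidate_multiplier)

    def top(scores):
        # Keep only the best `pool_size` candidates from one retriever.
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        return dict(ranked[:pool_size])

    vec, fts = top(vector_scores), top(fts_scores)
    blended = {
        doc: vector_weight * vec.get(doc, 0.0) + (1 - vector_weight) * fts.get(doc, 0.0)
        for doc in vec.keys() | fts.keys()
    }
    return sorted(blended.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


vector_scores = {"a": 0.9, "b": 0.4, "c": 0.1}
fts_scores = {"b": 1.0, "c": 0.8}
ranking = hybrid_rank(vector_scores, fts_scores)
print([doc for doc, _ in ranking])
```

With these sample scores, document `a` ranks first on vector similarity alone, while `b` overtakes `c` because its full-text score outweighs its lower vector score.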

--------------------------------

### Initialize MCP Client for Memory-Augmented Agent

Source: https://github.com/modelscope/memoryscope/blob/main/docs/mcp_quick_start.md

Imports and environment-variable loading needed to set up an MCP client for a memory-augmented agent.

```python
import json
import asyncio
from fastmcp import Client
from dotenv import load_dotenv

# Load environment variables
load_dotenv()
```

--------------------------------

### Configure QdrantVectorStore (Cloud)

Source: https://github.com/modelscope/memoryscope/blob/main/docs/vector_store_api_guide.md

Configuration for Qdrant Cloud. Requires `url` and `api_key`.

```yaml
vector_store:
  default:
    backend: qdrant
    embedding_model: default
    params:
      url: "https://your-cluster.qdrant.io:6333"  # Qdrant Cloud URL
      api_key: "your-api-key-here"                 # API key
      distance: "COSINE"
```

```shell
vector_store.default.backend=qdrant
vector_store.default.params.url=https://your-cluster.qdrant.io:6333
vector_store.default.params.api_key=your-api-key-here
vector_store.default.params.distance=COSINE
```
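
The YAML and shell forms above express the same settings: each nested YAML key becomes one dot-separated path on the command line. The sketch below shows that mapping with a small flattening function. `to_overrides` is a hypothetical helper written for illustration and is not part of ReMe. The `embedding_model: default` entry is left out to match the shell block.

```python
def to_overrides(config: dict, prefix: str = "") -> list:
    """Flatten a nested config dict into `dotted.key=value` strings."""
    overrides = []
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            overrides.extend(to_overrides(value, path))
        else:
            overrides.append(f"{path}={value}")
    return overrides


config = {
    "vector_store": {
        "default": {
            "backend": "qdrant",
            "params": {
                "url": "https://your-cluster.qdrant.io:6333",
                "api_key": "your-api-key-here",
                "distance": "COSINE",
            },
        }
    }
}

for line in to_overrides(config):
    print(line)
```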

--------------------------------

### Run Tool Memory Demo

Source: https://github.com/modelscope/memoryscope/blob/main/docs/tool_memory/tool_memory.md

Execute this command to run a complete Python demo script that showcases the Tool Memory lifecycle, including workspace management, tool call recording, summarization, retrieval, and persistence.

```bash
cd cookbook/simple_demo
python use_tool_memory_demo.py
```

--------------------------------

### Start ReMe MCP Server with STDIO Transport

Source: https://github.com/modelscope/memoryscope/blob/main/docs/mcp_quick_start.md

Launch the ReMe MCP server using the STDIO transport protocol. This is recommended for direct integration with MCP clients.

```bash
reme \
  backend=mcp \
  mcp.transport=stdio \
  llm.default.model_name=qwen3-30b-a3b-thinking-2507 \
  embedding_model.default.model_name=text-embedding-v4 \
  vector_store.default.backend=local
```

--------------------------------

### Execute Quick Benchmark Test

Source: https://github.com/modelscope/memoryscope/blob/main/docs/tool_memory/tool_bench.md

Runs a shortened version of the benchmark with fewer queries per epoch (15+15) and a reduced number of epochs, by changing the arguments of the `main()` call. Useful for rapid testing.

```bash
# Quick test (3 epochs, 15+15 queries per epoch)
# Modify main() call: main(test_mode=True, run_epoch=3)
```

--------------------------------

### Memory Retrieval: Get personal memory fragments

Source: https://github.com/modelscope/memoryscope/blob/main/docs/quick_start.md

This operation retrieves personal memory fragments based on a query. It allows you to fetch relevant information from the user's past interactions.

```APIDOC
## POST /retrieve_personal_memory

### Description
Retrieves personal memory fragments based on a query.

### Method
POST

### Endpoint
/retrieve_personal_memory

### Parameters
#### Request Body
- **workspace_id** (string) - Required - The ID of the workspace.
- **query** (string) - Required - The query to retrieve memories.
- **top_k** (integer) - Required - The number of top results to return.
```
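
A request body for this endpoint can be assembled and checked locally before it is sent. The sketch below models the three documented fields as a dataclass. The class name `RetrievePersonalMemoryRequest` and the validation rules are assumptions for illustration, since the server's own validation is not described here.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class RetrievePersonalMemoryRequest:
    """Request body for POST /retrieve_personal_memory."""
    workspace_id: str
    query: str
    top_k: int

    def __post_init__(self):
        # Local sanity checks only; the server may apply different rules.
        if not self.workspace_id or not self.query:
            raise ValueError("workspace_id and query must be non-empty")
        if self.top_k < 1:
            raise ValueError("top_k must be a positive integer")

    def to_json(self) -> str:
        """Serialize the request body as a JSON string."""
        return json.dumps(asdict(self))


request_body = RetrievePersonalMemoryRequest(
    workspace_id="your_workspace_id",
    query="What does the user like to drink?",
    top_k=3,
).to_json()
print(request_body)
```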

--------------------------------

### ReMe.__init__

Source: https://context7.com/modelscope/memoryscope/llms.txt

Initializes a ReMe instance, setting up the vector-based memory application with configurable components for different memory types and user profiles.

```APIDOC
## ReMe.__init__ — Initialize vector-based memory application

Creates a `ReMe` instance backed by a configurable vector store. Supports personal, procedural, and tool memory with optional user profile management.

### Parameters

- **working_dir** (str) - Directory for storing memory data.
- **default_llm_config** (dict) - Configuration for the default Language Model.
- **default_embedding_model_config** (dict) - Configuration for the default embedding model.
- **default_vector_store_config** (dict) - Configuration for the default vector store.
- **target_user_names** (list[str], optional) - List of user names to pre-register for personal memory.
- **target_task_names** (list[str], optional) - List of task names to pre-register for procedural memory.
- **enable_profile** (bool) - Whether to enable user profile management.
- **profile_backend** (str) - Backend for profile storage ('filesystem' or 'vector').

### Example
```python
import asyncio
from reme import ReMe

async def main():
    reme = ReMe(
        working_dir=".reme",
        default_llm_config={
            "backend": "openai",
            "model_name": "qwen3.5-plus",
        },
        default_embedding_model_config={
            "backend": "openai",
            "model_name": "text-embedding-v4",
            "dimensions": 1024,
        },
        default_vector_store_config={
            "backend": "local",  # local | memory | chroma | qdrant | elasticsearch | obvec | zvec | hologres
        },
        target_user_names=["alice"],   # pre-register personal memory targets
        target_task_names=["coding"],  # pre-register procedural memory targets
        enable_profile=True,
        profile_backend="filesystem",  # filesystem | vector
    )
    await reme.start()
    # ... use reme ...
    await reme.close()

asyncio.run(main())
```
```

--------------------------------

### Clone Repository and Download Data

Source: https://github.com/modelscope/memoryscope/blob/main/benchmark/longmemeval/quickstart.md

From a checkout of the repository, changes into the LongMemEval benchmark directory, creates a data directory, and downloads the three dataset files with wget.

```bash
cd ./benchmark/longmemeval
mkdir -p data/
cd data/
wget https://huggingface.co/datasets/xiaowu0162/longmemeval-cleaned/resolve/main/longmemeval_oracle.json
wget https://huggingface.co/datasets/xiaowu0162/longmemeval-cleaned/resolve/main/longmemeval_s_cleaned.json
wget https://huggingface.co/datasets/xiaowu0162/longmemeval-cleaned/resolve/main/longmemeval_m_cleaned.json
cd ..
```

--------------------------------

### Memory-Augmented Agent Workflow with MCP Client

Source: https://context7.com/modelscope/memoryscope/llms.txt

Demonstrates a full workflow using the MCP Python client to analyze business models. It includes running an agent, summarizing conversation history into task memories, retrieving relevant memories for a new query, and then running a memory-augmented query for a final answer. Requires the fastmcp library and a running MCP server.

```python
import json
import asyncio
from fastmcp import Client

MCP_URL = "http://0.0.0.0:8002/sse/"
WORKSPACE_ID = "research_workspace"


def load_payload(result) -> dict:
    """Parse a tool result as JSON.

    Assumes the tool returns its JSON payload as the text of the first content block.
    """
    return json.loads(result.content[0].text)


async def memory_augmented_workflow():
    async with Client(MCP_URL) as client:
        # 1. Run agent on first query to build conversation history
        result = await client.call_tool("react", arguments={"query": "Analyze Tesla's business model"})
        messages = load_payload(result).get("messages", [])

        # 2. Summarize conversation into task memories
        result = await client.call_tool(
            "summary_task_memory",
            arguments={
                "workspace_id": WORKSPACE_ID,
                "trajectories": [{"messages": messages, "score": 1.0}],
            },
        )
        memory_list = load_payload(result).get("metadata", {}).get("memory_list", [])
        print(f"Stored {len(memory_list)} memories")

        # 3. Retrieve relevant memories for a related query
        result = await client.call_tool(
            "retrieve_task_memory",
            arguments={"workspace_id": WORKSPACE_ID, "query": "Analyze Xiaomi's business model"},
        )
        retrieved = load_payload(result).get("answer", "")

        # 4. Run memory-augmented query
        augmented_query = f"{retrieved}\n\nUser Question:\nAnalyze Xiaomi's business model"
        result = await client.call_tool("react", arguments={"query": augmented_query})
        final_answer = ""
        for msg in load_payload(result).get("messages", []):
            if msg.get("role") == "assistant":
                final_answer = msg.get("content", "")
                break
        print(f"Answer: {final_answer[:200]}")


asyncio.run(memory_augmented_workflow())
```