# MLIT Data Platform MCP Server

The MLIT Data Platform MCP Server is a Model Context Protocol (MCP) server that enables Large Language Models (LLMs) to interact with Japan's Ministry of Land, Infrastructure, Transport and Tourism (MLIT) Data Platform. This server provides a bridge between AI assistants like Claude and the comprehensive geospatial data repository containing infrastructure, transportation, and urban planning datasets across Japan.

The server wraps the MLIT GraphQL API and exposes 17 specialized tools for searching, retrieving, and downloading data. Key capabilities include keyword search with spatial filtering (rectangle- and radius-based), attribute-based metadata queries, bulk data retrieval with pagination, and file download URL generation. The server handles authentication, rate limiting, retry logic with exponential backoff, and automatic normalization of Japanese prefecture and municipality codes from various input formats (kanji, romaji, or numeric codes).

## Installation and Setup

Configure the MCP server in your Claude Desktop configuration file:

```json
{
  "mcpServers": {
    "mlit-dpf-mcp": {
      "command": "/path/to/mlit-dpf-mcp/.venv/bin/python",
      "args": ["/path/to/mlit-dpf-mcp/src/server.py"],
      "env": {
        "MLIT_API_KEY": "your_api_key_here",
        "MLIT_BASE_URL": "https://data-platform.mlit.go.jp/api/v1/",
        "PYTHONUNBUFFERED": "1",
        "LOG_LEVEL": "WARNING"
      }
    }
  }
}
```

## search

Search for data using keywords with optional sorting and pagination. Supports full-text search with phrase matching and synonym expansion across the entire MLIT data catalog.
```python
# Basic keyword search for bridges
await client.search_keyword(
    term="橋梁",
    first=0,
    size=50,
    phrase_match=True
)

# Search with sorting by update date (descending)
await client.search_keyword(
    term="バス停",
    sort_attribute_name="DPF:updated_at",
    sort_order="dsc",
    first=0,
    size=100
)

# Response structure:
{
    "search": {
        "totalNumber": 1523,
        "searchResults": [
            {
                "id": "abc123-def456",
                "title": "国道1号橋梁データ",
                "lat": 35.6812,
                "lon": 139.7671,
                "year": 2023,
                "dataset_id": "road_bridges",
                "catalog_id": "mlit_infrastructure"
            }
        ]
    }
}
```

## search_by_location_rectangle

Search for data within a rectangular geographic boundary defined by northwest and southeast coordinates. Useful for finding all datasets within a specific region.

```python
# Search for bridges in the Tokyo metropolitan area
data = await client.search_by_rectangle(
    tl_lat=35.80,   # Northwest latitude
    tl_lon=139.55,  # Northwest longitude
    br_lat=35.60,   # Southeast latitude
    br_lon=139.85,  # Southeast longitude
    term="橋梁",
    first=0,
    size=50,
    phrase_match=True
)

# A reusable rectangle filter can also be built separately:
location_filter = client.make_rectangle_filter(
    tl_lat=35.80,
    tl_lon=139.55,
    br_lat=35.60,
    br_lon=139.85
)

# Response includes geospatial data within bounds:
{
    "search": {
        "totalNumber": 342,
        "searchResults": [
            {
                "id": "bridge-001",
                "title": "隅田川橋梁群",
                "lat": 35.7102,
                "lon": 139.7845,
                "dataset_id": "tokyo_bridges"
            }
        ]
    }
}
```

## search_by_location_point_distance

Search for data within a circular area defined by a center point and a radius in meters. Ideal for proximity-based searches around landmarks or addresses.
```python
# Find bus stops within 500m of Tokyo Station
data = await client.search_by_point(
    lat=35.681236,   # Tokyo Station latitude
    lon=139.767125,  # Tokyo Station longitude
    distance_m=500,  # 500 meter radius
    term="バス停",
    first=0,
    size=50,
    phrase_match=True
)

# Find all road data within 5km of a point
data = await client.search_by_point(
    lat=35.68,
    lon=139.75,
    distance_m=5000,
    term="道路",
    size=100
)

# Response:
{
    "search": {
        "totalNumber": 28,
        "searchResults": [
            {
                "id": "busstop-tokyo-001",
                "title": "東京駅八重洲口バス停",
                "lat": 35.6803,
                "lon": 139.7682
            }
        ]
    }
}
```

## search_by_attribute

Search using metadata attributes such as dataset ID, prefecture code, municipality code, or year. Supports precise filtering with the GraphQL attributeFilter syntax.

```python
# Search by dataset ID
data = await client.search_by_attribute_raw(
    attribute_name="DPF:dataset_id",
    attribute_value="mlit-plateau-2023",
    term="",
    first=0,
    size=50,
    phrase_match=True
)

# Search by prefecture code (Tokyo = 13)
data = await client.search_by_attribute_raw(
    attribute_name="DPF:prefecture_code",
    attribute_value="13",
    term="橋梁",
    size=100
)

# Search by year
data = await client.search_by_attribute_raw(
    attribute_name="DPF:year",
    attribute_value="2023",
    term=""
)

# Response:
{
    "search": {
        "totalNumber": 156,
        "searchResults": [
            {
                "id": "data-001",
                "title": "PLATEAU 3D都市モデル",
                "dataset_id": "mlit-plateau-2023",
                "catalog_id": "plateau"
            }
        ]
    }
}
```

## get_data

Retrieve detailed information for a specific data record using a dataset ID and data ID. Returns complete metadata, file attachments, and tileset information.
```python
# Get full data details
data = await client.get_data(
    dataset_id="cals_construction",
    data_id="8fb65cb6-a7e3-4b15-bf17-1c71be572a9f"
)

# Response with full metadata and files:
{
    "data": {
        "totalNumber": 1,
        "getDataResults": [
            {
                "id": "8fb65cb6-a7e3-4b15-bf17-1c71be572a9f",
                "title": "道路橋梁点検データ 2023年度",
                "metadata": {
                    "DPF:year": 2023,
                    "DPF:prefecture_code": "13",
                    "DPF:address": "東京都千代田区"
                },
                "files": [
                    {"id": "file-001", "original_path": "INDEX_C.XML"},
                    {"id": "file-002", "original_path": "data/bridge_data.csv"}
                ],
                "hasThumbnail": True,
                "tileset": {
                    "url": "https://example.com/tileset.json",
                    "altitude_offset_meters": 0
                }
            }
        ]
    }
}
```

## get_data_summary

Retrieve lightweight summary information (ID and title only) for a specific data record. Use this for quick lookups before fetching full details.

```python
# Get basic info only
data = await client.get_data_summary(
    dataset_id="cals_construction",
    data_id="8fb65cb6-a7e3-4b15-bf17-1c71be572a9f"
)

# Response:
{
    "data": {
        "totalNumber": 1,
        "getDataResults": [
            {
                "id": "8fb65cb6-a7e3-4b15-bf17-1c71be572a9f",
                "title": "道路橋梁点検データ 2023年度"
            }
        ]
    }
}
```

## get_data_catalog

Retrieve data catalog information including all datasets and their data counts. Supports filtering by catalog IDs and controlling dataset inclusion.
```python
# Get all catalogs with their datasets
data = await client.get_data_catalog(
    ids=None,  # None = all catalogs
    minimal=False,
    include_datasets=True
)

# Get specific catalogs only
data = await client.get_data_catalog(
    ids=["cals", "rsdb", "plateau"],
    include_datasets=True
)

# Response:
{
    "dataCatalog": [
        {
            "id": "cals",
            "title": "CALS電子納品データ",
            "datasets": [
                {"id": "cals_construction", "title": "工事完成図書", "data_count": 45230},
                {"id": "cals_survey", "title": "測量成果", "data_count": 12500}
            ]
        },
        {
            "id": "plateau",
            "title": "Project PLATEAU",
            "datasets": [
                {"id": "plateau_building", "title": "建物モデル", "data_count": 230000}
            ]
        }
    ]
}
```

## get_data_catalog_summary

Retrieve a lightweight list of all catalog IDs and titles without dataset details.

```python
# Get catalog overview
data = await client.get_data_catalog_summary()

# Response:
{
    "dataCatalog": [
        {"id": "cals", "title": "CALS電子納品データ"},
        {"id": "rsdb", "title": "道路施設データベース"},
        {"id": "plateau", "title": "Project PLATEAU"},
        {"id": "dimaps", "title": "デジタル道路地図"}
    ]
}
```

## get_all_data

Bulk retrieve large datasets using batch processing with automatic pagination. Supports spatial and attribute filters with configurable batch limits.
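The automatic pagination can be pictured as a simple offset loop that requests one batch at a time until the result set is exhausted or a cap is reached. A minimal sketch, assuming a `fetch_page` callable that stands in for a single search request; the names here are illustrative, not the server's actual internals:

```python
def collect_all(fetch_page, size=1000, max_batches=10, max_items=None):
    """Accumulate search results batch by batch until the API is
    exhausted or a batch/item cap is reached."""
    items = []
    for batch in range(max_batches):
        # Each batch advances the offset by one page.
        page = fetch_page(first=batch * size, size=size)
        results = page["search"]["searchResults"]
        items.extend(results)
        # Stop early once an explicit item cap is hit.
        if max_items is not None and len(items) >= max_items:
            return items[:max_items]
        # Stop when the API reports no more results.
        if not results or len(items) >= page["search"]["totalNumber"]:
            break
    return items
```

The same shape applies whether the filter is a keyword, an attribute, or a rectangle: only the `fetch_page` call changes.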
```python
from src.schemas import GetAllDataInput

# Bulk retrieve with filters
params = GetAllDataInput(
    size=1000,  # Max per batch (API limit)
    term="",
    dataset_id="mlit-001",
    prefecture_code="13",
    max_batches=10,  # Up to 10,000 items
    include_metadata=True
)
data = await client.get_all_data_collect(params, max_items=5000)

# Bulk retrieve with rectangle filter
params = GetAllDataInput(
    size=1000,
    catalog_id="dimaps",
    location_rectangle_top_left_lat=35.80,
    location_rectangle_top_left_lon=139.55,
    location_rectangle_bottom_right_lat=35.60,
    location_rectangle_bottom_right_lon=139.85,
    max_batches=5
)
data = await client.get_all_data_collect(params)

# Response:
{
    "batches": 5,
    "count": 4523,
    "items": [
        {
            "id": "item-001",
            "title": "データタイトル",
            "metadata": {"DPF:year": 2023, "DPF:prefecture_code": "13"}
        }
    ]
}
```

## get_count_data

Get data counts with optional aggregation by attribute or dataset. Useful for understanding data distribution before bulk retrieval.

```python
from src.schemas import CountDataInput

# Count by dataset
params = CountDataInput(
    term="橋梁",
    slice_setting={"type": "dataset"}
)
data = await client.count_data(params)

# Count by prefecture with sub-aggregation by year
params = CountDataInput(
    term="",
    dataset_id="road_bridges",
    slice_setting={
        "type": "attribute",
        "attributeSliceSetting": {
            "attributeName": "DPF:prefecture_code",
            "size": 10,
            "subSliceSetting": {
                "attributeName": "DPF:year",
                "size": 5
            }
        }
    }
)
data = await client.count_data(params)

# Response with hierarchical counts:
{
    "countData": {
        "dataCount": 125000,
        "slices": [
            {
                "attributeName": "DPF:prefecture_code",
                "attributeValue": "13",
                "dataCount": 15230,
                "slices": [
                    {"attributeName": "DPF:year", "attributeValue": "2023", "dataCount": 3200},
                    {"attributeName": "DPF:year", "attributeValue": "2022", "dataCount": 2800}
                ]
            }
        ]
    }
}
```

## get_suggest

Get keyword suggestions based on partial input. Useful for autocomplete functionality and for discovering related search terms.
```python
from src.schemas import SuggestInput

# Basic suggestions
params = SuggestInput(
    term="川",
    phrase_match=True
)
data = await client.suggest(params)

# Suggestions filtered by dataset
params = SuggestInput(
    term="橋",
    dataset_id="cals_construction",
    phrase_match=True
)
data = await client.suggest(params)

# Response:
{
    "suggest": {
        "totalNumber": 15,
        "suggestions": [
            {"name": "河川", "cnt": 45230},
            {"name": "川河川", "cnt": 12500},
            {"name": "河川護岸", "cnt": 8900}
        ]
    }
}
```

## get_prefecture_data

Retrieve the complete list of Japan's 47 prefectures with their codes and names.

```python
# Get all prefectures
data = await client.get_prefectures()

# Response:
{
    "prefecture": [
        {"code": 1, "name": "北海道"},
        {"code": 2, "name": "青森県"},
        {"code": 13, "name": "東京都"},
        {"code": 27, "name": "大阪府"},
        {"code": 47, "name": "沖縄県"}
    ]
}
```

## get_municipality_data

Retrieve municipality (city/ward/town/village) information filtered by prefecture or municipality codes.

```python
# Get all municipalities in Tokyo
data = await client.get_municipalities(
    pref_codes=["13"],
    fields=["code_as_string", "prefecture_code", "name", "katakana"]
)

# Get specific municipalities by code
data = await client.get_municipalities(
    muni_codes=["13101", "13102", "13103"]
)

# Response:
{
    "municipalities": [
        {"code_as_string": "13101", "prefecture_code": "13", "name": "千代田区"},
        {"code_as_string": "13102", "prefecture_code": "13", "name": "中央区"},
        {"code_as_string": "13103", "prefecture_code": "13", "name": "港区"}
    ]
}
```

## normalize_codes

Normalize various input formats for prefecture and municipality names/codes. Accepts Japanese, romaji, or numeric codes and returns canonical values.
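Conceptually, this kind of normalization can be implemented as a lookup table indexed by every accepted spelling of each prefecture: kanji name, short kanji form, lowercase romaji, and zero-padded numeric code. A minimal sketch with a two-entry table — the real tool covers all 47 prefectures plus municipalities, and the structure here is purely illustrative:

```python
# Each accepted key maps to the canonical (code, name) pair.
PREFECTURES = {
    "13": ("13", "東京都"),
    "東京都": ("13", "東京都"),
    "東京": ("13", "東京都"),
    "tokyo": ("13", "東京都"),
    "27": ("27", "大阪府"),
    "大阪府": ("27", "大阪府"),
    "大阪": ("27", "大阪府"),
    "osaka": ("27", "大阪府"),
}

def normalize_prefecture(value: str):
    """Return (code, canonical_name), or None when nothing matches."""
    key = value.strip().lower()
    # Numeric input such as "13" normalizes to a zero-padded code.
    if key.isdigit():
        key = key.zfill(2)
    return PREFECTURES.get(key)
```

With this shape, `normalize_prefecture("Tokyo")`, `normalize_prefecture("東京")`, and `normalize_prefecture("13")` all resolve to the same canonical pair.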
```python
from src.schemas import NormalizeCodesInput

# Normalize from romaji
params = NormalizeCodesInput(prefecture="Tokyo", municipality=None)
result = await client.normalize_codes(params)
# Returns: prefecture_code="13", prefecture_name="東京都"

# Normalize from Japanese
params = NormalizeCodesInput(prefecture="大阪", municipality="堺市")
result = await client.normalize_codes(params)
# Returns: prefecture_code="27", municipality_code="27140"

# Normalize from numeric code
params = NormalizeCodesInput(prefecture="13", municipality="13101")
result = await client.normalize_codes(params)

# Response structure:
{
    "prefecture_code": "13",
    "prefecture_name": "東京都",
    "municipality_code": "13101",
    "municipality_name": "千代田区",
    "candidates": [],
    "warnings": [],
    "normalization_meta": {
        "input_prefecture": "Tokyo",
        "matched_strategy": "pref:romaji"
    }
}
```

## get_mesh

Retrieve mesh-based statistical data (such as population census data) for specific mesh codes.

```python
# Get 250m mesh population data
data = await client.get_mesh(
    dataset_id="dpf_population_data",
    data_id="8fb65cb6-a7e3-4b15-bf17-1c71be572a9f",
    mesh_id="national_sensus_250m_r2",
    mesh_code="5339452932"
)

# Response (mesh-specific data):
{
    "mesh": {
        "meshCode": "5339452932",
        "totalPopulation": 1523,
        "households": 856
    }
}
```

## get_file_download_urls

Generate temporary download URLs for data files. URLs for MLIT-hosted files expire after 60 seconds.
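Because the URLs are short-lived, clients should generate them immediately before use rather than caching them. A hypothetical helper that regenerates a URL once its 60-second window may have passed — `generate` stands in for a call such as `client.file_download_urls(...)` and is not part of the server's API:

```python
import time

class ExpiringURL:
    """Wrap a URL-generating callable and regenerate on demand
    once the previous URL may have expired."""

    TTL_SECONDS = 60

    def __init__(self, generate):
        self._generate = generate
        self._url = None
        self._issued_at = 0.0

    def get(self):
        # Regenerate if we have no URL yet, or the old one is stale.
        now = time.monotonic()
        if self._url is None or now - self._issued_at >= self.TTL_SECONDS:
            self._url = self._generate()
            self._issued_at = now
        return self._url
```

Within the 60-second window, repeated `get()` calls reuse the same URL; after that, the next call transparently requests a fresh one.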
```python
from src.schemas import FileRef

# Get URLs for specific files
files = [
    FileRef(id="file-001", original_path="INDEX_C.XML"),
    FileRef(id="file-002", original_path="data/bridge.csv")
]
data = await client.file_download_urls(files=files)

# Or get all files for a data record
data = await client.file_download_urls_from_data(
    dataset_id="cals_construction",
    data_id="8fb65cb6-a7e3-4b15-bf17-1c71be572a9f"
)

# Response:
{
    "fileDownloadURLs": [
        {
            "ID": "file-001",
            "URL": "https://data-platform.mlit.go.jp/download/abc123?token=xyz"
        },
        {
            "ID": "file-002",
            "URL": "https://data-platform.mlit.go.jp/download/def456?token=xyz"
        }
    ]
}
```

## get_zipfile_download_url

Generate a temporary URL to download multiple files as a single ZIP archive. The URL expires after 60 seconds.

```python
# Create ZIP of multiple files
files = [
    FileRef(id="file-001", original_path="model_a.ifc"),
    FileRef(id="file-002", original_path="model_b.ifc"),
    FileRef(id="file-003", original_path="model_c.ifc")
]
data = await client.zipfile_download_url(files=files)

# Or ZIP all files for a data record
data = await client.zipfile_download_url_from_data(
    dataset_id="cals_construction",
    data_id="8fb65cb6-a7e3-4b15-bf17-1c71be572a9f"
)

# Response:
{
    "zipfileDownloadURL": "https://data-platform.mlit.go.jp/download/zip/abc123?token=xyz"
}
```

## get_thumbnail_urls

Generate temporary URLs for data thumbnail images. Useful for previewing datasets.

```python
# Get thumbnail URLs for a data record
data = await client.thumbnail_urls_from_data(
    dataset_id="ndm",
    data_id="8fb65cb6-a7e3-4b15-bf17-1c71be572a9f"
)

# Response:
{
    "thumbnailURLs": [
        {
            "ID": "thumb-001",
            "URL": "https://data-platform.mlit.go.jp/download/thumb/abc123?token=xyz"
        }
    ]
}
```

## Summary

The MLIT Data Platform MCP Server enables conversational access to Japan's comprehensive infrastructure and geospatial data.
Primary use cases include searching for infrastructure data (bridges, roads, buildings) using keywords or spatial queries, exploring the PLATEAU 3D city model dataset, accessing road facility inspection records, and downloading associated files. The server's normalize_codes tool simplifies working with Japanese geographic codes by accepting various input formats.

Integration with LLMs follows the MCP protocol, allowing AI assistants to autonomously search, filter, and retrieve government data based on natural language requests. The typical workflow involves normalizing location input, searching with spatial or attribute filters, retrieving detailed metadata, and generating download URLs. The server handles pagination for large result sets and includes built-in rate limiting and retry logic for reliable API interactions.