# edinet

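edinet is a Python library for retrieving and parsing financial information from EDINET (Electronic Disclosure for Investors' NETwork), the electronic disclosure system operated by Japan's Financial Services Agency. It combines an EDINET API client with an XBRL parser and extracts financial data in structured form from disclosure documents such as annual securities reports (有価証券報告書), quarterly reports, and amendment reports. It supports three accounting standards: J-GAAP (Japanese GAAP), IFRS (International Financial Reporting Standards), and US-GAAP.

The main features are document listing, company search, automatic construction of XBRL financial statements, extraction of financial values across accounting standards via canonical keys, amendment-report chain management, text block extraction, segment-level data analysis, and diff comparison. The library also provides a disk cache for fast re-fetching, an asynchronous API, and export to pandas DataFrame.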

## Configuration and initialization

Before using the library, set your EDINET API key and taxonomy path. The API key is issued after you create an account on the official EDINET site.

```python
import edinet

# Set the API key and taxonomy path
edinet.configure(
    api_key="YOUR_API_KEY",
    taxonomy_path="/path/to/ALL_20251101",
    cache_dir="~/.cache/edinet",  # enable the disk cache
    timeout=60.0,                  # timeout in seconds
    max_retries=3,                 # number of retries
)

# Install the taxonomy automatically (first run only)
edinet.install_taxonomy(year=2026)  # install the 2026 edition
info = edinet.taxonomy_info()
print(f"Taxonomy: {info.path}")
# => Taxonomy: /Users/xxx/Library/Application Support/edinet/ALL_20251101
```

## Listing documents (documents)

Retrieves the documents filed on a given date or over a date range as a list of Filing objects. Results can be filtered by document type or filer.

```python
import edinet
from datetime import date

# List the documents filed on a single day
filings = edinet.documents(date="2025-03-31")
print(f"{len(filings)} documents were filed")

# Date range + document type filter
filings = edinet.documents(
    start="2025-01-01",
    end="2025-03-31",
    doc_type="有価証券報告書",  # annual securities report; DocType.ANNUAL_SECURITIES_REPORT or "120" also work
)

# Documents for a specific company
filings = edinet.documents(
    start=date(2025, 1, 1),
    end=date(2025, 3, 31),
    edinet_code="E02144",  # Toyota Motor
)

# Async version
async def fetch_documents():
    filings = await edinet.adocuments(date="2025-03-31")
    return filings

# Attributes of a Filing object
for f in filings[:3]:
    print(f"doc_id={f.doc_id}, filer={f.filer_name}, type={f.doc_type_label_ja}")
    print(f"  submitted: {f.submit_date_time}, period: {f.period_start} ~ {f.period_end}")
    print(f"  has_xbrl={f.has_xbrl}, has_pdf={f.has_pdf}")
```

## Company search (Company)

Searches for companies by name, EDINET code, securities code, or industry and returns Company objects.

```python
from edinet import Company

# Search by company name (partial match)
results = Company.search("トヨタ", limit=10)
for c in results:
    print(f"{c.edinet_code}: {c.name_ja} ({c.ticker})")
# => E02144: トヨタ自動車株式会社 (7203)

# Get a Company from its EDINET code
toyota = Company.from_edinet_code("E02144")
print(toyota.name_ja)  # => トヨタ自動車株式会社

# Get a Company from its securities code
sony = Company.from_sec_code("6758")  # 4-digit and 5-digit codes both work
print(sony.edinet_code)  # => E01777

# Search by industry
auto_makers = Company.by_industry("輸送用機器", limit=50)  # transportation equipment
print(f"{len(auto_makers)} transportation equipment makers")

# All listed companies
all_listed = Company.all_listed()
print(f"{len(all_listed)} listed companies")

# Get filings from a Company
filings = toyota.get_filings(
    start="2024-01-01",
    end="2025-03-31",
    doc_type="有価証券報告書",
)

# Get the latest annual securities report
latest_filing = toyota.latest(doc_type="有価証券報告書")
print(f"Latest annual report: {latest_filing.doc_id} ({latest_filing.filing_date})")
```

## XBRL financial statements (xbrl / Statements)

Parses the XBRL of a Filing and returns the income statement, balance sheet, and cash flow statement as structured FinancialStatement objects.

```python
import edinet

# Fetch filings and parse the XBRL
filings = edinet.documents(date="2025-06-27", doc_type="有価証券報告書")
filing = filings[0]

# Run the XBRL pipeline (ZIP download → parse → structure)
stmts = filing.xbrl()  # taxonomy_path was already set via configure()

# Income statement (consolidated, current period)
pl = stmts.income_statement(consolidated=True, period="current")
print(pl)
# 連結損益計算書 (2024-04-01 ～ 2025-03-31)
# ────────────────────────────────────────
# 売上高                    12,345,678,000,000
# 売上原価                   9,876,543,000,000
# 売上総利益                 2,469,135,000,000
# ...

# Balance sheet (consolidated, end of current period)
bs = stmts.balance_sheet(consolidated=True, period="current")

# Cash flow statement
cf = stmts.cash_flow_statement(consolidated=True)

# Prior-period comparison
pl_prior = stmts.income_statement(period="prior")
print(f"Prior-period net sales: {pl_prior['売上高'].value:,}")

# Non-consolidated statements
pl_non_cons = stmts.income_statement(consolidated=False)

# Search for line items
items = stmts.search("利益")  # "profit"
for item in items:
    print(f"{item.label_ja.text}: {item.value:,}")

# dict-like access
revenue = stmts["売上高"]  # net sales
print(f"Net sales: {revenue.value:,} ({revenue.local_name})")

# Async version
async def parse_xbrl():
    stmts = await filing.axbrl()
    return stmts.income_statement()
```

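## Canonical key extraction (extract_values / CK)

Extracts financial values by canonical key, without regard to the accounting standard (J-GAAP / IFRS / US-GAAP). Values are retrieved uniformly from both the summary information and the financial statements themselves.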

```python
from edinet import extract_values, extracted_to_dict
from edinet.financial.standards.canonical_keys import CK

# Extract values from Statements by canonical key
result = extract_values(
    stmts,
    [CK.REVENUE, CK.OPERATING_INCOME, CK.NET_INCOME, CK.TOTAL_ASSETS],
    period="current",
    consolidated=True,
)

# Inspect the results
for key, ev in result.items():
    if ev is not None:
        print(f"{key}: {ev.value:,} (source: {ev.mapper_name})")
# => revenue: 12,345,678,000,000 (source: statement_mapper)
# => operating_income: 1,234,567,000,000 (source: summary_mapper)

# Convert to a dict for use with pandas
row = extracted_to_dict(result)
# => {'revenue': Decimal('12345678000000'), 'operating_income': Decimal('1234567000000'), ...}

# Extract every mappable item (keys=None)
all_values = extract_values(stmts, period="current", consolidated=True)

# Use custom mappers
from edinet import calc_mapper, definition_mapper

result = extract_values(
    stmts,
    [CK.REVENUE, CK.GROSS_PROFIT],
    mapper=[definition_mapper(), calc_mapper()],  # use only the specified mappers
)

# Available canonical keys (CK enum members)
# PL: REVENUE, COST_OF_SALES, GROSS_PROFIT, SGA_EXPENSES, OPERATING_INCOME,
#     ORDINARY_INCOME, NET_INCOME, NET_INCOME_PARENT, ...
# BS: CASH_AND_DEPOSITS, TRADE_RECEIVABLES, INVENTORIES, TOTAL_ASSETS,
#     CURRENT_LIABILITIES, TOTAL_LIABILITIES, CAPITAL_STOCK, NET_ASSETS, ...
# CF: DEPRECIATION_CF, OPERATING_CF, INVESTING_CF, FINANCING_CF, ...
```
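To make the idea of a canonical key concrete, here is a minimal standalone sketch of how one key could resolve to a different concept name under each accounting standard. It does not use edinet and is not the library's implementation: `Key`, `CONCEPT_MAP`, `lookup()`, and the concept names in the table are all hypothetical and chosen only for illustration.

```python
from decimal import Decimal
from enum import Enum


class Key(str, Enum):
    """Hypothetical canonical keys (stand-ins for CK members)."""
    REVENUE = "revenue"
    TOTAL_ASSETS = "total_assets"


# Hypothetical mapping: canonical key -> candidate concept names, in priority order.
# The concept names are illustrative, not taken from the real taxonomy.
CONCEPT_MAP = {
    Key.REVENUE: ["NetSales", "RevenueIFRS", "Revenues"],
    Key.TOTAL_ASSETS: ["Assets", "AssetsIFRS"],
}


def lookup(facts, keys):
    """Return {key value: amount or None}, using the first concept found in facts."""
    result = {}
    for key in keys:
        result[key.value] = next(
            (facts[name] for name in CONCEPT_MAP[key] if name in facts),
            None,
        )
    return result


# Two filings that report revenue under different concept names
jgaap_facts = {"NetSales": Decimal("1000"), "Assets": Decimal("5000")}
ifrs_facts = {"RevenueIFRS": Decimal("1200")}

jgaap_row = lookup(jgaap_facts, [Key.REVENUE, Key.TOTAL_ASSETS])
ifrs_row = lookup(ifrs_facts, [Key.REVENUE, Key.TOTAL_ASSETS])
print(jgaap_row)
print(ifrs_row)
```

Both rows share the same keys, so they can be stacked into one table regardless of the standard each filer uses. A key with no matching concept maps to `None`, mirroring the `if ev is not None` check in the example above.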

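## Amendment report chain (build_revision_chain)

Links an original filing with its amendment reports in chronological order, so you can identify the latest version or select the version that was current at a given point in time for backtesting.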

```python
from edinet import build_revision_chain
from datetime import date

# Build the revision chain from a Filing
chain = build_revision_chain(filing)

# Chain information
print(f"Original: {chain.original.doc_id}")
print(f"Latest: {chain.latest.doc_id}")
print(f"Number of amendments: {chain.count - 1}")
print(f"Amended: {chain.is_corrected}")

# Walk the chain in chronological order
for f in chain.chain:
    print(f"{f.doc_id}: {f.submit_date_time}")

# Point-in-time lookup for backtesting
snapshot = chain.at_time(date(2025, 6, 1))
print(f"Latest version as of 2025-06-01: {snapshot.doc_id}")

# The chain is also iterable
for revision in chain:
    stmts = revision.xbrl()
    print(f"{revision.doc_id}: net sales={stmts['売上高'].value:,}")
```
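Point-in-time lookup matters in backtests because a strategy should only see the version of a filing that was public on the simulated date. The following standalone sketch shows one way such a lookup could work. It is not the library's implementation: `Revision` and `version_at()` are hypothetical, the document IDs are made up, and treating the cutoff as inclusive is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Revision:
    """Hypothetical stand-in for one filing in a revision chain."""
    doc_id: str
    submitted: datetime


def version_at(revisions, cutoff):
    """Return the last revision submitted on or before cutoff, or None."""
    visible = [
        r for r in sorted(revisions, key=lambda r: r.submitted)
        if r.submitted <= cutoff
    ]
    return visible[-1] if visible else None


revisions = [
    Revision("S100AAAA", datetime(2025, 3, 31, 15, 0)),  # original
    Revision("S100BBBB", datetime(2025, 5, 20, 9, 30)),  # first amendment
    Revision("S100CCCC", datetime(2025, 7, 1, 16, 0)),   # second amendment
]

# The second amendment was filed after the cutoff, so it must stay invisible
snapshot = version_at(revisions, datetime(2025, 6, 1))
print(snapshot.doc_id)
```

Returning `None` when nothing had been filed yet keeps the "no data available" case explicit instead of silently falling back to a later version.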

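## Diff comparison (diff_revisions / diff_periods)

Compares financial statements before and after an amendment, or between the prior and current periods, and returns the differences in structured form.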

```python
from edinet import diff_revisions, diff_periods

# Diff between an original filing and its amendment
# (original_filing and corrected_filing are Filing objects, for example chain.original and chain.latest)
original_stmts = original_filing.xbrl()
corrected_stmts = corrected_filing.xbrl()
result = diff_revisions(original_stmts, corrected_stmts)

print(result.summary())
# => 変更: 3科目 (追加: 1, 削除: 0, 修正: 2)

for item in result.modified:
    print(f"{item.label_ja.text}: {item.old_value:,} → {item.new_value:,}")
    print(f"  Difference: {item.difference:+,}")

print(f"Added items: {[i.label_ja.text for i in result.added]}")
print(f"Removed items: {[i.label_ja.text for i in result.removed]}")

# Period diff (prior vs. current)
pl_prior = stmts.income_statement(period="prior")
pl_current = stmts.income_statement(period="current")
period_diff = diff_periods(pl_prior, pl_current)

if period_diff.has_changes:
    for item in period_diff.modified:
        pct = (item.difference / item.old_value * 100) if item.old_value else 0
        print(f"{item.label_ja.text}: {pct:+.1f}% ({item.difference:+,})")
```
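The added / removed / modified split used above can be illustrated with plain dictionaries. This is a minimal sketch under the assumption that each statement reduces to a `{label: amount}` mapping. `diff_values()` is a hypothetical helper, not part of edinet, and the figures are invented.

```python
from decimal import Decimal


def diff_values(old, new):
    """Compare two {label: amount} mappings.

    Returns (added, removed, modified), where modified maps each changed
    label to (old amount, new amount, difference).
    """
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    modified = {
        label: (old[label], new[label], new[label] - old[label])
        for label in sorted(old.keys() & new.keys())
        if old[label] != new[label]
    }
    return added, removed, modified


original = {"Net sales": Decimal("1000"), "Cost of sales": Decimal("700")}
corrected = {
    "Net sales": Decimal("1050"),
    "Cost of sales": Decimal("700"),
    "Other income": Decimal("5"),
}

added, removed, modified = diff_values(original, corrected)
print(added, removed)
for label, (before, after, delta) in modified.items():
    # Guard against division by zero, as in the period diff example
    pct = delta / before * 100 if before else Decimal(0)
    print(f"{label}: {before:,} -> {after:,} ({delta:+,}, {pct:+.1f}%)")
```

Keeping amounts as `Decimal` avoids binary floating-point rounding when comparing reported figures, which is why an unchanged item such as "Cost of sales" is reliably excluded from `modified`.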

## Text block extraction (extract_text_blocks)

Extracts non-numeric text sections of an annual securities report, such as notes, MD&A, and risk information.

```python
from edinet import extract_text_blocks, build_section_map, clean_html

# Extract text blocks
blocks = extract_text_blocks(stmts)

for block in blocks[:5]:
    print(f"concept: {block.concept}")
    print(f"Period: {block.period}")
    print(f"Consolidated: {block.is_consolidated}")
    print(f"HTML length: {len(block.html)} characters")
    # Clean up the HTML
    text = clean_html(block.html)
    print(f"Content: {text[:200]}...")
    print("---")

# Build the section map (a dict of concept → TextBlock)
section_map = build_section_map(stmts)

# Get the business risks section
if "BusinessRisksTextBlock" in section_map:
    risk_block = section_map["BusinessRisksTextBlock"]
    risk_text = clean_html(risk_block.html)
    print(f"Business risks:\n{risk_text}")

# MD&A (management's analysis of financial position, operating results, and cash flows)
if "ManagementAnalysisOfFinancialPositionOperatingResultsAndCashFlowsTextBlock" in section_map:
    mda = section_map["ManagementAnalysisOfFinancialPositionOperatingResultsAndCashFlowsTextBlock"]
    print(f"MD&A:\n{clean_html(mda.html)[:1000]}")
```

## Segment analysis (extract_segments)

Extracts data broken down by dimension, such as business segments and geographic segments.

```python
from edinet import extract_segments, list_dimension_axes

# List the available dimension axes
axes = list_dimension_axes(stmts)
for ax in axes:
    print(f"{ax.label_ja}: {ax.member_count} members")
# => 事業セグメント: 5 members
# => 地域: 4 members

# Extract data by segment
segments = extract_segments(stmts)

for seg in segments:
    print(f"\n=== {seg.segment_label} ===")
    for item in seg.items:
        print(f"  {item.label_ja.text}: {item.value:,}")
# => === 自動車事業 ===
# =>   売上高: 8,000,000,000,000
# =>   営業利益: 600,000,000,000
# => === 金融事業 ===
# =>   売上高: 2,500,000,000,000
# =>   ...
```

## Calculation validation (validate_calculations)

Verifies the totals in the financial statements against the calculation linkbase.

```python
from edinet import validate_calculations

# Validate the calculations in the income statement
pl = stmts.income_statement()
result = validate_calculations(pl, stmts.calculation_linkbase)

print(result)
# => 計算バリデーション: 合格 (検証=15, 合格=15, エラー=0, スキップ=3)

if not result.is_valid:
    for issue in result.issues:
        if issue.severity == "error":
            print(f"Error: {issue.parent_concept}")
            print(f"  Expected: {issue.expected:,}")
            print(f"  Actual: {issue.actual:,}")
            print(f"  Difference: {issue.difference:,}")
```
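A calculation linkbase declares each total as a weighted sum of child items, with weights of +1 or -1. The sketch below shows that check in isolation. It is not how `validate_calculations` is implemented: `check_total()` is hypothetical, the concept names are illustrative, and the fixed `tolerance` is a simplification (a real validator would typically derive the allowed rounding from each fact's `decimals` attribute).

```python
from decimal import Decimal


def check_total(values, parent, children, tolerance=Decimal("0")):
    """Check that parent equals the weighted sum of its children.

    children is a list of (concept, weight) pairs, with weight +1 or -1
    as in an XBRL calculation arc. Returns (is_valid, expected, actual).
    """
    expected = sum(
        (values[concept] * weight for concept, weight in children),
        Decimal("0"),
    )
    actual = values[parent]
    return abs(actual - expected) <= tolerance, expected, actual


values = {
    "NetSales": Decimal("1000"),
    "CostOfSales": Decimal("700"),
    "GrossProfit": Decimal("300"),
}
# GrossProfit = (+1) * NetSales + (-1) * CostOfSales
arcs = [("NetSales", 1), ("CostOfSales", -1)]

ok, expected, actual = check_total(values, "GrossProfit", arcs)
print(ok, expected, actual)
```

The `expected`, `actual`, and their difference correspond to the fields printed for each issue in the example above.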

## Custom item detection (detect_custom_items)

Detects items from the filer-specific taxonomy (company-specific extension items) and infers their parent standard items.

```python
from edinet import detect_custom_items, find_custom_concepts

# Detect custom items
result = detect_custom_items(stmts)

print(f"Total items: {result.total_count}")
print(f"Standard items: {result.standard_count}")
print(f"Custom items: {result.custom_count}")
print(f"Custom item ratio: {result.custom_ratio:.1%}")

# Details of the custom items
for ci in result.custom_items[:5]:
    print(f"Item: {ci.item.label_ja.text}")
    print(f"  namespace: {ci.namespace_info.edinet_code}")
    if ci.parent_standard_concept:
        print(f"  Parent standard item: {ci.parent_standard_concept}")

# Get the set of custom concepts, e.g. to check whether a given concept is a custom item
custom_concepts = find_custom_concepts(stmts)
# => frozenset({'CustomRevenueSegmentA', 'OtherOperatingExpensesDetail', ...})
```

## Disk cache management

Caches downloaded XBRL ZIP files on disk to speed up re-fetching.

```python
import edinet
from edinet import cache_info, clear_cache

# Enable the cache (set via configure)
edinet.configure(cache_dir="~/.cache/edinet")

# Cache statistics
info = cache_info()
print(f"Cache enabled: {info.enabled}")
print(f"Entries: {info.entry_count}")
print(f"Total size: {info.total_bytes / 1024 / 1024:.1f} MB")

# Force-refresh the cache for a Filing
xbrl_path, xbrl_bytes = filing.fetch(refresh=True)

# PDFs are cached as well
pdf_bytes = filing.fetch_pdf()
with open("report.pdf", "wb") as f:
    f.write(pdf_bytes)

# Clear the cache
clear_cache()
```

## Data export

Exports financial statements to pandas DataFrame, CSV, Parquet, and Excel.

```python
from decimal import Decimal  # needed for the isinstance() filter below

# Convert to a pandas DataFrame
df = stmts.to_dataframe()
print(df.columns)
# => Index(['concept', 'local_name', 'label_ja', 'label_en', 'value',
#           'unit', 'decimals', 'period_type', 'start_date', 'end_date', ...])

# Filtering
pl_df = df[df["statement_type"] == "PL"]
numeric_df = df[df["value"].apply(lambda x: isinstance(x, Decimal))]

# Export to CSV
stmts.to_csv("financial_statements.csv", encoding="utf-8-sig")

# Export to Parquet (suited to large datasets)
stmts.to_parquet("financial_statements.parquet")

# Export to Excel
stmts.to_excel("financial_statements.xlsx")

# Income statement only, as a DataFrame
pl = stmts.income_statement()
pl_df = pl.to_dataframe()
```

## Document overview summary (build_summary)

Builds overview information for a Filing, such as the accounting standard, period, item count, and segment count.

```python
from edinet import build_summary

summary = build_summary(stmts)

print(f"Total items: {summary.total_items}")
print(f"Accounting standard: {summary.accounting_standard}")
print(f"Period: {summary.period_start} ~ {summary.period_end}")
print(f"Period type: {summary.period_type}")  # FY / HY
print(f"Has consolidated data: {summary.has_consolidated}")
print(f"Has non-consolidated data: {summary.has_non_consolidated}")
print(f"Standard items: {summary.standard_item_count}")
print(f"Custom items: {summary.custom_item_count}")
print(f"Standard item ratio: {summary.standard_item_ratio:.1%}")
print(f"Segments: {summary.segment_count}")

# Item counts by namespace
for ns_group, count in summary.namespace_counts.items():
    print(f"  {ns_group}: {count}")
```

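## Summary

The edinet library can be used for financial data analysis, financial modeling, quantitative analysis, and backtesting on Japanese listed companies. It covers use cases ranging from low-level access to the EDINET API, through retrieval of structured financial statements, to canonical key extraction across accounting standards. The disk cache, asynchronous API, and pandas integration support efficient processing of large volumes of data.

Typical integration patterns include (1) periodically collecting disclosure documents and storing them in a database, (2) comparing financial metrics across multiple companies, (3) detecting differences in amendment reports and raising alerts, (4) tracking performance by segment, and (5) extracting qualitative information through text mining. The basic workflow is a pipeline: initial setup with `configure()`, identifying documents and companies with `documents()` / `Company.search()`, parsing XBRL with `filing.xbrl()`, and extracting canonical keys with `extract_values()`.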