# gbizinfo

gbizinfo is a Python client library for the gBizINFO REST API v2, providing access to Japanese corporate information (business activity data). The library supports all API endpoints, including corporate search, retrieval by corporate number, and differential updates. It offers both synchronous (`GbizClient`) and asynchronous (`AsyncGbizClient`) clients with built-in features such as automatic pagination, local file caching, retry with exponential backoff, and rate limiting.

The library uses httpx for HTTP communication and Pydantic for data validation. It provides type-safe enum-based parameters for prefectures, corporate types, ministries, and other search criteria. Key features include automatic corporate number validation with check digit verification, transparent pagination for large result sets, and a `to_flat_dict()` method for converting nested API responses into flat dictionaries suitable for pandas DataFrames.

## Client Initialization

Initialize a synchronous or asynchronous client with an API token and optional configuration for caching, retry, rate limiting, and HTTP settings.

```python
from gbizinfo import GbizClient, AsyncGbizClient
from gbizinfo.config import CacheMode

# Basic initialization (token from environment variable GBIZINFO_API_TOKEN)
with GbizClient() as client:
    result = client.search(name="Toyota")

# Full configuration
client = GbizClient(
    api_token="YOUR_TOKEN",
    cache_dir="./cache",
    cache_mode=CacheMode.READ_WRITE,
    cache_ttl=60 * 60 * 12,  # 12 hours
    retry_max_attempts=3,
    retry_base_delay=1.0,
    retry_cap_delay=16.0,
    rate_limit_per_sec=1.5,
    timeout=60.0,
    http2=True,
    proxy="http://my.proxy:8080",
)

# Async client (use inside an async function)
async with AsyncGbizClient(api_token="YOUR_TOKEN", max_concurrent=10) as client:
    result = await client.search(name="Sony")
```

## Corporate Search

Search for corporations using criteria such as name, prefecture, corporate type, capital, and employee count. More than 30 search parameters are supported, with type-safe enum values.

```python
from gbizinfo import GbizClient
from gbizinfo.enums import Prefecture, CorporateType, Ministry, Source

with GbizClient(api_token="YOUR_TOKEN") as client:
    # Basic name search
    result = client.search(name="Toyota", limit=5)
    for item in result.items:
        print(f"{item.corporate_number}: {item.name}")
        # Output: 1180301018771: Toyota Motor Corporation

    # Search by prefecture and corporate type
    result = client.search(
        prefecture=Prefecture.東京都,
        corporate_type=CorporateType.株式会社,
        limit=10,
    )

    # Multiple corporate types
    result = client.search(
        corporate_type=[CorporateType.株式会社, CorporateType.合同会社],
        prefecture=Prefecture.大阪府,
    )

    # Capital and employee range
    result = client.search(
        capital_stock_from=100_000_000,
        capital_stock_to=500_000_000,
        employee_number_from=1000,
        prefecture=Prefecture.愛知県,
    )

    # Filter by source and ministry
    result = client.search(source=Source.調達, ministry=Ministry.国税庁)

    # Subsidy and procurement keyword search
    result = client.search(subsidy="環境", prefecture=Prefecture.東京都)
```

## Get Corporate Details by Corporate Number

Retrieve detailed information for a specific corporation using its 13-digit corporate number. The library automatically validates the check digit.

```python
from gbizinfo import GbizClient
import gbizinfo

with GbizClient(api_token="YOUR_TOKEN") as client:
    try:
        info = client.get("7000012050002")  # National Tax Agency
        print(f"Name: {info.name}")
        print(f"Corporate Number: {info.corporate_number}")
        print(f"Location: {info.location}")
        print(f"Capital: {info.capital_stock}")
        print(f"Employees: {info.employee_number}")
        print(f"Established: {info.date_of_establishment}")
    except gbizinfo.GbizCorporateNumberError as e:
        print(f"Invalid corporate number: {e}")
    except gbizinfo.GbizNotFoundError as e:
        print(f"Corporation not found: {e}")
```

## Get Sub-Resources

Retrieve specific categories of information for a corporation including certification, commendation, finance, patent, procurement, subsidy, and workplace data.

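
Every sub-resource call takes the same 13-digit corporate number, so it can help to see the check-digit rule concretely. The sketch below implements the published rule for Japanese corporate numbers, in which the leading digit is a check digit computed from the remaining 12, in plain Python. The function names `expected_check_digit` and `is_valid_corporate_number` are made up for this illustration and are not part of gbizinfo; the library's own validator may differ in details such as error reporting.

```python
def expected_check_digit(base12: str) -> int:
    """Return the check digit for a 12-digit base number (hypothetical helper)."""
    if len(base12) != 12 or not (base12.isascii() and base12.isdigit()):
        raise ValueError("base number must be exactly 12 ASCII digits")
    total = 0
    # Walk the digits from the right: odd positions weigh 1, even positions weigh 2.
    for position, char in enumerate(reversed(base12), start=1):
        weight = 2 if position % 2 == 0 else 1
        total += int(char) * weight
    # The remainder is 0..8, so the check digit is always 1..9.
    return 9 - (total % 9)


def is_valid_corporate_number(number: str) -> bool:
    """Check length, digits, and the leading check digit (hypothetical helper)."""
    if len(number) != 13 or not (number.isascii() and number.isdigit()):
        return False
    return int(number[0]) == expected_check_digit(number[1:])


print(is_valid_corporate_number("7000012050002"))  # True
print(is_valid_corporate_number("8000012050002"))  # False (wrong leading digit)
```

A local check like this rejects mistyped numbers before any request is sent, which is presumably why the library raises `GbizCorporateNumberError` separately from `GbizNotFoundError`.
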
```python
from gbizinfo import GbizClient

with GbizClient(api_token="YOUR_TOKEN") as client:
    corporate_number = "7000012050002"

    # Certifications and filings
    cert = client.get_certification(corporate_number)

    # Awards and commendations
    comm = client.get_commendation(corporate_number)

    # Financial information
    fin = client.get_finance(corporate_number)

    # Patent information
    pat = client.get_patent(corporate_number)

    # Procurement information
    proc = client.get_procurement(corporate_number)

    # Subsidy information
    sub = client.get_subsidy(corporate_number)

    # Workplace information
    work = client.get_workplace(corporate_number)
```

## Differential Updates

Retrieve corporations updated within a specified date range. Supports category-specific update endpoints.

```python
from datetime import date, timedelta
from gbizinfo import GbizClient

with GbizClient(api_token="YOUR_TOKEN") as client:
    to_date = date.today()
    from_date = to_date - timedelta(days=3)

    # General update info
    result = client.get_update_info(from_date=from_date, to_date=to_date)
    print(f"Total count: {result.total_count}")
    print(f"Total pages: {result.total_page}")
    for item in result.items:
        print(f"{item.corporate_number}: {item.name}")

    # Category-specific updates
    client.get_update_certification(from_date=from_date, to_date=to_date)
    client.get_update_commendation(from_date=from_date, to_date=to_date)
    client.get_update_finance(from_date=from_date, to_date=to_date)
    client.get_update_procurement(from_date=from_date, to_date=to_date)
    client.get_update_workplace(from_date=from_date, to_date=to_date)
```

## Automatic Pagination

Transparently iterate through all pages of search results or update info. Automatically handles pagination up to 50,000 records (10 pages x 5,000 limit).

```python
from datetime import date
from gbizinfo import GbizClient
from gbizinfo.enums import Prefecture
import gbizinfo

with GbizClient(api_token="YOUR_TOKEN") as client:
    # Paginate search results
    for item in client.paginate_search(
        prefecture=Prefecture.東京都,
        limit=2000
    ):
        print(f"{item.corporate_number}: {item.name}")

    # Paginate update info
    for item in client.paginate_update_info(
        from_date=date(2025, 2, 1),
        to_date=date(2025, 2, 5),
    ):
        print(f"{item.corporate_number}: {item.update_date}")

    # Helper: get recent updates (past N days)
    try:
        for item in client.get_recent_updates(days=7):
            print(f"{item.corporate_number}: {item.name}")
    except gbizinfo.PaginationLimitExceededError as e:
        print(f"Too many results: {e.total_count} (max: {e.max_retrievable})")
```

## Flatten Nested Data

Convert nested API responses to flat dictionaries suitable for pandas DataFrames using the `to_flat_dict()` method.

```python
from gbizinfo import GbizClient
import pandas as pd

with GbizClient(api_token="YOUR_TOKEN") as client:
    info = client.get("1180301018771")

    # List handling strategies
    flat = info.to_flat_dict(lists="count")    # Lists as counts only
    flat = info.to_flat_dict(lists="first")    # First element only
    flat = info.to_flat_dict(lists="json")     # JSON string
    flat = info.to_flat_dict(lists="explode")  # Indexed fields (_0, _1, ...)

    # Bulk flatten from SearchResult
    result = client.search(name="Toyota", limit=10)
    dicts = result.to_flat_dicts(lists="count")

    # Convert to pandas DataFrame
    df = pd.DataFrame(dicts)
    print(df.columns.tolist())
```

## Async Client Usage

Use `AsyncGbizClient` for asynchronous operations with the same API as the synchronous client.

```python
import asyncio
from gbizinfo import AsyncGbizClient

async def main():
    async with AsyncGbizClient(
        api_token="YOUR_TOKEN",
        max_concurrent=10  # Limit concurrent requests
    ) as client:
        # Search
        result = await client.search(name="Sony", limit=3)
        for item in result.items:
            print(item.name)

        # Get by corporate number
        info = await client.get("1180301018771")
        print(info.name)

        # Async pagination
        async for item in client.paginate_search(name="Honda"):
            print(f"{item.corporate_number}: {item.name}")

        # Async recent updates
        async for item in client.get_recent_updates(days=3):
            print(f"{item.corporate_number}: {item.name}")

asyncio.run(main())
```

## Error Handling

Handle API errors using the exception hierarchy. All exceptions inherit from `GbizError`.

```python
import gbizinfo
from gbizinfo import GbizClient

with GbizClient(api_token="YOUR_TOKEN") as client:
    try:
        info = client.get("7000012050002")
    except gbizinfo.GbizBadRequestError as e:
        # 400: Invalid parameters (includes errors array)
        print(f"Status: {e.context.status_code}")
        for error in e.errors:
            print(f"Error: {error}")
    except gbizinfo.GbizUnauthorizedError as e:
        # 401: Invalid token
        print(f"Unauthorized: {e.context.status_code}")
    except gbizinfo.GbizNotFoundError as e:
        # 404: Corporation not found
        print(f"Not found: {e.context.status_code}")
    except gbizinfo.GbizRateLimitError as e:
        # 429: Rate limit exceeded
        print(f"Rate limited, retry after: {e.context.retry_after}s")
    except gbizinfo.GbizServerError as e:
        # 5xx: Server error
        print(f"Server error: {e.context.status_code}")
    except gbizinfo.GbizTransportError as e:
        # Network connection error
        print(f"Transport error: {e.original}")
    except gbizinfo.GbizTimeoutError as e:
        # Timeout error
        print(f"Timeout ({e.timeout_type}): {e}")
    except gbizinfo.GbizValidationError as e:
        # Pre-request validation error
        print(f"Validation error: {e}")
    except gbizinfo.GbizCorporateNumberError as e:
        # Invalid corporate number format
        print(f"Invalid corporate number: {e}")
    except gbizinfo.PaginationLimitExceededError as e:
        # Pagination limit exceeded
        print(f"Too many results: {e.total_count}")
```

## Enums for Type-Safe Parameters

Use enums for type-safe parameter specification with IDE completion support.

```python
from gbizinfo.enums import (
    Prefecture,
    CorporateType,
    Region,
    Ministry,
    Source,
    AverageAge,
    AverageContinuousServiceYears,
    MonthAverageOvertimeHours,
    FemaleWorkersProportion,
    BusinessItem,
    PatentClassification,
)

# Prefecture enum (47 prefectures)
Prefecture.東京都 == "13"  # True - StrEnum subclass

# Corporate type
CorporateType.株式会社  # "301"

# Region with prefectures
Region.関東.prefectures  # Tuple of prefectures in Kanto region

# Workplace information enums
AverageAge.歳30以下
AverageContinuousServiceYears.年21以上
MonthAverageOvertimeHours.時間20未満
FemaleWorkersProportion.割合61以上

# Patent classification
PatentClassification.食品_食料品
```

## Summary

The gbizinfo library provides comprehensive access to Japan's gBizINFO corporate information API. Primary use cases include corporate research and due diligence, building company databases, tracking government procurement and subsidies, analyzing corporate financial data, and monitoring patent portfolios. The library is particularly useful for applications that need to process large volumes of corporate data, with automatic pagination handling up to 50,000 records per query.

Integration patterns typically involve using the synchronous client for simple scripts and CLI tools, while the async client is better suited for web applications and high-throughput data pipelines. The caching feature reduces API calls for repeated queries, and the `to_flat_dict()` method simplifies data analysis workflows with pandas. For production deployments, configure appropriate retry settings, rate limiting, and error handling to ensure robust operation against the gBizINFO API's known limitations and rate restrictions.
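
To make the retry settings more concrete, here is a minimal sketch of a capped exponential backoff schedule. It borrows the parameter names `retry_base_delay`, `retry_cap_delay`, and `retry_max_attempts` from the configuration example, but the formula (double the delay after each failed attempt, never exceeding the cap) and the helper name `backoff_delays` are assumptions made for illustration. The library's actual schedule, including any jitter, may differ.

```python
def backoff_delays(base_delay: float, cap_delay: float, max_attempts: int) -> list[float]:
    """Waits between attempts under an assumed doubling rule (hypothetical helper).

    N attempts need at most N - 1 waits, so the list has max_attempts - 1 entries.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    # Delay before retry i is base_delay * 2**i, clipped to cap_delay.
    return [min(cap_delay, base_delay * 2 ** i) for i in range(max_attempts - 1)]


# Values from the configuration example: 3 attempts, 1.0 s base, 16.0 s cap.
print(backoff_delays(1.0, 16.0, 3))  # [1.0, 2.0]

# With more attempts the cap starts to matter.
print(backoff_delays(1.0, 16.0, 7))  # [1.0, 2.0, 4.0, 8.0, 16.0, 16.0]
```

Under this assumption, the example configuration would spend at most about three seconds waiting before giving up, so a higher `retry_max_attempts` is the setting that most affects worst-case latency.
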