# Agent Skills for Microsoft AI SDKs

Agent Skills is a template repository of modular knowledge packages designed to supercharge AI coding agents when working with Microsoft AI SDKs and Azure services. The repository provides skills, prompts, agents, and MCP server configurations that turn general-purpose coding agents (GitHub Copilot, Claude Code, etc.) into specialized experts with domain-specific knowledge about Azure AI SDKs, patterns, and best practices.

The core architecture follows a progressive disclosure model: skills expose minimal metadata for triggering, then load detailed instructions only when activated. This design manages context windows efficiently while providing comprehensive guidance for complex AI development tasks, including Azure AI Search operations, Foundry agent development, MCP server creation, and real-time voice AI applications.

---

## Azure AI Project Client Authentication

The `AIProjectClient` provides access to Microsoft Foundry projects for agents, evaluations, and connections. Always use `DefaultAzureCredential` for production environments.

```python
import os

from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient

# Initialize with environment-based configuration
credential = DefaultAzureCredential()
client = AIProjectClient(
    endpoint=os.environ["AZURE_AI_PROJECT_ENDPOINT"],
    credential=credential
)

# Use a context manager for proper resource cleanup
with client:
    # Create an agent
    agent = client.agents.create_version(
        agent_name="my-assistant",
        definition={
            "model": "gpt-4o-mini",
            "instructions": "You are a helpful assistant."
        }
    )
    print(f"Created agent: {agent.name} (version: {agent.version})")

# Environment variables required:
# AZURE_AI_PROJECT_ENDPOINT=https://<resource-name>.services.ai.azure.com/api/projects/<project-name>
# AZURE_AI_MODEL_DEPLOYMENT_NAME=gpt-4o-mini
```
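For local development, where `DefaultAzureCredential` may probe several slower credential sources, a narrower chain can be built explicitly. The snippet below is a sketch, not part of the repository: it assumes the Azure CLI is signed in on the developer machine and a managed identity is assigned in the production environment, and it uses only `azure-identity` types.

```python
from azure.identity import (
    AzureCliCredential,
    ChainedTokenCredential,
    ManagedIdentityCredential,
)

# Illustrative only: try the local Azure CLI login first,
# then fall back to the managed identity assigned in Azure.
credential = ChainedTokenCredential(
    AzureCliCredential(),
    ManagedIdentityCredential(),
)
```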
---

## Azure AI Search Client Operations

The `SearchClient` enables querying indexes and performing document operations. It supports text search, vector search, and hybrid search with semantic ranking.

```python
from azure.identity import DefaultAzureCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizedQuery
from azure.core.exceptions import ResourceNotFoundError, HttpResponseError

credential = DefaultAzureCredential()
endpoint = "https://my-search.search.windows.net"
index_name = "products"

client = SearchClient(endpoint, index_name, credential)

# Basic text search with filters
results = client.search(
    search_text="laptop computer",
    filter="category eq 'electronics' and price lt 1500",
    select=["id", "title", "description", "price"],
    top=10
)
for result in results:
    print(f"{result['title']} - ${result['price']}")

# Vector search with embeddings
embedding = get_embedding("portable computing device")  # Your embedding function
vector_results = client.search(
    search_text=None,
    vector_queries=[VectorizedQuery(
        vector=embedding,
        k_nearest_neighbors=5,
        fields="content_vector"
    )]
)

# Hybrid search (vector + keyword + semantic)
hybrid_results = client.search(
    search_text="laptop",
    vector_queries=[VectorizedQuery(vector=embedding, k_nearest_neighbors=5, fields="content_vector")],
    query_type="semantic",
    semantic_configuration_name="my-semantic-config",
    top=10
)

# Document operations
try:
    doc = client.get_document(key="product-123")
    print(f"Found: {doc['title']}")
except ResourceNotFoundError:
    print("Document not found")
except HttpResponseError as e:
    print(f"Search error: {e.message}")
```
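The `get_embedding` helper above is left to the caller. One possible implementation, shown as a sketch, uses the `openai` package against an Azure OpenAI resource with Entra ID authentication; the endpoint, deployment name, and API version below are assumptions to replace with your own.

```python
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

# Sketch only: assumes an Azure OpenAI resource with a
# "text-embedding-3-large" deployment and `pip install openai`.
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)
aoai_client = AzureOpenAI(
    azure_endpoint="https://my-openai.openai.azure.com/",
    azure_ad_token_provider=token_provider,
    api_version="2024-10-21",
)

def get_embedding(text: str) -> list[float]:
    """Return the embedding vector for a piece of text."""
    response = aoai_client.embeddings.create(
        model="text-embedding-3-large",
        input=text,
    )
    return response.data[0].embedding
```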
---

## Search Index Creation with Vector and Semantic Configuration

The `SearchIndexClient` manages index creation, with support for the vector search profiles, vectorizers, and semantic configurations required for agentic retrieval.

```python
from azure.identity import DefaultAzureCredential
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    SearchIndex, SearchField, VectorSearch, VectorSearchProfile,
    HnswAlgorithmConfiguration, AzureOpenAIVectorizer,
    AzureOpenAIVectorizerParameters, SemanticSearch, SemanticConfiguration,
    SemanticPrioritizedFields, SemanticField
)

credential = DefaultAzureCredential()
endpoint = "https://my-search.search.windows.net"
aoai_endpoint = "https://my-openai.openai.azure.com/"

index = SearchIndex(
    name="documents",
    fields=[
        SearchField(name="id", type="Edm.String", key=True, filterable=True),
        SearchField(name="title", type="Edm.String", searchable=True),
        SearchField(name="content", type="Edm.String", searchable=True),
        SearchField(name="category", type="Edm.String", filterable=True, facetable=True),
        SearchField(
            name="embedding",
            type="Collection(Edm.Single)",
            vector_search_dimensions=3072,
            vector_search_profile_name="vector-profile"
        ),
    ],
    vector_search=VectorSearch(
        profiles=[VectorSearchProfile(
            name="vector-profile",
            algorithm_configuration_name="hnsw-algo",
            vectorizer_name="openai-vectorizer"
        )],
        algorithms=[HnswAlgorithmConfiguration(name="hnsw-algo")],
        vectorizers=[AzureOpenAIVectorizer(
            vectorizer_name="openai-vectorizer",
            parameters=AzureOpenAIVectorizerParameters(
                resource_url=aoai_endpoint,
                deployment_name="text-embedding-3-large",
                model_name="text-embedding-3-large"
            )
        )]
    ),
    semantic_search=SemanticSearch(
        default_configuration_name="semantic-config",
        configurations=[SemanticConfiguration(
            name="semantic-config",
            prioritized_fields=SemanticPrioritizedFields(
                title_field=SemanticField(field_name="title"),
                content_fields=[SemanticField(field_name="content")]
            )
        )]
    )
)

index_client = SearchIndexClient(endpoint, credential)
result = index_client.create_or_update_index(index)
print(f"Index '{result.name}' created/updated successfully")
```

---

## Agentic Retrieval with Knowledge Bases

The `KnowledgeBaseRetrievalClient` enables LLM-powered Q&A with automatic query planning and answer synthesis over your search indexes.

```python
from azure.identity import DefaultAzureCredential
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    SearchIndexKnowledgeSource, SearchIndexKnowledgeSourceParameters,
    SearchIndexFieldReference, KnowledgeBase, KnowledgeBaseAzureOpenAIModel,
    KnowledgeSourceReference, AzureOpenAIVectorizerParameters,
    KnowledgeRetrievalOutputMode
)
from azure.search.documents.knowledgebases import KnowledgeBaseRetrievalClient
from azure.search.documents.knowledgebases.models import (
    KnowledgeBaseRetrievalRequest, KnowledgeBaseMessage,
    KnowledgeBaseMessageTextContent, SearchIndexKnowledgeSourceParams,
    KnowledgeRetrievalLowReasoningEffort
)

credential = DefaultAzureCredential()
endpoint = "https://my-search.search.windows.net"
aoai_endpoint = "https://my-openai.openai.azure.com/"

# Step 1: Create Knowledge Source (points to existing index)
index_client = SearchIndexClient(endpoint, credential)
knowledge_source = SearchIndexKnowledgeSource(
    name="docs-source",
    description="Documentation knowledge source",
    search_index_parameters=SearchIndexKnowledgeSourceParameters(
        search_index_name="documents",
        source_data_fields=[
            SearchIndexFieldReference(name="id"),
            SearchIndexFieldReference(name="title"),
            SearchIndexFieldReference(name="category")
        ]
    )
)
index_client.create_or_update_knowledge_source(knowledge_source)

# Step 2: Create Knowledge Base (wraps sources + LLM)
knowledge_base = KnowledgeBase(
    name="docs-kb",
    models=[KnowledgeBaseAzureOpenAIModel(
        azure_open_ai_parameters=AzureOpenAIVectorizerParameters(
            resource_url=aoai_endpoint,
            deployment_name="gpt-4o-mini",
            model_name="gpt-4o-mini"
        )
    )],
    knowledge_sources=[KnowledgeSourceReference(name="docs-source")],
    output_mode=KnowledgeRetrievalOutputMode.ANSWER_SYNTHESIS,
    answer_instructions="Provide concise answers with citations."
)
index_client.create_or_update_knowledge_base(knowledge_base)

# Step 3: Query the Knowledge Base
kb_client = KnowledgeBaseRetrievalClient(endpoint, "docs-kb", credential)

messages = [
    KnowledgeBaseMessage(
        role="user",
        content=[KnowledgeBaseMessageTextContent(text="How do I configure vector search?")]
    )
]

request = KnowledgeBaseRetrievalRequest(
    messages=messages,
    knowledge_source_params=[SearchIndexKnowledgeSourceParams(
        knowledge_source_name="docs-source",
        include_references=True,
        include_reference_source_data=True
    )],
    retrieval_reasoning_effort=KnowledgeRetrievalLowReasoningEffort,
    include_activity=True
)

result = kb_client.retrieve(retrieval_request=request)

# Extract answer
for resp in result.response:
    for content in resp.content:
        print(f"Answer: {content.text}")

# Extract citations
if result.references:
    print("\nSources:")
    for ref in result.references:
        print(f"  - {ref.source_data.get('title', 'Unknown')} (score: {ref.reranker_score:.2f})")
```
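Multi-turn Q&A can reuse the same request and message types: append the synthesized answer and the next user question to the message list, then call `retrieve` again. A minimal sketch, assuming the objects created above are still in scope (the follow-up question text is purely illustrative):

```python
# Sketch: carry the previous answer forward as context for a follow-up question.
messages.append(KnowledgeBaseMessage(
    role="assistant",
    content=[KnowledgeBaseMessageTextContent(text=result.response[0].content[0].text)]
))
messages.append(KnowledgeBaseMessage(
    role="user",
    content=[KnowledgeBaseMessageTextContent(text="Which embedding dimensions does that require?")]
))

follow_up = kb_client.retrieve(retrieval_request=KnowledgeBaseRetrievalRequest(
    messages=messages,
    knowledge_source_params=request.knowledge_source_params,
))
print(follow_up.response[0].content[0].text)
```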
---

## Foundry Agent with MCP Knowledge Base Tool

Create Microsoft Foundry agents connected to knowledge bases via MCP (Model Context Protocol) for enterprise-grade agentic retrieval with citations.

```python
import os

import requests
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import PromptAgentDefinition, MCPTool

# Configuration
SEARCH_ENDPOINT = os.environ["AZURE_SEARCH_ENDPOINT"]
PROJECT_ENDPOINT = os.environ["FOUNDRY_PROJECT_ENDPOINT"]
PROJECT_RESOURCE_ID = os.environ["PROJECT_RESOURCE_ID"]
KB_NAME = "enterprise-kb"
API_VERSION = "2025-11-01-preview"

credential = DefaultAzureCredential()

# Step 1: Create Project Connection (one-time setup)
mcp_endpoint = f"{SEARCH_ENDPOINT}/knowledgebases/{KB_NAME}/mcp?api-version={API_VERSION}"
mgmt_token = get_bearer_token_provider(credential, "https://management.azure.com/.default")()

connection_response = requests.put(
    f"https://management.azure.com{PROJECT_RESOURCE_ID}/connections/kb-connection?api-version=2025-10-01-preview",
    headers={"Authorization": f"Bearer {mgmt_token}", "Content-Type": "application/json"},
    json={
        "name": "kb-connection",
        "properties": {
            "authType": "ProjectManagedIdentity",
            "category": "RemoteTool",
            "target": mcp_endpoint,
            "isSharedToAll": True,
            "audience": "https://search.azure.com/",
            "metadata": {"ApiType": "Azure"}
        }
    }
)
connection_response.raise_for_status()

# Step 2: Create Agent with MCP Tool
client = AIProjectClient(endpoint=PROJECT_ENDPOINT, credential=credential)

AGENT_INSTRUCTIONS = """You are a helpful assistant that must use the knowledge base to answer all questions.
You must never answer from your own knowledge under any circumstances.
Every answer must provide annotations as: 【message_idx:search_idx†source_name】
If you cannot find the answer, respond with "I don't know"."""

mcp_tool = MCPTool(
    server_label="knowledge-base",
    server_url=mcp_endpoint,
    require_approval="never",
    allowed_tools=["knowledge_base_retrieve"],
    project_connection_id="kb-connection"
)

agent = client.agents.create_version(
    agent_name="knowledge-assistant",
    definition=PromptAgentDefinition(
        model="gpt-4o-mini",
        instructions=AGENT_INSTRUCTIONS,
        tools=[mcp_tool]
    )
)
print(f"Agent created: {agent.name} (version: {agent.version})")

# Step 3: Query the Agent
openai_client = client.get_openai_client()
conversation = openai_client.conversations.create()

response = openai_client.responses.create(
    conversation=conversation.id,
    input="What are the best practices for configuring search indexes?",
    extra_body={"agent": {"name": agent.name, "type": "agent_reference"}}
)
print(f"Response: {response.output_text}")
```
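Because the conversation created in step 3 is stored by the service, follow-up turns can reuse the same conversation id with only the calls already shown above. A minimal sketch (the follow-up prompt is illustrative):

```python
# Sketch: ask a follow-up in the same conversation so the agent
# keeps the earlier question and grounded answer as context.
follow_up = openai_client.responses.create(
    conversation=conversation.id,
    input="Summarize that as a checklist I can paste into a runbook.",
    extra_body={"agent": {"name": agent.name, "type": "agent_reference"}},
)
print(follow_up.output_text)
```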
---

## Real-Time Voice AI with Azure AI Voice Live SDK

The `azure-ai-voicelive` SDK enables real-time bidirectional audio communication with Azure AI for voice assistants, chatbots, and speech-to-speech applications.

```python
import asyncio
import base64
import json

from azure.ai.voicelive.aio import connect
from azure.core.credentials import AzureKeyCredential
from azure.ai.voicelive.models import RequestSession, FunctionTool


async def voice_assistant():
    async with connect(
        endpoint="https://eastus.api.cognitive.microsoft.com",
        credential=AzureKeyCredential("<your-api-key>"),
        model="gpt-4o-realtime-preview"
    ) as conn:
        # Configure session with VAD and function tools
        await conn.session.update(session=RequestSession(
            instructions="You are a helpful voice assistant. Be concise and friendly.",
            modalities=["text", "audio"],
            voice="alloy",
            input_audio_format="pcm16",
            output_audio_format="pcm16",
            turn_detection={
                "type": "server_vad",
                "threshold": 0.5,
                "silence_duration_ms": 500
            },
            tools=[FunctionTool(
                type="function",
                name="get_weather",
                description="Get current weather for a location",
                parameters={
                    "type": "object",
                    "properties": {"location": {"type": "string"}},
                    "required": ["location"]
                }
            )]
        ))

        # Simulate sending audio (in practice, stream from microphone)
        audio_chunk = b'\x00' * 4800  # 100ms of silence at 24kHz
        await conn.input_audio_buffer.append(audio=base64.b64encode(audio_chunk).decode())

        # Process events
        async for event in conn:
            match event.type:
                case "input_audio_buffer.speech_started":
                    print(f"User speaking at {event.audio_start_ms}ms")
                case "conversation.item.input_audio_transcription.completed":
                    print(f"User: {event.transcript}")
                case "response.audio.delta":
                    audio_bytes = base64.b64decode(event.delta)
                    # Play audio_bytes through speakers
                case "response.audio_transcript.delta":
                    print(event.delta, end="", flush=True)
                case "response.function_call_arguments.done":
                    # Handle function call
                    result = {"temperature": "72°F", "condition": "Sunny"}
                    await conn.conversation.item.create(item={
                        "type": "function_call_output",
                        "call_id": event.call_id,
                        "output": json.dumps(result)
                    })
                    await conn.response.create()
                case "response.done":
                    print(f"\n[Response complete: {event.response.status}]")
                    break
                case "error":
                    print(f"Error: {event.error.message}")
                    break


asyncio.run(voice_assistant())
```
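In a real application, the input buffer is fed from a microphone rather than silence. The sketch below uses the third-party `sounddevice` package (an assumption, not part of the SDK) to capture mono 16-bit PCM at 24 kHz and forward base64-encoded chunks to the connection shown above; `stream_microphone` is a hypothetical helper.

```python
import asyncio
import base64

import sounddevice as sd  # assumed dependency: pip install sounddevice


async def stream_microphone(conn, seconds: float = 5.0):
    """Capture mono 24 kHz PCM16 audio and forward it to the Voice Live connection."""
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue[bytes] = asyncio.Queue()

    def on_audio(indata, frames, time_info, status):
        # Runs on the audio thread; hand raw bytes over to the event loop.
        loop.call_soon_threadsafe(queue.put_nowait, bytes(indata))

    # blocksize=2400 frames at 24 kHz ≈ 100 ms per chunk, matching the example above.
    with sd.RawInputStream(samplerate=24000, channels=1, dtype="int16",
                           blocksize=2400, callback=on_audio):
        deadline = loop.time() + seconds
        while loop.time() < deadline:
            chunk = await queue.get()
            await conn.input_audio_buffer.append(audio=base64.b64encode(chunk).decode())
```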
---

## MCP Server Creation (Python)

Create Model Context Protocol servers using FastMCP to enable LLMs to interact with external services through well-designed tools.

```python
#!/usr/bin/env python3
"""MCP Server for Example Service API integration."""
import json
from typing import Optional
from enum import Enum

import httpx
from pydantic import BaseModel, Field, field_validator, ConfigDict
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("example_mcp")
API_BASE_URL = "https://api.example.com/v1"


class ResponseFormat(str, Enum):
    MARKDOWN = "markdown"
    JSON = "json"


class UserSearchInput(BaseModel):
    """Input model for user search operations."""
    model_config = ConfigDict(str_strip_whitespace=True, validate_assignment=True)

    query: str = Field(..., description="Search string (e.g., 'john', 'team:marketing')",
                       min_length=2, max_length=200)
    limit: Optional[int] = Field(default=20, description="Max results (1-100)", ge=1, le=100)
    offset: Optional[int] = Field(default=0, description="Pagination offset", ge=0)
    response_format: ResponseFormat = Field(default=ResponseFormat.MARKDOWN,
                                            description="Output format")

    @field_validator('query')
    @classmethod
    def validate_query(cls, v: str) -> str:
        if not v.strip():
            raise ValueError("Query cannot be empty")
        return v.strip()


async def _make_api_request(endpoint: str, params: dict = None) -> dict:
    """Reusable API request function."""
    async with httpx.AsyncClient() as client:
        response = await client.get(f"{API_BASE_URL}/{endpoint}", params=params, timeout=30.0)
        response.raise_for_status()
        return response.json()


def _handle_api_error(e: Exception) -> str:
    """Consistent error formatting."""
    if isinstance(e, httpx.HTTPStatusError):
        status = e.response.status_code
        if status == 404:
            return "Error: Resource not found."
        if status == 403:
            return "Error: Permission denied."
        if status == 429:
            return "Error: Rate limit exceeded. Please wait."
        return f"Error: API request failed with status {status}"
    return f"Error: {type(e).__name__}: {str(e)}"


@mcp.tool(
    name="example_search_users",
    annotations={
        "title": "Search Users",
        "readOnlyHint": True,
        "destructiveHint": False,
        "idempotentHint": True,
        "openWorldHint": True
    }
)
async def example_search_users(params: UserSearchInput) -> str:
    """Search for users by name, email, or team.

    Returns user profiles matching the query with pagination support.
    Use 'team:name' syntax to filter by team membership.
    """
    try:
        data = await _make_api_request("users/search", {
            "q": params.query,
            "limit": params.limit,
            "offset": params.offset
        })
        users = data.get("users", [])
        if not users:
            return f"No users found matching '{params.query}'"

        if params.response_format == ResponseFormat.MARKDOWN:
            lines = [f"# Users matching '{params.query}'",
                     f"Found {data.get('total', len(users))} results\n"]
            for user in users:
                lines.append(f"## {user['name']} ({user['id']})")
                lines.append(f"- **Email**: {user['email']}")
                if user.get('team'):
                    lines.append(f"- **Team**: {user['team']}")
                lines.append("")
            return "\n".join(lines)
        else:
            return json.dumps({"total": data.get("total"), "users": users}, indent=2)
    except Exception as e:
        return _handle_api_error(e)


if __name__ == "__main__":
    mcp.run()
```
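Before wiring the server into an MCP client, the tool coroutine can be exercised directly from a scratch script, since FastMCP leaves the decorated function callable. A sketch (assumes the API at `API_BASE_URL` is reachable; otherwise you will simply see the error-handling path):

```python
import asyncio

# Sketch: call the tool coroutine directly to check input validation
# and output formatting without going through an MCP client.
params = UserSearchInput(query="team:marketing", limit=5)
print(asyncio.run(example_search_users(params)))
```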
---

## MCP Server Creation (TypeScript)

Create MCP servers in TypeScript using the official SDK with Zod schemas for runtime validation.

```typescript
#!/usr/bin/env node
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
import axios, { AxiosError } from "axios";

const API_BASE_URL = "https://api.example.com/v1";

enum ResponseFormat {
  MARKDOWN = "markdown",
  JSON = "json"
}

const UserSearchSchema = z.object({
  query: z.string()
    .min(2, "Query must be at least 2 characters")
    .max(200)
    .describe("Search string to match against names/emails"),
  limit: z.number().int().min(1).max(100).default(20).describe("Maximum results"),
  offset: z.number().int().min(0).default(0).describe("Pagination offset"),
  response_format: z.nativeEnum(ResponseFormat).default(ResponseFormat.MARKDOWN)
}).strict();

type UserSearchInput = z.infer<typeof UserSearchSchema>;

function handleApiError(error: unknown): string {
  if (error instanceof AxiosError && error.response) {
    const { status } = error.response;
    if (status === 404) return "Error: Resource not found.";
    if (status === 403) return "Error: Permission denied.";
    if (status === 429) return "Error: Rate limit exceeded.";
    return `Error: API request failed with status ${status}`;
  }
  return `Error: ${error instanceof Error ? error.message : String(error)}`;
}

const server = new McpServer({
  name: "example-mcp",
  version: "1.0.0"
});

server.registerTool(
  "example_search_users",
  {
    title: "Search Users",
    description: "Search for users by name, email, or team with pagination support.",
    inputSchema: UserSearchSchema,
    annotations: {
      readOnlyHint: true,
      destructiveHint: false,
      idempotentHint: true,
      openWorldHint: true
    }
  },
  async (params: UserSearchInput) => {
    try {
      const { data } = await axios.get(`${API_BASE_URL}/users/search`, {
        params: { q: params.query, limit: params.limit, offset: params.offset },
        timeout: 30000
      });
      const users = data.users || [];
      if (!users.length) {
        return { content: [{ type: "text", text: `No users found matching '${params.query}'` }] };
      }

      const output = { total: data.total, count: users.length, users };
      let textContent: string;

      if (params.response_format === ResponseFormat.MARKDOWN) {
        const lines = [`# Users matching '${params.query}'`, `Found ${data.total} results\n`];
        for (const user of users) {
          lines.push(`## ${user.name} (${user.id})`, `- **Email**: ${user.email}`);
          if (user.team) lines.push(`- **Team**: ${user.team}`);
          lines.push("");
        }
        textContent = lines.join("\n");
      } else {
        textContent = JSON.stringify(output, null, 2);
      }

      return {
        content: [{ type: "text", text: textContent }],
        structuredContent: output
      };
    } catch (error) {
      return { content: [{ type: "text", text: handleApiError(error) }] };
    }
  }
);

async function main() {
  const transport = new StdioServerTransport();
  await server.connect(transport);
  console.error("MCP server running via stdio");
}

main().catch(console.error);
```

---

## Batch Document Upload with Buffered Sender

The `SearchIndexingBufferedSender` handles automatic batching, retries, and error handling for large document uploads.

```python
from azure.identity import DefaultAzureCredential
from azure.search.documents import SearchIndexingBufferedSender, SearchClient

credential = DefaultAzureCredential()
endpoint = "https://my-search.search.windows.net"
index_name = "products"

# Prepare documents
documents = [
    {"id": "1", "title": "Laptop", "category": "electronics", "price": 999.99},
    {"id": "2", "title": "Headphones", "category": "electronics", "price": 199.99},
    {"id": "3", "title": "Desk Chair", "category": "furniture", "price": 299.99},
    # ... potentially thousands more
]

# Batch upload with automatic batching and retries
with SearchIndexingBufferedSender(endpoint, index_name, credential) as sender:
    sender.upload_documents(documents)
    # Sender automatically flushes on context exit

print(f"Uploaded {len(documents)} documents")

# For individual operations, use SearchClient
client = SearchClient(endpoint, index_name, credential)

# Upsert (merge or upload)
client.merge_or_upload_documents([
    {"id": "1", "price": 899.99}  # Update price for existing doc
])

# Delete documents
client.delete_documents([{"id": "3"}])
```
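For long-running ingestion jobs it helps to observe per-action outcomes. The buffered sender accepts optional callbacks; the sketch below assumes the `on_error`/`on_progress` keyword arguments of `SearchIndexingBufferedSender` and should be checked against your installed `azure-search-documents` version.

```python
def log_error(action):
    # Called for each indexing action that still fails after the sender's retries.
    print(f"Indexing action failed: {action}")

def log_progress(action):
    # Called as actions are successfully submitted to the service.
    pass

with SearchIndexingBufferedSender(
    endpoint,
    index_name,
    credential,
    on_error=log_error,
    on_progress=log_progress,
    auto_flush_interval=30,  # seconds between automatic flushes
) as sender:
    sender.upload_documents(documents)
```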
---

## Skill Creation Guide

Skills are modular knowledge packages that extend AI agent capabilities. Each skill contains a `SKILL.md` with YAML frontmatter for triggering and a markdown body with instructions.

```bash
# Initialize a new skill using the skill-creator scripts
python scripts/init_skill.py my-new-skill --path .github/skills/

# This creates:
# .github/skills/my-new-skill/
# ├── SKILL.md      # Main skill file with frontmatter
# ├── scripts/      # Executable code for deterministic tasks
# ├── references/   # Documentation loaded on-demand
# └── assets/       # Files used in output (templates, etc.)
```

Example `SKILL.md` structure:

````markdown
---
name: my-api-skill
description: Clean code patterns for My API SDK. Use when building applications that need to interact with My Service, including authentication, data operations, and webhook handling.
---

# My API SDK Patterns

## Authentication

```python
from myapi import Client
from azure.identity import DefaultAzureCredential

client = Client(credential=DefaultAzureCredential())
```

## Common Operations

### Create Resource

```python
resource = client.resources.create(name="example", config={...})
```

## Best Practices

1. Always use async/await for I/O operations
2. Use context managers for proper cleanup
3. Handle rate limits with exponential backoff
````

```bash
# Package the skill for distribution
python scripts/package_skill.py .github/skills/my-new-skill/
# Creates: my-new-skill.skill (zip archive)
```

---

## Summary

Agent Skills provides a comprehensive toolkit for building AI-powered applications with Microsoft Azure services. The primary use cases include creating intelligent search applications with Azure AI Search (text, vector, and hybrid search), building Foundry agents connected to enterprise knowledge bases via MCP tools for grounded Q&A with citations, developing real-time voice AI applications with bidirectional audio streaming, and creating custom MCP servers that extend LLM capabilities with external service integrations.

The integration patterns follow consistent conventions: use `DefaultAzureCredential` for authentication, prefer async/await for I/O-bound operations, leverage context managers for resource cleanup, and use `create_or_update_*` methods for idempotent operations. Skills can be composed and extended by creating new `SKILL.md` files with YAML frontmatter for triggering and markdown documentation that loads only when activated, keeping context window usage low while still providing comprehensive guidance for complex AI development workflows.