LangChain Community
https://github.com/langchain-ai/langchain-community
Community-maintained LangChain integrations
# LangChain Community

LangChain Community is a collection of third-party integrations that implement the base interfaces defined in LangChain Core. This package provides ready-to-use components for building LLM-powered applications, including chat models, language models, embeddings, vector stores, document loaders, retrievers, and tools. The library enables developers to connect their applications to numerous external services and APIs through a unified interface.

The package is designed to work seamlessly within the LangChain ecosystem, providing plug-and-play integrations with over 100 different providers, including cloud platforms (AWS, Azure, Google Cloud), AI services (OpenAI, Anthropic, Hugging Face), databases (PostgreSQL, MongoDB, Redis), and specialized tools (search engines, APIs, file systems). All components follow consistent patterns defined by LangChain Core, making it easy to swap implementations without changing application logic.

## Installation

```bash
pip install langchain-community
```

## Chat Models

Chat models provide conversational interfaces to various LLM providers, exposing a message-based API for multi-turn conversations.

```python
from langchain_community.chat_models import ChatOllama
from langchain_core.messages import HumanMessage, SystemMessage

# Initialize a chat model backed by Ollama
chat = ChatOllama(
    model="llama2",
    base_url="http://localhost:11434",
    temperature=0.7
)

# Create messages for the conversation
messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="What is the capital of France?")
]

# Get a response
response = chat.invoke(messages)
print(response.content)
# Output: The capital of France is Paris.

# Streaming response
for chunk in chat.stream(messages):
    print(chunk.content, end="", flush=True)
```

## Language Models (LLMs)

LLM classes provide direct access to text completion models with a "text in, text out" interface.
```python
from langchain_community.llms import Ollama, HuggingFaceEndpoint

# Using Ollama for a local LLM
llm = Ollama(
    model="llama2",
    base_url="http://localhost:11434",
    temperature=0.8,
    num_ctx=2048,
    num_predict=128
)

# Simple text generation
response = llm.invoke("Explain quantum computing in simple terms.")
print(response)

# Using a HuggingFace Inference Endpoint
hf_llm = HuggingFaceEndpoint(
    repo_id="meta-llama/Llama-2-7b-chat-hf",
    huggingfacehub_api_token="your-token-here",
    temperature=0.5,
    max_new_tokens=512
)
response = hf_llm.invoke("What are the benefits of renewable energy?")
print(response)
```

## Embeddings

Embedding models convert text into vector representations for similarity search and retrieval applications.

```python
from langchain_community.embeddings import (
    HuggingFaceEmbeddings,
    OllamaEmbeddings,
    OpenAIEmbeddings
)

# HuggingFace local embeddings
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    model_kwargs={"device": "cpu"},
    encode_kwargs={"normalize_embeddings": True}
)

# Embed a single text
text = "LangChain is a framework for building LLM applications."
vector = embeddings.embed_query(text)
print(f"Vector dimension: {len(vector)}")
# Output: Vector dimension: 384

# Embed multiple documents
documents = [
    "Machine learning is transforming industries.",
    "Natural language processing enables computers to understand text.",
    "Vector databases store embeddings efficiently."
]
vectors = embeddings.embed_documents(documents)
print(f"Embedded {len(vectors)} documents")

# Ollama embeddings
ollama_embeddings = OllamaEmbeddings(
    model="nomic-embed-text",
    base_url="http://localhost:11434"
)
vector = ollama_embeddings.embed_query("Hello world")
```

## Vector Stores

Vector stores enable storage and similarity search over embedded documents. The FAISS integration is one of the most popular choices for local vector storage.
```python
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_core.documents import Document

# Initialize embeddings
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

# Create documents
documents = [
    Document(page_content="Python is a programming language", metadata={"topic": "programming"}),
    Document(page_content="Machine learning uses algorithms to learn from data", metadata={"topic": "ml"}),
    Document(page_content="Deep learning is a subset of machine learning", metadata={"topic": "ml"}),
    Document(page_content="JavaScript runs in the browser", metadata={"topic": "programming"})
]

# Create a vector store from documents
vectorstore = FAISS.from_documents(documents, embeddings)

# Similarity search
results = vectorstore.similarity_search("What is deep learning?", k=2)
for doc in results:
    print(f"Content: {doc.page_content}")
    print(f"Metadata: {doc.metadata}\n")

# Similarity search with scores
results_with_scores = vectorstore.similarity_search_with_score("programming languages", k=2)
for doc, score in results_with_scores:
    print(f"Score: {score:.4f} - {doc.page_content}")

# Save and load the vector store
vectorstore.save_local("faiss_index")
loaded_vectorstore = FAISS.load_local(
    "faiss_index",
    embeddings,
    allow_dangerous_deserialization=True
)

# Search with a metadata filter
results = vectorstore.similarity_search(
    "algorithms",
    k=2,
    filter={"topic": "ml"}
)
```

## Document Loaders

Document loaders read data from various sources and convert it into Document objects for processing.
```python
from langchain_community.document_loaders import (
    TextLoader,
    PyPDFLoader,
    CSVLoader,
    DirectoryLoader,
    WebBaseLoader,
    JSONLoader
)

# Load a text file
text_loader = TextLoader("document.txt", encoding="utf-8", autodetect_encoding=True)
documents = text_loader.load()
print(f"Loaded {len(documents)} document(s)")

# Load a PDF file (one Document per page)
pdf_loader = PyPDFLoader("report.pdf")
pages = pdf_loader.load()
for page in pages:
    print(f"Page {page.metadata['page']}: {page.page_content[:100]}...")

# Load a CSV file (one Document per row)
csv_loader = CSVLoader(
    file_path="data.csv",
    csv_args={"delimiter": ",", "quotechar": '"'},
    source_column="source"
)
csv_docs = csv_loader.load()

# Load all matching files from a directory
dir_loader = DirectoryLoader(
    "./documents",
    glob="**/*.txt",
    loader_cls=TextLoader,
    show_progress=True
)
all_docs = dir_loader.load()

# Load a web page
web_loader = WebBaseLoader(web_paths=["https://example.com/article"])
web_docs = web_loader.load()

# Load JSON with a jq schema
json_loader = JSONLoader(
    file_path="data.json",
    jq_schema=".messages[]",
    content_key="content",
    metadata_func=lambda record, metadata: {**metadata, "author": record.get("author")}
)
json_docs = json_loader.load()
```

## Retrievers

Retrievers return documents given a text query, providing flexible retrieval strategies beyond simple vector similarity.
```python
from langchain_community.retrievers import (
    BM25Retriever,
    WikipediaRetriever,
    TavilySearchAPIRetriever
)
from langchain_core.documents import Document

# BM25 keyword-based retriever
documents = [
    Document(page_content="Python is great for data science"),
    Document(page_content="JavaScript is used for web development"),
    Document(page_content="Rust is known for memory safety"),
    Document(page_content="Python has excellent machine learning libraries")
]
bm25_retriever = BM25Retriever.from_documents(documents, k=2)
results = bm25_retriever.invoke("Python programming")
for doc in results:
    print(doc.page_content)

# Wikipedia retriever
wiki_retriever = WikipediaRetriever(
    top_k_results=3,
    lang="en",
    doc_content_chars_max=4000
)
wiki_docs = wiki_retriever.invoke("Artificial Intelligence history")
for doc in wiki_docs:
    print(f"Title: {doc.metadata.get('title')}")
    print(f"Summary: {doc.page_content[:200]}...\n")

# Tavily search retriever (requires an API key)
tavily_retriever = TavilySearchAPIRetriever(
    k=5,
    include_generated_answer=True,
    include_raw_content=False
)
search_results = tavily_retriever.invoke("latest AI developments 2024")
```

## Tools

Tools enable agents to interact with external services and APIs. Each tool has a description that helps agents decide when to use it.
```python
from langchain_community.tools import (
    DuckDuckGoSearchRun,
    WikipediaQueryRun,
    ShellTool,
    TavilySearchResults
)
from langchain_community.utilities import WikipediaAPIWrapper

# DuckDuckGo search tool
search = DuckDuckGoSearchRun()
result = search.invoke("LangChain framework features")
print(result)

# Wikipedia tool
wikipedia = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper(
    top_k_results=1,
    doc_content_chars_max=1000
))
wiki_result = wikipedia.invoke("Machine Learning")
print(wiki_result)

# Tavily search tool (requires the TAVILY_API_KEY env var)
tavily_tool = TavilySearchResults(
    max_results=5,
    include_answer=True,
    include_raw_content=True,
    search_depth="advanced"
)
tavily_results = tavily_tool.invoke({"query": "quantum computing breakthroughs"})
for result in tavily_results:
    print(f"URL: {result.get('url')}")
    print(f"Content: {result.get('content')[:200]}...\n")

# Shell tool (use with caution)
shell = ShellTool()
output = shell.invoke({"commands": ["echo 'Hello from shell'", "date"]})
print(output)
```

## File Management Tools

Tools for reading, writing, and managing files in the filesystem.

```python
from langchain_community.tools.file_management import (
    ReadFileTool,
    WriteFileTool,
    ListDirectoryTool,
    CopyFileTool,
    MoveFileTool,
    FileSearchTool
)

# Read a file
read_tool = ReadFileTool()
content = read_tool.invoke({"file_path": "example.txt"})
print(content)

# Write a file
write_tool = WriteFileTool()
write_tool.invoke({
    "file_path": "output.txt",
    "text": "This is the content to write."
})

# List directory contents
list_tool = ListDirectoryTool()
files = list_tool.invoke({"dir_path": "./documents"})
print(files)

# Search for files by pattern
search_tool = FileSearchTool()
matches = search_tool.invoke({
    "dir_path": "./",
    "pattern": "*.py"
})
print(matches)
```

## SQL Database Toolkit

The SQL Database Toolkit enables agents to query and interact with SQL databases.
```python
from langchain_community.agent_toolkits.sql.toolkit import SQLDatabaseToolkit
from langchain_community.utilities.sql_database import SQLDatabase
from langchain_openai import ChatOpenAI

# Connect to the database
db = SQLDatabase.from_uri("sqlite:///chinook.db")
print(f"Dialect: {db.dialect}")
print(f"Tables: {db.get_usable_table_names()}")

# Initialize the toolkit with an LLM for query checking
llm = ChatOpenAI(model="gpt-4", temperature=0)
toolkit = SQLDatabaseToolkit(db=db, llm=llm)

# Get the available tools
tools = toolkit.get_tools()
for tool in tools:
    print(f"Tool: {tool.name}")
    print(f"Description: {tool.description[:100]}...\n")

# Use with an agent
from langgraph.prebuilt import create_react_agent

agent_executor = create_react_agent(
    llm,
    toolkit.get_tools(),
    state_modifier="You are a SQL expert. Help users query the database."
)

# Query the agent
response = agent_executor.invoke({
    "messages": [("user", "What are the top 5 customers by total purchase amount?")]
})
print(response["messages"][-1].content)
```

## Gmail Toolkit

Tools for interacting with Gmail to search, read, and send emails.
```python
from langchain_community.agent_toolkits import GmailToolkit
from langchain_community.tools.gmail import (
    GmailSearch,
    GmailGetMessage,
    GmailSendMessage,
    GmailCreateDraft
)

# Initialize the toolkit (requires OAuth credentials)
toolkit = GmailToolkit()
tools = toolkit.get_tools()

# Search emails
search_tool = GmailSearch()
emails = search_tool.invoke({
    "query": "from:important@example.com after:2024/01/01",
    "max_results": 10
})

# Get a specific message
get_message_tool = GmailGetMessage()
message = get_message_tool.invoke({"message_id": "abc123xyz"})
print(f"Subject: {message['subject']}")
print(f"From: {message['sender']}")

# Create a draft
draft_tool = GmailCreateDraft()
draft_tool.invoke({
    "message": "Hello, this is a test email.",
    "to": ["recipient@example.com"],
    "subject": "Test Subject"
})

# Send a message
send_tool = GmailSendMessage()
send_tool.invoke({
    "message": "Hello, this is the email body.",
    "to": ["recipient@example.com"],
    "subject": "Important Update"
})
```

## Playwright Browser Toolkit

Tools for web browser automation using Playwright.
```python
import asyncio

from langchain_community.agent_toolkits import PlayWrightBrowserToolkit
from langchain_community.tools.playwright import (
    NavigateTool,
    ExtractTextTool,
    ExtractHyperlinksTool,
    ClickTool,
    CurrentWebPageTool
)
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        # Initialize the toolkit with an async browser
        browser = await p.chromium.launch(headless=True)
        toolkit = PlayWrightBrowserToolkit.from_browser(async_browser=browser)
        tools = toolkit.get_tools()

        # Navigate to a page
        navigate = NavigateTool.from_browser(async_browser=browser)
        await navigate.ainvoke({"url": "https://example.com"})

        # Extract text
        extract_text = ExtractTextTool.from_browser(async_browser=browser)
        text = await extract_text.ainvoke({})
        print(f"Page text: {text[:500]}...")

        # Get current page info
        current_page = CurrentWebPageTool.from_browser(async_browser=browser)
        page_info = await current_page.ainvoke({})
        print(f"Current URL: {page_info}")

        # Extract links
        extract_links = ExtractHyperlinksTool.from_browser(async_browser=browser)
        links = await extract_links.ainvoke({})
        print(f"Found {len(links)} links")

        await browser.close()

asyncio.run(main())
```

## Callbacks and Utilities

Utilities for working with callbacks, caching, and other supporting functionality.
```python
from langchain_community.callbacks import StreamlitCallbackHandler
from langchain_community.cache import InMemoryCache, SQLiteCache
from langchain_core.globals import set_llm_cache

# Set up in-memory caching for LLM calls
set_llm_cache(InMemoryCache())

# Or use SQLite for persistent caching
set_llm_cache(SQLiteCache(database_path=".langchain.db"))

# Streamlit callback for UI integration
import streamlit as st

st_callback = StreamlitCallbackHandler(st.container())

# Use in LLM calls
from langchain_community.llms import Ollama

llm = Ollama(model="llama2", callbacks=[st_callback])
response = llm.invoke("Explain neural networks")
```

## Creating Custom Tools

Build custom tools by extending BaseTool or using the @tool decorator.

```python
from langchain_core.tools import BaseTool, tool
from pydantic import BaseModel, Field
from typing import Type, Optional

# Using the @tool decorator
@tool
def calculate_area(length: float, width: float) -> float:
    """Calculate the area of a rectangle given length and width."""
    return length * width

result = calculate_area.invoke({"length": 5.0, "width": 3.0})
print(f"Area: {result}")
# Output: Area: 15.0

# Using the BaseTool class for more control
class WeatherInput(BaseModel):
    city: str = Field(description="The city to get weather for")
    units: str = Field(default="celsius", description="Temperature units: celsius or fahrenheit")

class WeatherTool(BaseTool):
    name: str = "get_weather"
    description: str = "Get current weather for a city"
    args_schema: Type[BaseModel] = WeatherInput

    def _run(self, city: str, units: str = "celsius") -> str:
        # Implement the actual weather API call here
        return f"Weather in {city}: 22 degrees {units}"

    async def _arun(self, city: str, units: str = "celsius") -> str:
        return self._run(city, units)

weather_tool = WeatherTool()
result = weather_tool.invoke({"city": "London", "units": "celsius"})
print(result)
# Output: Weather in London: 22 degrees celsius
```

## Summary

LangChain Community provides a comprehensive suite of integrations that enable developers to build sophisticated LLM-powered applications. The library's modular architecture allows for easy composition of components: combining document loaders with text splitters, embeddings with vector stores, and tools with agents to create end-to-end workflows.

Key use cases include building RAG (Retrieval-Augmented Generation) systems with document loaders and vector stores, creating intelligent agents with tool access, developing chatbots with various LLM backends, and automating tasks through API integrations.

Integration patterns typically involve initializing components with provider-specific credentials, composing them into chains or agents, and leveraging callbacks for monitoring and debugging. The consistent interfaces across all integrations mean that switching between providers (e.g., from OpenAI to Ollama for embeddings) requires minimal code changes.

For production deployments, consider using the dedicated partner packages (such as `langchain-openai` and `langchain-anthropic`), which receive better maintenance and more frequent updates than the community versions of popular integrations.
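The provider-swapping point above can be illustrated with a minimal, dependency-free sketch. The `HashEmbeddings` and `LengthEmbeddings` classes below are toy stand-ins, not real LangChain classes, but they expose the same `embed_query`/`embed_documents` methods as LangChain's `Embeddings` interface, so the retrieval helper written against that interface works unchanged with either backend:

```python
import math

class HashEmbeddings:
    """Toy backend: buckets words into a small fixed-size vector."""
    def __init__(self, dim: int = 8):
        self.dim = dim

    def embed_query(self, text: str) -> list[float]:
        vec = [0.0] * self.dim
        for word in text.lower().split():
            # Deterministic word hash so results are reproducible
            vec[sum(ord(c) for c in word) % self.dim] += 1.0
        return vec

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        return [self.embed_query(t) for t in texts]

class LengthEmbeddings:
    """A second toy backend exposing the same two methods."""
    def embed_query(self, text: str) -> list[float]:
        words = text.split()
        return [float(len(words)), float(sum(len(w) for w in words))]

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        return [self.embed_query(t) for t in texts]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_match(embeddings, docs: list[str], query: str) -> str:
    """Retrieval code written once against the shared interface."""
    doc_vecs = embeddings.embed_documents(docs)
    q_vec = embeddings.embed_query(query)
    best = max(range(len(docs)), key=lambda i: cosine(doc_vecs[i], q_vec))
    return docs[best]

docs = [
    "python machine learning",
    "javascript web development",
]
# Swapping the backend requires no change to top_match
for backend in (HashEmbeddings(), LengthEmbeddings()):
    print(top_match(backend, docs, "python machine learning"))
    # Both backends print: python machine learning
```

The same duck-typing is what lets real application code move from, say, `OpenAIEmbeddings` to `OllamaEmbeddings` by changing only the constructor call.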