# Apache Solr

Apache Solr is a blazing-fast, open-source, multi-modal search platform built on Apache Lucene. It powers full-text, vector, and geospatial search at many of the world's largest organizations, providing enterprise-grade search capabilities with features like distributed indexing, replication, load-balanced querying, and automated failover and recovery.

Solr provides a REST-like API for indexing and searching documents, with support for faceting, highlighting, spell-checking, and more. It can run in standalone mode for smaller deployments or in SolrCloud mode for distributed, highly available deployments. The platform includes SolrJ, a Java client library, and supports communication via HTTP with JSON, XML, or binary formats.

## REST API - Search Query

The `/select` endpoint handles search queries with support for full-text search, filtering, faceting, and highlighting.

```bash
# Basic search query
curl "http://localhost:8983/solr/myCollection/select?q=title:solr&rows=10"

# Search with filter query and faceting
curl "http://localhost:8983/solr/myCollection/select" \
  -d "q=*:*" \
  -d "fq=category:books" \
  -d "facet=true" \
  -d "facet.field=author" \
  -d "facet.field=genre" \
  -d "rows=20" \
  -d "start=0" \
  -d "fl=id,title,author,price" \
  -d "sort=price asc"

# JSON response example:
# {
#   "responseHeader": {"status": 0, "QTime": 5},
#   "response": {
#     "numFound": 125,
#     "start": 0,
#     "docs": [
#       {"id": "978-0641723445", "title": "The Lightning Thief", "author": "Rick Riordan", "price": 12.50}
#     ]
#   },
#   "facet_counts": {
#     "facet_fields": {
#       "author": ["Rick Riordan", 5, "Michael McCandless", 3]
#     }
#   }
# }
```

## REST API - Document Indexing

The `/update` endpoint accepts documents for indexing in JSON, XML, or CSV formats with optional commit parameters.

```bash
# Index a single document with JSON
curl -X POST "http://localhost:8983/solr/myCollection/update?commit=true" \
  -H "Content-Type: application/json" \
  -d '[
    {
      "id": "978-0641723445",
      "cat": ["book", "hardcover"],
      "name": "The Lightning Thief",
      "author": "Rick Riordan",
      "series_t": "Percy Jackson and the Olympians",
      "genre_s": "fantasy",
      "inStock": true,
      "price": 12.50,
      "pages_i": 384
    }
  ]'

# Index multiple documents with commitWithin (auto-commit within 1 second)
curl -X POST "http://localhost:8983/solr/myCollection/update?commitWithin=1000" \
  -H "Content-Type: application/json" \
  -d '[
    {"id": "doc1", "title": "First Document", "content": "Full text content here"},
    {"id": "doc2", "title": "Second Document", "content": "More searchable content"}
  ]'

# Delete documents by ID
curl -X POST "http://localhost:8983/solr/myCollection/update?commit=true" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"id": "978-0641723445"}}'

# Delete documents by query
curl -X POST "http://localhost:8983/solr/myCollection/update?commit=true" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"query": "category:obsolete"}}'
```

## REST API - JSON Query DSL

The JSON Request API provides a structured way to build complex queries with facets, filters, and sorting.

```bash
# Complex JSON query with facets and highlighting
# Highlighting has no dedicated JSON key, so it is passed through "params"
curl -X POST "http://localhost:8983/solr/myCollection/query" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "content:(search engine)",
    "filter": ["category:technology", "inStock:true"],
    "limit": 10,
    "offset": 0,
    "fields": ["id", "title", "author", "score"],
    "sort": "score desc, title asc",
    "facet": {
      "categories": {
        "type": "terms",
        "field": "category",
        "limit": 5
      },
      "price_ranges": {
        "type": "range",
        "field": "price",
        "start": 0,
        "end": 100,
        "gap": 20
      }
    },
    "params": {
      "hl": true,
      "hl.fl": "content",
      "hl.simple.pre": "<em>",
      "hl.simple.post": "</em>"
    }
  }'
```

## REST API - Collection Admin

The Collections API manages SolrCloud collections, including creation, deletion, and modification operations.

```bash
# Create a new collection with 3 shards and 2 replicas
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=myNewCollection&numShards=3&replicationFactor=2&collection.configName=_default"

# List all collections
curl "http://localhost:8983/solr/admin/collections?action=LIST"

# Get collection status
curl "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&collection=myCollection"

# Delete a collection
curl "http://localhost:8983/solr/admin/collections?action=DELETE&name=myOldCollection"

# Reload a collection (to pick up config changes)
curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=myCollection"

# Create an alias pointing to a collection
curl "http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=myAlias&collections=myCollection"
```

## SolrJ Client - HttpJdkSolrClient

HttpJdkSolrClient is a lightweight Java client that uses the built-in Java 11+ HTTP client to keep dependencies minimal.

```java
import java.util.concurrent.TimeUnit;

import org.apache.solr.client.solrj.impl.HttpJdkSolrClient;
// Note: in Solr 9.x and earlier, SolrQuery lives in org.apache.solr.client.solrj
import org.apache.solr.client.solrj.request.SolrQuery;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;
import org.apache.solr.common.SolrInputDocument;

// Create a client connected to a standalone Solr instance
try (HttpJdkSolrClient client = new HttpJdkSolrClient.Builder("http://localhost:8983/solr")
        .withDefaultCollection("myCollection")
        .withConnectionTimeout(5, TimeUnit.SECONDS)
        .withRequestTimeout(30, TimeUnit.SECONDS)
        .build()) {

    // Index a document
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "book-001");
    doc.addField("title", "Apache Solr Reference Guide");
    doc.addField("author", "Apache Solr Team");
    doc.addField("category", "technology");
    doc.addField("price", 49.99);
    client.add(doc);
    client.commit();

    // Execute a search query
    SolrQuery query = new SolrQuery("*:*");
    query.setRows(10);
    query.setStart(0);
    query.addFilterQuery("category:technology");
    query.setSort("price", SolrQuery.ORDER.asc);
    query.setFields("id", "title", "author", "price");

    QueryResponse response = client.query(query);
    SolrDocumentList results = response.getResults();
    System.out.println("Found " + results.getNumFound() + " documents");
    for (SolrDocument result : results) {
        System.out.println("ID: " + result.getFieldValue("id"));
        System.out.println("Title: " + result.getFieldValue("title"));
    }

    // Delete by ID
    client.deleteById("book-001");
    client.commit();
}
```

## SolrJ Client - CloudSolrClient

CloudSolrClient routes requests to the correct nodes in a SolrCloud cluster with automatic failover.

```java
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;
import org.apache.solr.client.solrj.request.SolrQuery;
import org.apache.solr.client.solrj.response.CollectionAdminResponse;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrInputDocument;

import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Connect to SolrCloud using ZooKeeper.
// The two-argument constructor takes ZooKeeper hosts plus an optional chroot;
// the single-argument constructor interprets the list as Solr URLs instead.
List<String> zkHosts = Arrays.asList("zk1:2181", "zk2:2181", "zk3:2181");
try (CloudSolrClient client = new CloudSolrClient.Builder(zkHosts, Optional.of("/solr"))
        .withDefaultCollection("myCollection")
        .build()) {

    // Create a new collection (2 shards, 2 replicas per shard)
    CollectionAdminRequest.Create createRequest =
        CollectionAdminRequest.createCollection("newCollection", "_default", 2, 2);
    CollectionAdminResponse createResponse = createRequest.process(client);

    // Index documents - automatically routed to correct shards
    SolrInputDocument doc1 = new SolrInputDocument();
    doc1.addField("id", "user-123");
    doc1.addField("name", "John Doe");
    doc1.addField("email", "john@example.com");

    SolrInputDocument doc2 = new SolrInputDocument();
    doc2.addField("id", "user-456");
    doc2.addField("name", "Jane Smith");
    doc2.addField("email", "jane@example.com");

    client.add(Arrays.asList(doc1, doc2));
    client.commit();

    // Query across all shards
    SolrQuery query = new SolrQuery("name:*");
    query.setRows(100);
    QueryResponse response = client.query(query);

    // Get document by ID (real-time get)
    SolrDocument doc = client.getById("user-123");
}

// Alternative: Connect using Solr URLs instead of ZooKeeper
try (CloudSolrClient client = new CloudSolrClient.Builder(
        Arrays.asList("http://solr1:8983/solr", "http://solr2:8983/solr"))
        .withDefaultCollection("myCollection")
        .build()) {
    // Use client...
}
```

## SolrJ Client - SolrQuery Builder

SolrQuery provides a fluent API for building complex search queries with facets, highlighting, and more.

```java
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.request.SolrQuery;
import org.apache.solr.client.solrj.response.FacetField;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;

import java.util.List;
import java.util.Map;

// Build a comprehensive search query
// (assumes `client` is an existing SolrClient with a default collection)
SolrQuery query = new SolrQuery();

// Main query and filters
query.setQuery("content:search AND content:engine");
query.addFilterQuery("type:article");
query.addFilterQuery("date:[2023-01-01T00:00:00Z TO NOW]");
query.addFilterQuery("-status:draft");

// Pagination and sorting (relevance is exposed as the "score" pseudo-field)
query.setStart(0);
query.setRows(25);
query.setSort("score", SolrQuery.ORDER.desc);
query.addSort("date", SolrQuery.ORDER.desc);

// Field selection
query.setFields("id", "title", "author", "date", "score");

// Faceting configuration
query.setFacet(true);
query.addFacetField("category");
query.addFacetField("author");
query.setFacetLimit(10);
query.setFacetMinCount(1);
// Half-open ranges so that boundary prices are counted only once
query.addFacetQuery("price:[0 TO 25}");
query.addFacetQuery("price:[25 TO 50}");
query.addFacetQuery("price:[50 TO *]");

// Highlighting configuration
query.setHighlight(true);
query.addHighlightField("content");
query.addHighlightField("title");
query.setHighlightSimplePre("<em>");
query.setHighlightSimplePost("</em>");
query.setHighlightSnippets(3);
query.setHighlightFragsize(150);

// Execute query
QueryResponse response = client.query(query);

// Process facet results
for (FacetField facet : response.getFacetFields()) {
    System.out.println("Facet: " + facet.getName());
    for (FacetField.Count count : facet.getValues()) {
        System.out.println("  " + count.getName() + ": " + count.getCount());
    }
}

// Access highlighting (document id -> field name -> snippets)
Map<String, Map<String, List<String>>> highlighting = response.getHighlighting();
for (SolrDocument doc : response.getResults()) {
    String id = (String) doc.getFieldValue("id");
    Map<String, List<String>> docHighlights = highlighting.get(id);
    if (docHighlights != null && docHighlights.containsKey("content")) {
        System.out.println("Highlighted: " + docHighlights.get("content").get(0));
    }
}
```

## SolrJ Client - JsonQueryRequest

JsonQueryRequest builds queries in the JSON Request API format, which suits complex structured queries.

```java
import org.apache.solr.client.solrj.request.json.JsonQueryRequest;
import org.apache.solr.client.solrj.request.json.TermsFacetMap;
import org.apache.solr.client.solrj.request.json.RangeFacetMap;
import org.apache.solr.client.solrj.response.QueryResponse;

import java.util.HashMap;
import java.util.Map;

// Build a JSON query request
// (assumes `client` is an existing SolrClient)
JsonQueryRequest jsonQuery = new JsonQueryRequest()
    .setQuery("*:*")
    .setLimit(20)
    .setOffset(0)
    .returnFields("id", "title", "author", "price", "score");

// Add filter queries
jsonQuery.withFilter("category:books");
jsonQuery.withFilter("inStock:true");

// Add terms facet
TermsFacetMap authorFacet = new TermsFacetMap("author")
    .setLimit(10)
    .setMinCount(1);
jsonQuery.withFacet("top_authors", authorFacet);

// Add range facet (field, start, end, gap)
RangeFacetMap priceFacet = new RangeFacetMap("price", 0, 100, 25);
jsonQuery.withFacet("price_ranges", priceFacet);

// Add nested sub-facet
Map<String, Object> categoryFacet = new HashMap<>();
categoryFacet.put("type", "terms");
categoryFacet.put("field", "category");
categoryFacet.put("limit", 5);

Map<String, Object> subFacet = new HashMap<>();
subFacet.put("type", "terms");
subFacet.put("field", "author");
subFacet.put("limit", 3);
categoryFacet.put("facet", Map.of("top_author_per_category", subFacet));

jsonQuery.withFacet("categories_with_authors", categoryFacet);

// Execute and process response
QueryResponse response = jsonQuery.process(client, "myCollection");
System.out.println("Found: " + response.getResults().getNumFound());
```

## CLI - Solr Commands

The `bin/solr` command-line interface manages Solr instances and collections and provides administrative operations.

```bash
# Start Solr in standalone mode
bin/solr start -p 8983

# Start Solr in SolrCloud mode with embedded ZooKeeper
bin/solr start -c -p 8983

# Start SolrCloud connecting to external ZooKeeper
bin/solr start -c -z zk1:2181,zk2:2181,zk3:2181/solr

# Start with specific memory settings
bin/solr start -m 4g

# Check Solr status
bin/solr status

# Stop Solr
bin/solr stop -p 8983
bin/solr stop -all

# Create a collection
bin/solr create -c myCollection -n _default -shards 2 -replicationFactor 2

# Create a core (standalone mode)
bin/solr create_core -c myCore -d _default

# Delete a collection or core
bin/solr delete -c myCollection

# Post documents to Solr
bin/solr post -c myCollection /path/to/documents.json
bin/solr post -c myCollection /path/to/data/*.xml
bin/solr post -c myCollection -filetypes json,xml /path/to/files/

# Export collection data
bin/solr export -c myCollection -query "*:*" -out /path/to/export.json

# Health check
bin/solr healthcheck -c myCollection -z localhost:2181

# Enable basic authentication
bin/solr auth enable -type basicAuth -credentials admin:password
```

## Docker Deployment

Solr provides official Docker images for containerized deployments with support for SolrCloud clustering.

```bash
# Run Solr standalone
docker run -d -p 8983:8983 --name solr solr:latest

# Run Solr with persistent data
docker run -d -p 8983:8983 -v solr_data:/var/solr --name solr solr:latest

# Create a core on startup
docker run -d -p 8983:8983 --name solr solr:latest solr-precreate mycore

# Run in demo mode (creates example collection)
docker run -d -p 8983:8983 --name solr solr:latest solr-demo
```

```yaml
# docker-compose.yml for SolrCloud with ZooKeeper
version: '3.8'
services:
  zookeeper:
    image: zookeeper:3.9
    ports:
      - "2181:2181"
    environment:
      ZOO_MY_ID: 1
      ZOO_SERVERS: server.1=zookeeper:2888:3888;2181

  solr1:
    image: solr:latest
    ports:
      - "8983:8983"
    environment:
      ZK_HOST: zookeeper:2181
    depends_on:
      - zookeeper
    volumes:
      - solr1_data:/var/solr

  solr2:
    image: solr:latest
    ports:
      - "8984:8983"
    environment:
      ZK_HOST: zookeeper:2181
    depends_on:
      - zookeeper
    volumes:
      - solr2_data:/var/solr

volumes:
  solr1_data:
  solr2_data:
```

## Schema Configuration

The managed-schema.xml file defines field types, fields, and dynamic field patterns for document indexing.

```xml
id
```

## solrconfig.xml Configuration

The solrconfig.xml file configures request handlers, caching, indexing settings, and other Solr behaviors.

```xml
10.0 100 native explicit _text_ 10 json true json 1024 true id 15000 10000 false 1000
```

Apache Solr is well suited to applications that require full-text search, faceted navigation, near-real-time indexing, and high availability. Common use cases include e-commerce product search, content management systems, log analytics, enterprise search portals, and any application where fast, relevant search results are critical.

The platform supports multiple integration patterns: direct REST API calls for simple applications, SolrJ for Java applications requiring type safety and connection pooling, and Docker/Kubernetes deployments for cloud-native architectures.
SolrCloud mode enables horizontal scaling through sharding, leader election, and distributed queries across the nodes of a cluster.
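
For REST clients that consume the raw JSON rather than SolrJ, note that the `facet_fields` entry in the search response example is a flat list alternating values and counts. The following is a minimal sketch, assuming the response has already been parsed into a `List` by a JSON library of your choice; the class and method names (`FacetPairs`, `toMap`) are hypothetical and not part of SolrJ.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FacetPairs {
    // Hypothetical helper (not part of SolrJ): converts a flat
    // [value, count, value, count, ...] facet list into an ordered map.
    public static Map<String, Long> toMap(List<?> flat) {
        if (flat.size() % 2 != 0) {
            throw new IllegalArgumentException("expected value/count pairs");
        }
        Map<String, Long> counts = new LinkedHashMap<>();
        for (int i = 0; i < flat.size(); i += 2) {
            String value = String.valueOf(flat.get(i));
            long count = ((Number) flat.get(i + 1)).longValue();
            counts.put(value, count);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample data copied from the "author" facet in the response example
        List<Object> author = List.of("Rick Riordan", 5, "Michael McCandless", 3);
        System.out.println(toMap(author)); // prints {Rick Riordan=5, Michael McCandless=3}
    }
}
```

Java applications using SolrJ do not need this step, since `QueryResponse.getFacetFields()` already exposes facet values and counts as objects.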