### Workspace layout example Source: https://github.com/opensearch-project/sql/blob/main/DEVELOPER_GUIDE.rst Example of the directory structure after cloning the project and setting up OpenSearch and OpenSearch Dashboards. ```sh $ mkdir opensearch $ cd opensearch $ ls -la total 32 drwxr-xr-x 7 user group^users 4096 Nov 21 12:59 . drwxr-xr-x 19 user group^users 4096 Nov 21 09:44 .. drwxr-xr-x 10 user group^users 4096 Nov 8 12:16 opensearch drwxr-xr-x 14 user group^users 4096 Nov 8 12:14 opensearch-dashboards drwxr-xr-x 16 user group^users 4096 Nov 15 10:59 sql ``` -------------------------------- ### Build Core Module Source: https://github.com/opensearch-project/sql/blob/main/DEVELOPER_GUIDE.rst Example command to build only the core module. ```bash ./gradlew :core:build ``` -------------------------------- ### Running Individual Integration Tests Source: https://github.com/opensearch-project/sql/blob/main/DEVELOPER_GUIDE.rst Example command to run a specific integration test class. ```bash ./gradlew :integ-test:integTest -Dtests.class="*QueryIT" ``` -------------------------------- ### Running Task for Specific Module Source: https://github.com/opensearch-project/sql/blob/main/DEVELOPER_GUIDE.rst Example command to run a task for a specific module. ```bash ./gradlew ::task ``` -------------------------------- ### Basic PPL query with profile enabled Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/interfaces/endpoint.md This example shows how to run a PPL query and enable profiling to get detailed performance metrics. ```bash curl -sS -H 'Content-Type: application/json' \ -X POST localhost:9200/_plugins/_ppl \ -d '{ "profile": true, "query" : "source=accounts | fields firstname, lastname" }' ``` -------------------------------- ### Start and end range Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/cmd/bin.md This example demonstrates how to specify the start and end of the range for binning using the `start` and `end` parameters. The data will be binned within the range of 0 to 101. ```ppl source=accounts | bin age start=0 end=101 | fields age | head 1 ``` -------------------------------- ### Setting JAVA_HOME environment variable Source: https://github.com/opensearch-project/sql/blob/main/DEVELOPER_GUIDE.rst Example of how to check and set the JAVA_HOME environment variable. ```sh $ echo $JAVA_HOME /Library/Java/JavaVirtualMachines/jdk-21.jdk/Contents/Home $ java -version openjdk version "21.0.8" 2024-07-16 LTS OpenJDK Runtime Environment (build 21.0.8+13-LTS) OpenJDK 64-Bit Server VM (build 21.0.8+13-LTS, mixed mode, sharing) ``` -------------------------------- ### Explain query with default (standard) format Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/interfaces/endpoint.md Example of sending a POST request to the _plugins/_ppl/_explain endpoint to get the query execution plan in the default standard format. ```bash curl -sS -H 'Content-Type: application/json' \ -X POST localhost:9200/_plugins/_ppl/_explain \ -d '{"query" : "source=state_country | where age>30"}' ``` -------------------------------- ### Syntax Examples Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/cmd/multisearch.md Examples of the `multisearch` command syntax. ```syntax | multisearch [search source=table | where condition1] [search source=table | where condition2] | multisearch [search source=index1 | fields field1, field2] [search source=index2 | fields field1, field2] | multisearch [search source=table | where status="success"] [search source=table | where status="error"] ``` -------------------------------- ### CURTIME Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/dql/functions.rst Example of using the CURTIME() function to get the current time. ```sql SELECT CURTIME() as value_1, CURTIME() as value_2 ``` -------------------------------- ### CURDATE Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/dql/functions.rst Example of using the CURDATE() function to get the current date. ```sql SELECT CURDATE() ``` -------------------------------- ### Example 1: Response time distribution from logs Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/cmd/bin.md This PPL query demonstrates how to use the bin command to analyze response time distribution from logs. ```ppl source=otellogs | rex field=body "(?\d+)ms" | bin duration span=100 | stats count() as request_count by duration | sort duration ``` -------------------------------- ### Datasource Read GET API Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/admin/datasources.md Example of a GET request to retrieve the configuration of a specific datasource. ```bash GET https://localhost:9200/_plugins/_query/_datasources/my_prometheus content-type: application/json Authorization: Basic {{username}} {{password}} ``` -------------------------------- ### LENGTH Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/functions/string.md Example demonstrating the LENGTH function to get the length of a string. ```ppl source=people | eval `LENGTH('helloworld')` = LENGTH('helloworld') | fields `LENGTH('helloworld')` ``` -------------------------------- ### Skipping Prometheus Setup in Tests Source: https://github.com/opensearch-project/sql/blob/main/DEVELOPER_GUIDE.rst Command to bypass Prometheus setup and exclude Prometheus-specific tests. ```bash ./gradlew :integ-test:integTest -DignorePrometheus ``` -------------------------------- ### LENGTH Function Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/dql/functions.rst Example demonstrating the LENGTH function to get the length of a string in bytes. ```sql SELECT LENGTH('helloworld') ``` -------------------------------- ### Explaining a PPL query in the v3 engine Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/cmd/explain.md This example demonstrates using the explain command when Apache Calcite is enabled, obtaining the logical and physical plan and pushdown information from the v3 engine. ```ppl explain source=state_country | where country = 'USA' OR country = 'England' | stats count() by country ``` ```json { "calcite": { "logical": "LogicalProject(count()=[$1], country=[$0])\n LogicalAggregate(group=[{1}], count()=[COUNT()])\n LogicalFilter(condition=[SEARCH($1, Sarg['England', 'USA':CHAR(7)]:CHAR(7))])\n CalciteLogicalIndexScan(table=[[OpenSearch, state_country]]) ", "physical": "EnumerableCalc(expr#0..1=[{inputs}], count()=[$t1], country=[$t0])\n CalciteEnumerableIndexScan(table=[[OpenSearch, state_country]], PushDownContext=[[FILTER->SEARCH($1, Sarg['England', 'USA':CHAR(7)]:CHAR(7)), AGGREGATION->rel#53:LogicalAggregate.NONE.[](input=RelSubset#43,group={1},count()=COUNT())], OpenSearchRequestBuilder(sourceBuilder={\"from\":0,\"size\":0,\"timeout\":\"1m\",\"query\":{\"terms\":{\"country\":[\"England\",\"USA\"],\"boost\":1.0}},\"aggregations\":{\"composite_buckets\":{\"composite\":{\"size\":1000,\"sources\":[{\"country\":{\"terms\":{\"field\":\"country\",\"missing_bucket\":true,\"missing_order\":\"first\",\"order\":\"asc\"}}}]},\"aggregations\":{\"count()\":{\"value_count\":{\"field\":\"_index\"}}}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])\n" } } ``` -------------------------------- ### MONTHNAME Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/dql/functions.rst Example of using the MONTHNAME function to get the full name of the month from a date. ```sql os> SELECT MONTHNAME(DATE('2020-08-26')) fetched rows / total rows = 1/1 +-------------------------------+ | MONTHNAME(DATE('2020-08-26')) | |-------------------------------| | August | +-------------------------------+ ``` -------------------------------- ### Run Unit Tests Source: https://github.com/opensearch-project/sql/blob/main/DEVELOPER_GUIDE.rst Run all unit tests. ```bash ./gradlew test ``` -------------------------------- ### MINUTE_OF_DAY Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/dql/functions.rst Example of using the MINUTE_OF_DAY function to get the minute within a 24-hour day. ```sql os> SELECT MINUTE_OF_DAY((TIME '01:02:03')) fetched rows / total rows = 1/1 +----------------------------------+ | MINUTE_OF_DAY((TIME '01:02:03')) | |----------------------------------| | 62 | +----------------------------------+ ``` -------------------------------- ### Run PITest Source: https://github.com/opensearch-project/sql/blob/main/DEVELOPER_GUIDE.rst Run PITest. ```bash ./gradlew pitest ``` -------------------------------- ### Syntax Examples Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/cmd/streamstats.md Examples demonstrating the various ways to use the streamstats command, including parameters like current, window, global, reset_before, reset_after, and the by-clause. ```ppl source = table | streamstats avg(a) ``` ```ppl source = table | streamstats current = false avg(a) ``` ```ppl source = table | streamstats window = 5 sum(b) ``` ```ppl source = table | streamstats current = false window = 2 max(a) ``` ```ppl source = table | where a < 50 | streamstats count(c) ``` ```ppl source = table | streamstats min(c), max(c) by b ``` ```ppl source = table | streamstats count(c) as count_by by b | where count_by > 1000 ``` ```ppl source = table | streamstats dc(field) as distinct_count ``` ```ppl source = table | streamstats distinct_count(category) by region ``` ```ppl source = table | streamstats current=false window=2 global=false avg(a) by b ``` ```ppl source = table | streamstats window=2 reset_before=a>31 avg(b) ``` ```ppl source = table | streamstats current=false reset_after=a>31 avg(b) by c ``` -------------------------------- ### Lookup command syntax examples Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/cmd/lookup.md Examples demonstrating the various ways to use the lookup command syntax, including mapping fields and specifying replacement strategies. ```syntax source = table1 | lookup table2 id ``` ```syntax source = table1 | lookup table2 id, name ``` ```syntax source = table1 | lookup table2 id as cid, name ``` ```syntax source = table1 | lookup table2 id as cid, name replace dept as department ``` ```syntax source = table1 | lookup table2 id as cid, name replace dept as department, city as location ``` ```syntax source = table1 | lookup table2 id as cid, name append dept as department ``` ```syntax source = table1 | lookup table2 id as cid, name append dept as department, city as location ``` ```syntax source = table1 | lookup table2 id as cid, name output dept as department ``` ```syntax source = table1 | lookup table2 id as cid, name output dept as department, city as location ``` -------------------------------- ### Multiple start values Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/cmd/graphlookup.md Example of using graphLookup with multiple literal start values. ```ppl graphLookup employees start='Eliot', 'Andrew' edge=reportsTo-->name as reportingHierarchy ``` -------------------------------- ### Single start value Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/cmd/graphlookup.md Example of using graphLookup with a single literal start value. ```ppl graphLookup employees start='Eliot' edge=reportsTo-->name as reportingHierarchy ``` -------------------------------- ### Running Spotless Check and Apply Source: https://github.com/opensearch-project/sql/blob/main/DEVELOPER_GUIDE.rst Commands to check and apply code style using Spotless. ```bash ./gradlew spotlessCheck Runs Spotless to check for code style. ./gradlew spotlessApply Automatically apply spotless code style changes. ``` -------------------------------- ### MONTH and MONTH_OF_YEAR Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/dql/functions.rst Example of using the MONTH and MONTH_OF_YEAR functions to get the month number from a date. ```sql os> SELECT MONTH(DATE('2020-08-26')) fetched rows / total rows = 1/1 +---------------------------+ | MONTH(DATE('2020-08-26')) | |---------------------------| | 8 | +---------------------------+ os> SELECT MONTH_OF_YEAR(DATE('2020-08-26')) fetched rows / total rows = 1/1 +-----------------------------------+ | MONTH_OF_YEAR(DATE('2020-08-26')) | |-----------------------------------| | 8 | +-----------------------------------+ ``` -------------------------------- ### MINUTE and MINUTE_OF_HOUR Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/dql/functions.rst Example of using the MINUTE and MINUTE_OF_HOUR functions to get the minute value from a time. ```sql os> SELECT MINUTE(time('01:02:03')), MINUTE_OF_HOUR(time('01:02:03')) fetched rows / total rows = 1/1 +--------------------------+----------------------------------+ | MINUTE(time('01:02:03')) | MINUTE_OF_HOUR(time('01:02:03')) | |--------------------------|----------------------------------| | 2 | 2 | +--------------------------+ ``` -------------------------------- ### Log volume distribution with bins parameter Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/cmd/bin.md This example illustrates how to use the `bin` command with the `bins` parameter to create a log volume distribution. It first calculates the volume and then bins it into 4 buckets. ```ppl source=otellogs | stats count() as volume by `resource.attributes.service.name` | bin volume bins=4 | stats count() as service_count by volume | sort volume ``` -------------------------------- ### NOW Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/functions/datetime.md Example of using the NOW function to get the current date and time. ```ppl source=people | eval `value_1` = NOW(), `value_2` = NOW() | fields `value_1`, `value_2` ``` -------------------------------- ### Describe Tables Like Source: https://github.com/opensearch-project/sql/blob/main/docs/user/dql/metadata.rst This example shows how to use the DESCRIBE command to list columns in the 'accounts' table that match the pattern '%name'. ```sql os> DESCRIBE TABLES LIKE "accounts" COLUMNS LIKE "%name" ``` -------------------------------- ### LOCATE Function Examples Source: https://github.com/opensearch-project/sql/blob/main/integ-test/src/test/resources/correctness/expressions/text_functions.txt Examples of the LOCATE function, finding the position of a substring within a string, with an optional starting position. ```sql LOCATE('world', 'helloworld') as column LOCATE('world', 'hello') as column LOCATE('world', 'helloworld', 7) as column ``` -------------------------------- ### Explaining a PPL query in the simple mode Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/cmd/explain.md This example demonstrates using the explain command in the `simple` mode to display a simplified logical plan tree. ```ppl explain simple source=state_country | where country = 'USA' OR country = 'England' | stats count() by country ``` ```json { "calcite": { "logical": "LogicalProject\n LogicalAggregate\n LogicalFilter\n CalciteLogicalIndexScan\n" } } ``` -------------------------------- ### ASCII Function Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/dql/functions.rst Example demonstrating the ASCII function to get the numeric value of the leftmost character of a string. ```sql SELECT ASCII('hello') ``` -------------------------------- ### Example 4 YAML format (experimental) Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/interfaces/endpoint.md Explain query in YAML format. ```bash curl -sS -H 'Content-Type: application/json' \ -X POST localhost:9200/_plugins/_ppl/_explain?format=yaml \ -d '{"query" : "source=state_country | where age>30"}' ``` ```yaml calcite: logical: | LogicalSystemLimit(fetch=[10000], type=[QUERY_SIZE_LIMIT]) LogicalProject(name=[$0], country=[$1], state=[$2], month=[$3], year=[$4], age=[$5]) LogicalFilter(condition=[>($5, 30)]) CalciteLogicalIndexScan(table=[[OpenSearch, state_country]]) physical: | CalciteEnumerableIndexScan(table=[[OpenSearch, state_country]], PushDownContext=[[PROJECT->[name, country, state, month, year, age], FILTER->>($5, 30), LIMIT->10000], OpenSearchRequestBuilder(sourceBuilder={"from":0,"size":10000,"timeout":"1m","query":{"range":{"age":{"from":30,"to":null,"include_lower":false,"include_upper":true,"boost":1.0}}},"_source":{"includes":["name","country","state","month","year","age"]}}, requestedTotalSize=10000, pageSize=null, startFrom=0)]) ``` -------------------------------- ### Explaining a PPL query in the v2 engine Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/cmd/explain.md This example shows how to use the explain command when Apache Calcite is disabled, obtaining the physical plan and pushdown information from the v2 engine. ```ppl explain source=state_country | where country = 'USA' OR country = 'England' | stats count() by country ``` ```json { "root": { "name": "ProjectOperator", "description": { "fields": "[count(), country]" }, "children": [ { "name": "OpenSearchIndexScan", "description": { "request": "OpenSearchQueryRequest(indexName=state_country, sourceBuilder={\"from\":0,\"size\":10000,\"timeout\":\"1m\",\"query\":{\"bool\":{\"should\":[{\"term\":{\"country\":{\"value\":\"USA\",\"boost\":1.0}}},{\"term\":{\"country\":{\"value\":\"England\",\"boost\":1.0}}}],\"adjust_pure_negative\":true,\"boost\":1.0}},\"aggregations\":{\"composite_buckets\":{\"composite\":{\"size\":1000,\"sources\":[{\"country\":{\"terms\":{\"field\":\"country\",\"missing_bucket\":true,\"missing_order\":\"first\",\"order\":\"asc\"}}}]},\"aggregations\":{\"count()\":{\"value_count\":{\"field\":\"_index\"}}}}}}, pitId=null, cursorKeepAlive=null, searchAfter=null, searchResponse=null)" }, "children": [] } ] } } ``` -------------------------------- ### CURDATE() Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/functions/datetime.md Example of using the CURDATE() function to get the current date. ```ppl source=people | eval `CURDATE()` = CURDATE() | fields `CURDATE()` ``` -------------------------------- ### QUARTER Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/functions/datetime.md Example of using the QUARTER function to get the quarter of the year from a date. ```ppl source=people | eval `QUARTER(DATE('2020-08-26'))` = QUARTER(DATE('2020-08-26')) | fields `QUARTER(DATE('2020-08-26'))` ``` -------------------------------- ### Example 1 Results Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/cmd/bin.md The query returns the following results for response time distribution. ```text fetched rows / total rows = 4/4 +---------------+-------------+ | request_count | duration | |---------------+-------------| | 17 | null | | 1 | 0-100 | | 1 | 30000-30100 | | 1 | 3200-3300 | +---------------+-------------+ ``` -------------------------------- ### Clean and Rebuild Project Source: https://github.com/opensearch-project/sql/blob/main/DEVELOPER_GUIDE.rst Commands to clean the project and then rebuild it. ```bash $ ./gradlew clean $ ./gradlew build ``` -------------------------------- ### Fetching at Offset Source: https://github.com/opensearch-project/sql/blob/main/docs/user/dql/basics.rst Shows how to use `LIMIT` with an offset to start fetching results from a specific position, useful for pagination. ```json POST /_plugins/_sql { "query" : """ SELECT account_number FROM accounts ORDER BY account_number LIMIT 1, 1 """ } ``` ```osql >od SELECT age FROM accounts ORDER BY age LIMIT 2 OFFSET 1 ``` -------------------------------- ### LOCATE Function Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/dql/functions.rst Example demonstrating the LOCATE function to find the position of a substring within a string, with and without a starting position. ```sql SELECT LOCATE('world', 'helloworld'), LOCATE('world', 'helloworldworld', 7) ``` -------------------------------- ### YEARWEEK Function Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/dql/functions.rst Example demonstrating the YEARWEEK function to get the year and week from a date, including an optional mode argument. ```sql SELECT YEARWEEK('2020-08-26'), YEARWEEK('2019-01-05', 0) ``` -------------------------------- ### Describe Tables Like Source: https://github.com/opensearch-project/sql/blob/main/docs/user/dql/metadata.rst This example shows how to describe the fields for indices that match the pattern 'accounts'. ```sql os> DESCRIBE TABLES LIKE 'accounts' fetched rows / total rows = 11/11 +----------------+-------------+------------+----------------+-----------+-----------+-------------+---------------+----------------+----------------+----------+---------+------------+---------------+------------------+-------------------+------------------+-------------+---------------+--------------+-------------+------------------+------------------+--------------------+ | TABLE_CAT | TABLE_SCHEM | TABLE_NAME | COLUMN_NAME | DATA_TYPE | TYPE_NAME | COLUMN_SIZE | BUFFER_LENGTH | DECIMAL_DIGITS | NUM_PREC_RADIX | NULLABLE | REMARKS | COLUMN_DEF | SQL_DATA_TYPE | SQL_DATETIME_SUB | CHAR_OCTET_LENGTH | ORDINAL_POSITION | IS_NULLABLE | SCOPE_CATALOG | SCOPE_SCHEMA | SCOPE_TABLE | SOURCE_DATA_TYPE | IS_AUTOINCREMENT | IS_GENERATEDCOLUMN | |----------------+-------------+------------+----------------+-----------+-----------+-------------+---------------+----------------+----------------+----------+---------+------------+---------------+------------------+-------------------+------------------+-------------+---------------+--------------+-------------+------------------+------------------+--------------------| | docTestCluster | null | accounts | account_number | null | long | null | null | null | 10 | 2 | null | null | null | null | null | 0 | | null | null | null | null | NO | | | docTestCluster | null | accounts | firstname | null | text | null | null | null | 10 | 2 | null | null | null | null | null | 1 | | null | null | null | null | NO | | | docTestCluster | null | accounts | address | null | text | null | null | null | 10 | 2 | null | null | null | null | null | 2 | | null | null | null | null | NO | | | docTestCluster | null | accounts | balance | null | long | null | null | null | 10 | 2 | null | null | null | null | null | 3 | | null | null | null | null | NO | | ``` -------------------------------- ### TO_DAYS Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/functions/datetime.md Example demonstrating the usage of the TO_DAYS function to get the day number of a given date. ```ppl source=people | eval `TO_DAYS(DATE('2008-10-07'))` = TO_DAYS(DATE('2008-10-07')) | fields `TO_DAYS(DATE('2008-10-07'))` ``` -------------------------------- ### Complete PPL Query Workflow Example Source: https://github.com/opensearch-project/sql/blob/main/api/README.md A comprehensive example demonstrating the full workflow from creating a context to planning, transpiling, and executing a PPL query. ```java // Step 1: Create reusable context (shared across all components) try (UnifiedQueryContext context = UnifiedQueryContext.builder() .language(QueryType.PPL) .catalog("catalog", schema) .defaultNamespace("catalog") .build()) { // Step 2: Create planner with context UnifiedQueryPlanner planner = new UnifiedQueryPlanner(context); // Step 3: Plan PPL query into logical plan RelNode plan = planner.plan("source = employees | where age > 30"); // Option A: Transpile to target SQL UnifiedQueryTranspiler transpiler = UnifiedQueryTranspiler.builder() .dialect(SparkSqlDialect.DEFAULT) .build(); String sparkSql = transpiler.toSql(plan); // Result: SELECT * FROM `catalog`.`employees` WHERE `age` > 30 // Option B: Compile and execute directly UnifiedQueryCompiler compiler = new UnifiedQueryCompiler(context); try (PreparedStatement statement = compiler.compile(plan)) { ResultSet rs = statement.executeQuery(); while (rs.next()) { // Process results with standard JDBC } } } ``` -------------------------------- ### LOCATE Function Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/functions/string.md Example demonstrating the usage of the LOCATE function to find the position of a substring within a string, with an optional start position. ```ppl source=people | eval `LOCATE('world', 'helloworld')` = LOCATE('world', 'helloworld'), `LOCATE('invalid', 'helloworld')` = LOCATE('invalid', 'helloworld'), `LOCATE('world', 'helloworld', 6)` = LOCATE('world', 'helloworld', 6) | fields `LOCATE('world', 'helloworld')`, `LOCATE('invalid', 'helloworld')`, `LOCATE('world', 'helloworld', 6)` ``` -------------------------------- ### Example Join Query for Context Rebuild Source: https://github.com/opensearch-project/sql/blob/main/docs/dev/opensearch-pagination.md An example of a join query that illustrates the context rebuild strategy. The 'fetch_size' parameter is shown, and the explanation discusses how the current implementation scrolls both tables block-by-block. ```json { "query": "SELECT a.name, b.name FROM A as a JOIN B as B on a.name = b.name" "fetch_size": 10 } ``` -------------------------------- ### Build All and Pass Checks Source: https://github.com/opensearch-project/sql/blob/main/DEVELOPER_GUIDE.rst Run the Gradle build, which includes all checks and tests. ```bash ./gradlew build ``` -------------------------------- ### LAST function example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/functions/aggregations.md Example of using the LAST aggregation function to get the last non-null firstname grouped by gender. ```ppl source=accounts | stats last(firstname) by gender ``` -------------------------------- ### Basic minspan Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/cmd/bin.md This example demonstrates the basic usage of the `minspan` parameter in the `bin` command. The `minspan=5` parameter ensures that each bin has a minimum span of 5. ```ppl source=accounts | bin age minspan=5 | fields age, account_number | head 3 ``` -------------------------------- ### Binning with string fields Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/cmd/bin.md This example illustrates how to bin data based on a string field after casting a numerical field to a string and specifying the number of bins. ```ppl source=accounts | eval age_str = CAST(age AS STRING) | bin age_str bins=3 | stats count() by age_str | sort age_str ``` -------------------------------- ### LAST_DAY function example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/functions/datetime.md Example demonstrating the usage of the LAST_DAY function to get the last day of the month for a given date. ```ppl source=people | eval `last_day('2023-02-06')` = last_day('2023-02-06') | fields `last_day('2023-02-06')` ``` -------------------------------- ### Specifying null value handling with usenull=true Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/cmd/top.md This example demonstrates how to use the `top` command to include null values by setting `usenull=true`. It shows the PPL query and the resulting output, including a row for 'null'. ```ppl source=otellogs | top usenull=true instrumentationScope.name ``` ```text fetched rows / total rows = 4/4 +-----------------------------------------------------------------------------+-------+ | instrumentationScope.name | count | |-----------------------------------------------------------------------------+-------| | null | 16 | | @opentelemetry/instrumentation-http | 2 | | Microsoft.Extensions.Hosting | 1 | | go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc | 1 | +-----------------------------------------------------------------------------+-------+ ``` -------------------------------- ### Sample test class for Doc Generator Source: https://github.com/opensearch-project/sql/blob/main/DEVELOPER_GUIDE.rst An example Java test class demonstrating how to use the @DocTestConfig annotation and the Section, description, and example methods to generate documentation. ```java @DocTestConfig(template = "interfaces/protocol.rst", testData = {"accounts.json"}) public class ProtocolIT extends DocTest { @Section(1) public void test() { section( title("A New Section"), description( "Describe what is the use of new functionality." ), example( description("Describe what is the use case of this example to show"), post("SELECT ...") ) ); } } ``` -------------------------------- ### Block Comments Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/general/comments.md Demonstrates a block comment in PPL, which starts with /* and ends with */. ```ppl source=accounts | dedup 2 gender /* dedup the document with gender field keep 2 duplication */ | fields account_number, gender ``` ```text fetched rows / total rows = 3/3 +----------------+--------+ | account_number | gender | |----------------+--------| | 13 | F | | 1 | M | | 6 | M | +----------------+--------+ ``` -------------------------------- ### LOCALTIMESTAMP Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/dql/functions.rst Illustrates the usage of the LOCALTIMESTAMP function to get the current timestamp. ```sql SELECT LOCALTIMESTAMP() ``` -------------------------------- ### Run all benchmarks Source: https://github.com/opensearch-project/sql/blob/main/benchmarks/README.md Command to execute all microbenchmarks from the project root directory. ```bash ./gradlew :benchmarks:jmh ``` -------------------------------- ### Run REST Integration Tests Source: https://github.com/opensearch-project/sql/blob/main/DEVELOPER_GUIDE.rst Run REST integration tests. ```bash ./gradlew :integ-test:yamlRestTest ``` -------------------------------- ### Example 2: Severity level distribution Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/cmd/bin.md This PPL query demonstrates how to use the bin command to analyze severity level distribution. ```ppl source=otellogs | bin severityNumber span=5 | stats count() as log_count by severityNumber | sort severityNumber ``` -------------------------------- ### WEEKDAY() Function Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/dql/functions.rst Shows how to use the weekday() function to get the weekday index. ```sql SELECT weekday('2020-08-26'), weekday('2020-08-27') -- Result: -- +-----------------------+-----------------------+ -- | weekday('2020-08-26') | weekday('2020-08-27') | -- |-----------------------+-----------------------| -- | 2 | 3 | -- +-----------------------+-----------------------+ ``` -------------------------------- ### Explain query with simple mode Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/interfaces/endpoint.md Example of sending a POST request to the _plugins/_ppl/_explain endpoint with the 'simple' mode parameter. ```bash curl -sS -H 'Content-Type: application/json' \ -X POST localhost:9200/_plugins/_ppl/_explain?mode=simple \ -d '{"query" : "source=state_country | where age>30"}' ``` -------------------------------- ### DAYNAME Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/dql/functions.rst Shows how to get the name of the weekday for a given date using DAYNAME. ```sql SELECT DAYNAME(DATE('2020-08-26')) fetched rows / total rows = 1/1 +-----------------------------+ | DAYNAME(DATE('2020-08-26')) | |-----------------------------| | Wednesday | +-----------------------------+ ``` -------------------------------- ### Build with Skipped Integration Tests Source: https://github.com/opensearch-project/sql/blob/main/DEVELOPER_GUIDE.rst Build locally faster by skipping integration tests. ```bash ./gradlew build -x integTest ``` -------------------------------- ### Basic bins parameter Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/cmd/bin.md This example demonstrates the basic usage of the `bins` parameter in the `bin` command. The `bins=5` parameter divides the data into 5 equal-sized bins. ```ppl source=time_test | bin value bins=5 | fields value | head 3 ``` -------------------------------- ### Cursor API Request and Response Examples Source: https://github.com/opensearch-project/sql/blob/main/docs/dev/opensearch-pagination.md This section provides examples of the request and response API for cursor queries, demonstrating how to create a cursor, fetch the next page, and clear the cursor state. ```json # 1.Creates a cursor POST _plugins/_sql?format=jdbc { "query": "SELECT * FROM accounts", "fetch_size": 5 } # Response { "status": 200, "total": 100, "size": 5, "schema": [...], "datarows": [...], "cursor": "cursorId" } # 2.Fetch next page by cursor provided in previous response POST _plugins/_sql?format=jdbc { "cursor": "cursorId" } # Response { "datarows": [...], "cursor": "cursorId" } # No cursor in the last page { "datarows": [...] } # 4.Clear the state forcibly earlier than last page be reached POST _plugins/_sql/close { "cursor": "cursorId" } # Response { "succeeded" : true } ``` -------------------------------- ### Line Comments Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/general/comments.md Demonstrates a line comment in PPL, which starts with // and ends at the new line. ```ppl source=accounts | top gender // finds most common gender of all the accounts ``` ```text fetched rows / total rows = 2/2 +--------+-------+ | gender | count | |--------+-------| | M | 3 | | F | 1 | +--------+-------+ ``` -------------------------------- ### Single-line Comments Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/general/comments.rst Demonstrates the usage of single-line comments starting with '#' or '--' in OpenSearch SQL. ```sql os> #comments ... SELECT ... -- comments ... 123; -- comments fetched rows / total rows = 1/1 +-----+ | 123 | |-----| | 123 | +-----+ ``` -------------------------------- ### Basic syntax Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/cmd/join.md Examples of the basic `join` command syntax. ```syntax source = table1 | inner join left = l right = r on l.a = r.a table2 | fields l.a, r.a, b, c source = table1 | inner join left = l right = r where l.a = r.a table2 | fields l.a, r.a, b, c source = table1 | left join left = l right = r on l.a = r.a table2 | fields l.a, r.a, b, c source = table1 | right join left = l right = r on l.a = r.a table2 | fields l.a, r.a, b, c source = table1 | full left = l right = r on l.a = r.a table2 | fields l.a, r.a, b, c source = table1 | cross join left = l right = r on 1=1 table2 source = table1 | left semi join left = l right = r on l.a = r.a table2 source = table1 | left anti join left = l right = r on l.a = r.a table2 source = table1 | join left = l right = r [ source = table2 | where d > 10 | head 5 ] source = table1 | inner join on table1.a = table2.a table2 | fields table1.a, table2.a, table1.b, table1.c source = table1 | inner join on a = c table2 | fields a, b, c, d source = table1 as t1 | join left = l right = r on l.a = r.a table2 as t2 | fields l.a, r.a source = table1 as t1 | join left = l right = r on l.a = r.a table2 as t2 | fields t1.a, t2.a source = table1 | join left = l right = r on l.a = r.a [ source = table2 ] as s | fields l.a, s.a ``` -------------------------------- ### String SUBSTRING with LEFT, LTRIM, LOWER Example Source: https://github.com/opensearch-project/sql/blob/main/integ-test/src/test/resources/correctness/tableau_integration_tests.txt Extracts a substring from 'OriginWeather' and checks if it starts with '$' after transformations. ```sql SELECT SUBSTRING(`opensearch_dashboards_sample_data_flights`.`OriginWeather`, 1, 1024) AS `OriginWeather` FROM `opensearch_dashboards_sample_data_flights` WHERE (LEFT(LTRIM(LOWER(SUBSTRING(`opensearch_dashboards_sample_data_flights`.`OriginWeather`, 1, 1024))), LENGTH('$')) = '$') GROUP BY 1 ``` -------------------------------- ### Clone documentation-website repository Source: https://github.com/opensearch-project/sql/blob/main/scripts/docs_exporter/README.md Clones the documentation-website repository to the same root directory as the sql repository. This is a one-time setup step. ```bash cd /path/to/sql/../ git clone https://github.com/opensearch-project/documentation-website.git ``` -------------------------------- ### Get Node Stats Source: https://github.com/opensearch-project/sql/blob/main/docs/user/admin/monitoring.rst This example shows how to retrieve node-level statistics for the SQL plugin using a curl command. ```sh curl -H "Content-Type: application/json" -X GET localhost:9200/_plugins/_sql/stats ``` -------------------------------- ### Syntax Examples Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/cmd/union.md Examples of the `union` command syntax. ```syntax | union logs-*, security-logs ``` ```syntax | union [search source=accounts | where age > 30], [search source=accounts | where age < 30] ``` ```syntax | union maxout=100 [search source=logs | fields user, action], [search source=events | fields user, action] ``` ```syntax | union [search source=accounts | where status="active"], [search source=accounts | where status="pending"] ``` -------------------------------- ### Wildcard prefix matching Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/cmd/replace.md This example demonstrates wildcard prefix matching where '*' matches any characters after a specific starting pattern. ```ppl source=accounts | replace "IL*" WITH "Illinois" IN state | fields state ``` -------------------------------- ### Push Project Into Query DSL Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/optimization/optimization.rst Shows how a Project list is pushed down to Query DSL to filter the source. ```bash sh$ curl -sS -H 'Content-Type: application/json' \ ... -X POST localhost:9200/_plugins/_sql/_explain \ ... -d '{"query" : "SELECT age FROM accounts"}' { "root": { "name": "ProjectOperator", "description": { "fields": "[age]" }, "children": [ { "name": "OpenSearchIndexScan", "description": { "request": "OpenSearchQueryRequest(indexName=accounts, sourceBuilder={\"from\":0,\"size\":10000,\"timeout\":\"1m\",\"_source\":{\"includes\":[\"age\"]}})" }, "children": [] } ] } } ``` -------------------------------- ### Field value examples Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/cmd/search.md Examples of field value syntax. ```ppl status=active code=ERR-401 ``` ```ppl email="user@example.com" message="server error" ``` -------------------------------- ### SYSDATE Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/dql/functions.rst Shows how to get the current date and time with and without fractional seconds precision using SYSDATE. ```sql SELECT SYSDATE() as value_1, SYSDATE(6) as value_2; ``` -------------------------------- ### Specifying null value handling with usenull=false Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/cmd/top.md This example demonstrates how to use the `top` command to exclude null values by setting `usenull=false`. It shows the PPL query and the resulting output. ```ppl source=otellogs | top usenull=false instrumentationScope.name ``` ```text fetched rows / total rows = 3/3 +-----------------------------------------------------------------------------+-------+ | instrumentationScope.name | count | |-----------------------------------------------------------------------------+-------| | @opentelemetry/instrumentation-http | 2 | | Microsoft.Extensions.Hosting | 1 | | go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc | 1 | +-----------------------------------------------------------------------------+-------+ ``` -------------------------------- ### String SUBSTRING with GREATEST and FLOOR Example Source: https://github.com/opensearch-project/sql/blob/main/integ-test/src/test/resources/correctness/tableau_integration_tests.txt Extracts a substring from 'Origin' using GREATEST and FLOOR for dynamic start and length. ```sql SELECT IF(ISNULL(0), NULL, SUBSTRING(`opensearch_dashboards_sample_data_flights`.`Origin`,GREATEST(1,FLOOR(0)),FLOOR(5))) AS `Calculation_462181953493630977` FROM `opensearch_dashboards_sample_data_flights` GROUP BY 1 ``` -------------------------------- ### WEEKDAY Function Example Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/functions/datetime.md Demonstrates the usage of the WEEKDAY function to get the weekday index. ```ppl source=people | eval `weekday(DATE('2020-08-26'))` = weekday(DATE('2020-08-26')) | eval `weekday(DATE('2020-08-27'))` = weekday(DATE('2020-08-27')) | fields `weekday(DATE('2020-08-26'))`, `weekday(DATE('2020-08-27'))` ``` -------------------------------- ### Daily span Source: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/cmd/bin.md This example demonstrates binning time-series data with a daily span. ```ppl source=time_test | bin @timestamp span=7day | fields @timestamp, value | head 3 ```