### Build and Start Hive with Ozone-backed Storage

Source: https://github.com/apache/hive/blob/master/packaging/src/docker/README.md

Cleans up previous containers, sets the path to the PostgreSQL driver, builds the image for the specified Hive, Hadoop, and Tez versions, and starts Hive with Ozone integration.

```shell
docker compose down --rmi local # clean up previous containers and images
export POSTGRES_LOCAL_PATH=... # set the path to the postgres driver jar
./build.sh -hive 4.2.0 -hadoop 3.4.1 -tez 0.10.5
./start-hive.sh --ozone
```

--------------------------------

### Install Hive Metastore Benchmarks

Source: https://github.com/apache/hive/blob/master/standalone-metastore/metastore-tools/metastore-benchmarks/README.md

Use this Maven command to run a clean build and install of the metastore benchmarks with the `perf` profile enabled.

```bash
mvn clean install -Pperf
```

--------------------------------

### IMetaStoreClient Usage Example

Source: https://context7.com/apache/hive/llms.txt

This example demonstrates common operations using IMetaStoreClient, including creating databases and tables, adding partitions, listing metadata, and fetching table details.

````APIDOC
## IMetaStoreClient: Programmatic Metastore Access

`IMetaStoreClient` provides a Java API to create, query, and manage Hive metadata (databases, tables, partitions, statistics) directly against the Hive Metastore over Thrift without going through HiveServer2.

### Example Usage

```java
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.*;
import org.apache.hadoop.hive.metastore.api.*;
import java.util.*;

public class MetastoreClientExample {
    public static void main(String[] args) throws Exception {
        HiveConf conf = new HiveConf();
        conf.set("hive.metastore.uris", "thrift://metastore-host:9083");

        try (IMetaStoreClient client = new HiveMetaStoreClient(conf)) {

            // Create a database
            Database db = new Database("analytics", "Analytics database",
                "/user/hive/warehouse/analytics.db", null);
            if (!client.getAllDatabases().contains("analytics")) {
                client.createDatabase(db);
            }

            // Create a table programmatically
            Table tbl = new Table();
            tbl.setDbName("analytics");
            tbl.setTableName("page_views");
            tbl.setTableType(TableType.MANAGED_TABLE.name());

            StorageDescriptor sd = new StorageDescriptor();
            sd.setCols(Arrays.asList(
                new FieldSchema("page_url", "string", "Page URL"),
                new FieldSchema("user_id", "bigint", "User identifier"),
                new FieldSchema("duration_ms", "int", "Time on page in ms")
            ));
            sd.setInputFormat("org.apache.hadoop.hive.ql.io.orc.OrcInputFormat");
            sd.setOutputFormat("org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat");
            SerDeInfo serde = new SerDeInfo();
            serde.setSerializationLib("org.apache.hadoop.hive.ql.io.orc.OrcSerde");
            sd.setSerdeInfo(serde);
            sd.setLocation("/user/hive/warehouse/analytics.db/page_views");
            tbl.setSd(sd);
            tbl.setPartitionKeys(List.of(new FieldSchema("event_date", "string", "Partition date")));
            client.createTable(tbl);

            // Add a partition
            Partition part = new Partition();
            part.setDbName("analytics");
            part.setTableName("page_views");
            part.setValues(List.of("2024-01-15"));
            StorageDescriptor partSd = sd.deepCopy();
            partSd.setLocation("/user/hive/warehouse/analytics.db/page_views/event_date=2024-01-15");
            part.setSd(partSd);
            client.add_partition(part);

            // List tables and partitions
            List<String> tables = client.getTables("analytics", ".*");
            System.out.println("Tables: " + tables); // [page_views]

            List<Partition> partitions = client.listPartitions("analytics", "page_views", (short) 100);
            partitions.forEach(p -> System.out.println("Partition: " + p.getValues()));

            // Check table existence and fetch metadata
            boolean exists = client.tableExists("analytics", "page_views");
            System.out.println("Table exists: " + exists); // true

            Table fetchedTbl = client.getTable("analytics", "page_views");
            System.out.println("Table location: " + fetchedTbl.getSd().getLocation());

            // Drop the table
            // client.dropTable("analytics", "page_views", true, true);
        }
    }
}
```
````

--------------------------------

### Start Hive Services with Docker Compose

Source: https://github.com/apache/hive/blob/master/packaging/src/docker/README.md

Starts the HiveServer2, Metastore, and PostgreSQL services defined in docker-compose.yml in detached mode. This is a quick way to bring up all three services together.

```shell
docker compose up -d
```

--------------------------------

### Start Hive Metastore

Source: https://github.com/apache/hive/blob/master/hcatalog/src/test/e2e/templeton/README.txt

Starts the Hive metastore service on the specified port. This is a prerequisite for Templeton tests.

```bash
./bin/hive --service metastore -p 9933
```

--------------------------------

### Start Templeton Server

Source: https://github.com/apache/hive/blob/master/hcatalog/src/test/e2e/templeton/README.txt

Launches the Templeton server. Ensure this is running before executing tests.

```bash
./hcatalog/sbin/webhcat_server.sh start
```

--------------------------------

### Run Example Hive Queries

Source: https://github.com/apache/hive/blob/master/packaging/src/docker/README.md

Execute a set of sample SQL queries to demonstrate basic Hive functionality.
```sql
show tables;
create table hive_example(a string, b int) partitioned by(c int);
alter table hive_example add partition(c=1);
insert into hive_example partition(c=1) values('a', 1), ('a', 2),('b',3);
select count(distinct a) from hive_example;
select sum(b) from hive_example;
```

--------------------------------

### Configure and Start Hive with S3-backed Warehouse

Source: https://github.com/apache/hive/blob/master/packaging/src/docker/README.md

Sets environment variables for S3 access and the Hive warehouse path, then starts the Docker Compose services.

```shell
DEFAULT_FS="s3a://dw-team-bucket" \
HIVE_WAREHOUSE_PATH="/data/warehouse/tablespace/managed/hive" \
S3_ENDPOINT_URL="s3.us-west-2.amazonaws.com" \
docker-compose up
```

--------------------------------

### Build and Start Hive LLAP Cluster with Docker Compose

Source: https://github.com/apache/hive/blob/master/packaging/src/docker/README.md

Use this workflow to clean up previous containers, build a new Hive image, and start the cluster with LLAP daemons enabled.

```shell
docker-compose down --rmi local # clean up previous containers and images
export POSTGRES_LOCAL_PATH=... # set the path to the postgres driver jar on the host machine
./build.sh -hive 4.2.0 -hadoop 3.4.1 -tez 0.10.5 # build image from the common Dockerfile
./start-hive.sh --llap
```

--------------------------------

### Start Services with Docker Compose

Source: https://github.com/apache/hive/blob/master/packaging/src/docker/thirdparties/polaris/README.md

Launches the Hive and Polaris services defined in the docker-compose.yml file.

```shell
docker-compose up -d
```

--------------------------------

### Launch HiveServer2 with Embedded Metastore

Source: https://github.com/apache/hive/blob/master/packaging/src/docker/README.md

Starts a Docker container running HiveServer2 with an embedded Derby Metastore. This is suitable for quick setups and testing.
```shell
docker run -d -p 10000:10000 -p 10002:10002 --env SERVICE_NAME=hiveserver2 --name hive4 apache/hive:${HIVE_VERSION}
```

--------------------------------

### Run Hive Metastore Benchmarks

Source: https://github.com/apache/hive/blob/master/standalone-metastore/metastore-tools/metastore-benchmarks/README.md

After setting the HMS_HOST environment variable, use this Maven command to install and run the performance profile.

```bash
mvn install -Pperf
```

--------------------------------

### Initialize Hive System Schemas with Schematool

Source: https://github.com/apache/hive/blob/master/packaging/src/docker/README.md

Downloads the PostgreSQL JDBC driver, starts the Docker Compose services, and then uses schematool inside the HiveServer2 container to initialize the system schemas.

```shell
mvn dependency:get -Dartifact=org.postgresql:postgresql:42.7.5
docker compose up -d
docker compose exec hiveserver2-standalone /bin/bash
/opt/hive/bin/schematool -initSchema -dbType hive -metaDbType postgres -url jdbc:hive2://localhost:10000/default
exit
```

--------------------------------

### Run HiveServer2 with Remote Metastore

Source: https://github.com/apache/hive/blob/master/packaging/src/docker/README.md

Starts a HiveServer2 instance configured to connect to a remote Metastore service. Options for resuming and verbose logging are included.

```shell
docker run -d -p 10000:10000 -p 10002:10002 --env SERVICE_NAME=hiveserver2 \
  --env SERVICE_OPTS="-Dhive.metastore.uris=thrift://metastore:9083" \
  --env IS_RESUME="true" \
  --env VERBOSE="true" \
  --name hiveserver2-standalone apache/hive:${HIVE_VERSION}
```

--------------------------------

### Launch Standalone Metastore

Source: https://github.com/apache/hive/blob/master/packaging/src/docker/README.md

Starts a Docker container for a standalone Metastore using Derby as the database. This is useful when you need a separate Metastore service.
```shell
docker run -d -p 9083:9083 --env SERVICE_NAME=metastore --name metastore-standalone apache/hive:${HIVE_VERSION}
```

--------------------------------

### Beeline CLI: Connect to HiveServer2

Source: https://context7.com/apache/hive/llms.txt

Examples of connecting to HiveServer2 using the Beeline CLI with different authentication methods and transport protocols, including basic, Kerberos, and HTTP.

```bash
# Connect to HiveServer2 (basic)
beeline -u "jdbc:hive2://hs2-host:10000/default" -n hiveuser -p secret

# Connect with Kerberos (no password needed)
beeline -u "jdbc:hive2://hs2-host:10000/default;principal=hive/hs2-host@REALM.COM"

# Connect via HTTP transport
beeline -u "jdbc:hive2://hs2-host:10001/default;transportMode=http;httpPath=cliservice" \
  -n hiveuser -p secret
```

--------------------------------

### Date Calculations in DB2

Source: https://github.com/apache/hive/blob/master/hplsql/src/test/results/offline/select_db2.out.txt

This example shows how to perform date calculations in DB2, including adding and subtracting days from a given date. It utilizes a subquery with `sysibm.sysdummy1` for demonstration purposes.

```sql
select cd, cd + inc days, cd - inc days + coalesce(inc, 0) days
from (select date '2015-09-02' as cd, 3 as inc from sysibm.sysdummy1)
```

--------------------------------

### HPL/SQL Function Definition

Source: https://github.com/apache/hive/blob/master/hplsql/src/test/results/local/create_function4.out.txt

Defines a function named GET in HPL/SQL. This is the starting point for creating reusable code blocks.

```hplsql
CREATE FUNCTION GET
```

--------------------------------

### Run All Metastore Benchmarks

Source: https://github.com/apache/hive/blob/master/standalone-metastore/metastore-tools/metastore-benchmarks/README.md

Execute all available benchmark tests with default settings. Replace `metastore_db_name` and `hostname` with your specific values.
```bash
java -jar hmsbench.jar -d `metastore_db_name` -H `hostname`
```

--------------------------------

### Show All Partitions

Source: https://github.com/apache/hive/blob/master/hplsql/src/test/results/db/part_count.out.txt

Use this command to display all partitions associated with a given table.

```sql
SHOW PARTITIONS partition_date_1
```

--------------------------------

### Show Partitions for Another Table

Source: https://github.com/apache/hive/blob/master/hplsql/src/test/results/db/part_count.out.txt

Demonstrates showing partitions for a different table, illustrating the general syntax for partition inspection.

```sql
SHOW PARTITIONS partition_date_1a
```

--------------------------------

### Get Next ID with Exclusive Locks

Source: https://github.com/apache/hive/blob/master/hplsql/src/test/results/offline/select_db2.out.txt

This snippet demonstrates how to get the next available ID from a sequence table, ensuring exclusive locks to prevent concurrent modifications. It handles the case where the table might be empty by defaulting to 0.

```sql
select coalesce(max(info_id)+1,0) into NextID
from sproc_info
with rr use and keep exclusive locks
```

--------------------------------

### Declare and Set Variables in HPL/SQL

Source: https://github.com/apache/hive/blob/master/hplsql/src/test/results/local/mult_div.out.txt

Demonstrates how to declare integer variables and assign them initial values. Use this for setting up variables before performing operations.

```HPLSQL
DECLARE a int = 8
DECLARE b int = 4
DECLARE c int = 2
```

--------------------------------

### Export Hive Version

Source: https://github.com/apache/hive/blob/master/packaging/src/docker/thirdparties/polaris/README.md

Set the HIVE_VERSION environment variable before starting the services.
```shell
export HIVE_VERSION=4.2.0-SNAPSHOT
```

--------------------------------

### Get New Aggregation Buffer

Source: https://github.com/apache/hive/blob/master/ql/src/gen/vectorization/UDAFTemplates/VectorUDAFMinMax.txt

Instantiates a new aggregation buffer for the UDAF. This is called internally by Hive.

```Java
@Override
public AggregationBuffer getNewAggregationBuffer() throws HiveException {
  return new Aggregation();
}
```

--------------------------------

### Get Right Dynamic Value

Source: https://github.com/apache/hive/blob/master/ql/src/gen/vectorization/ExpressionTemplates/FilterColumnBetweenDynamicValue.txt

Accessor method to retrieve the right dynamic value used for filtering.

```Java
public DynamicValue getRightDynamicValue() {
  return rightDynamicValue;
}
```

--------------------------------

### Configure Hadoop Options for Kerberos

Source: https://github.com/apache/hive/blob/master/hcatalog/src/test/e2e/templeton/README.txt

Add Hadoop options to conf/hadoop-env.sh for Kerberos realm and KDC configuration.

```bash
export HADOOP_OPTS="-Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"
```

--------------------------------

### Get Left Dynamic Value

Source: https://github.com/apache/hive/blob/master/ql/src/gen/vectorization/ExpressionTemplates/FilterColumnBetweenDynamicValue.txt

Accessor method to retrieve the left dynamic value used for filtering.

```Java
public DynamicValue getLeftDynamicValue() {
  return leftDynamicValue;
}
```

--------------------------------

### Reset Aggregation Buffer

Source: https://github.com/apache/hive/blob/master/ql/src/gen/vectorization/UDAFTemplates/VectorUDAFMinMax.txt

Resets the aggregation buffer to its initial state. This is called when starting a new aggregation or partition.
```Java
@Override
public void reset(AggregationBuffer agg) throws HiveException {
  Aggregation myAgg = (Aggregation) agg;
  myAgg.reset();
}
```

--------------------------------

### Create Database with IF NOT EXISTS and Comment

Source: https://github.com/apache/hive/blob/master/hplsql/src/test/results/db/create_drop_database.out.txt

Create a database only if it does not already exist, optionally attaching a comment and a location. `IF NOT EXISTS` prevents an error when the database is already present.

```sql
create database if not exists test1 comment 'abc' location '/users'
```

--------------------------------

### Set Current Schema to Default

Source: https://github.com/apache/hive/blob/master/hplsql/src/test/results/db/set_current_schema.out.txt

Use the `USE` statement to set the current schema. This example sets it to 'default'.

```SQL
USE default
```

--------------------------------

### Configure Metastore Benchmark Test Parameters

Source: https://github.com/apache/hive/blob/master/standalone-metastore/metastore-tools/metastore-benchmarks/README.md

Run benchmark tests with custom parameters, such as the number of objects created (`-N`), warm-up iterations (`-W`), and exclusions for specific operations (`-E`).

```bash
java -jar hmsbench.jar -d `metastore_db_name` -H `hostname` -N 500 -W 10 -E 'drop.*' -E 'concurrent.*'
```

--------------------------------

### Configure Metastore Database

Source: https://github.com/apache/hive/blob/master/hcatalog/src/test/e2e/templeton/README.txt

Configures the database connection for the Hive metastore persistence. This example shows Derby configuration.

```xml
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=/Users/ekoifman/dev/data/tmp/metastore_db_e2e;create=true</value>
  <description>Controls which DB engine metastore will use for persistence. In particular, where Derby will create its data files.</description>
</property>
```

--------------------------------

### Run a Specific Metastore Benchmark Test

Source: https://github.com/apache/hive/blob/master/standalone-metastore/metastore-tools/metastore-benchmarks/README.md

Execute a single, named benchmark test. Use the `-M` flag followed by the test name. Replace `metastore_db_name` and `hostname` with your specific values.

```bash
java -jar hmsbench.jar -d `metastore_db_name` -H `hostname` -M "TestGetValidWriteIds"
```

--------------------------------

### Run Specific Test Group

Source: https://github.com/apache/hive/blob/master/hcatalog/src/test/e2e/templeton/README.txt

Filters the Ant test execution to run a specific test group, for example, 'TestHive'.

```bash
-Dtests.to.run='-t TestHive'
```

--------------------------------

### Initialize Offset Table for ETL

Source: https://github.com/apache/hive/blob/master/kafka-handler/README.md

Prepare an offset tracking table by initializing it with the minimum offset for each partition. This is a prerequisite for the Kafka to Hive ETL pipeline.

```sql
DROP TABLE kafka_table_offsets;
CREATE TABLE kafka_table_offsets (
  `partition_id` INT,
  `max_offset` BIGINT,
  `insert_time` TIMESTAMP);

INSERT OVERWRITE TABLE kafka_table_offsets
SELECT `__partition`, MIN(`__offset`) - 1, CURRENT_TIMESTAMP
FROM wiki_kafka_hive
GROUP BY `__partition`, CURRENT_TIMESTAMP;
```

--------------------------------

### Get New Aggregation Buffer

Source: https://github.com/apache/hive/blob/master/ql/src/gen/vectorization/UDAFTemplates/VectorUDAFAvgDecimalMerge.txt

Creates and returns a new instance of the aggregation buffer. This is used to initialize aggregation state.
```Java
@Override
public AggregationBuffer getNewAggregationBuffer() throws HiveException {
  return new Aggregation();
}
```

--------------------------------

### Get Aggregation Buffer Fixed Size

Source: https://github.com/apache/hive/blob/master/ql/src/gen/vectorization/UDAFTemplates/VectorUDAFAvgMerge.txt

Calculates and returns the fixed size of the aggregation buffer, aligned to memory boundaries.

```Java
@Override
public long getAggregationBufferFixedSize() {
  JavaDataModel model = JavaDataModel.get();
  return JavaDataModel.alignUp(
    model.object() +
    model.primitive2() * 2,
    model.memoryAlign());
}
```

--------------------------------

### Configure ORC Storage and Optimization Settings

Source: https://context7.com/apache/hive/llms.txt

Tune ORC file format behavior at the session level using SET commands. Options include default compression, split handling, predicate pushdown, and caching.

```sql
-- Session-level ORC tuning
SET hive.exec.orc.default.compress=ZSTD;
SET hive.orc.splits.include.file.footer=true;
SET hive.optimize.index.filter=true; -- enable ORC predicate pushdown
SET hive.orc.cache.use.soft.references=true;
```

```sql
-- Inspect ORC file metadata
SET hive.exec.post.hooks=org.apache.hadoop.hive.ql.hooks.PostExecOrcFileDump;
SELECT * FROM events_orc LIMIT 1;
```

--------------------------------

### HPL/SQL Function Execution

Source: https://github.com/apache/hive/blob/master/hplsql/src/test/results/local/create_function4.out.txt

Executes a previously defined function named GET in HPL/SQL. This snippet shows how to invoke a function.

```hplsql
EXEC FUNCTION GET
```

--------------------------------

### Launch Metastore with Custom HDFS Configuration

Source: https://github.com/apache/hive/blob/master/standalone-metastore/packaging/src/docker/README.md

Runs the Hive Metastore, allowing the use of a custom hdfs-site.xml.
The custom configuration directory is mounted into the container, and the HIVE_CUSTOM_CONF_DIR environment variable points to its location within the container.

```shell
docker run -d -p 9083:9083 --env DB_DRIVER=postgres \
  -v /opt/hive/conf:/hive_custom_conf --env HIVE_CUSTOM_CONF_DIR=/hive_custom_conf \
  --mount type=bind,source=`mvn help:evaluate -Dexpression=settings.localRepository -q -DforceStdout`/org/postgresql/postgresql/42.7.3/postgresql-42.7.3.jar,target=/opt/hive/lib/postgres.jar \
  --name metastore apache/hive:standalone-metastore-${HIVE_VERSION}
```

--------------------------------

### HPLSQL IF with Complex Condition

Source: https://github.com/apache/hive/blob/master/hplsql/src/test/results/local/if.out.txt

An example of an IF statement whose condition combines `IS NOT NULL`, `AND`, and `BETWEEN`. The output shows that the true branch was executed.

```HPLSQL
PRINT IS NOT NULL AND BETWEEN
IF TRUE executed
PRINT True block - Correct
```

--------------------------------

### HmsBench Command-Line Usage

Source: https://github.com/apache/hive/blob/master/standalone-metastore/metastore-tools/metastore-benchmarks/README.md

Displays the help message for the BenchmarkTool, outlining all available options and their default values.

```bash
Usage: BenchmarkTool [-ChlV] [--sanitize] [--confdir=] [--params=]
                     [--savedata=] [--separator=] [-d=] [-H=URI] [-L=] [-N=]
                     [-o=] [-P=] [-t=] [-T=] [-W=] [-E=]... [-M=]...
      --confdir=     configuration directory
      --params=      number of table/partition parameters
                       Default: 0
      --sanitize     sanitize results (remove outliers)
      --savedata=    save raw data in specified dir
      --separator=   CSV field separator
                       Default:
  -C, --csv          produce CSV output
  -d, --db=          database name
  -E, --exclude=     test name patterns to exclude
  -h, --help         Show this help message and exit.
  -H, --host=URI     HMS Host
  -l, --list         list matching benchmarks
  -L, --spin=        spin count
                       Default: 100
  -M, --pattern=     test name patterns
  -N, --number=      number of object instances
                       Default: 100
  -o, --output=      output file
  -P, --port=        HMS Server port
                       Default: 9083
  -t, --table=       table name
  -T, --threads=     number of concurrent threads
                       Default: 2
  -V, --version      Print version information and exit.
  -W, --warmup=      warmup count
                       Default: 15
```

--------------------------------

### Advanced Metastore Benchmark Configuration

Source: https://github.com/apache/hive/blob/master/standalone-metastore/metastore-tools/metastore-benchmarks/README.md

Run benchmark tests with advanced options including saving raw data, sanitizing results, using a specific table name, and running with multiple partition counts. The `-N` flag can be used multiple times to specify different partition counts for tests.

```bash
java -jar hmsbench.jar -H `hostname` \
  --savedata /tmp/benchdata \
  --sanitize \
  -N 100 -N 1000 \
  -o bench_results.csv -C \
  -d testbench \
  --params=100
```

--------------------------------

### Get New Aggregation Buffer

Source: https://github.com/apache/hive/blob/master/ql/src/gen/vectorization/UDAFTemplates/VectorUDAFAvgDecimal64ToDecimal.txt

Provides a new instance of the AggregationBuffer for the average calculation. Requires input scale and temporary storage.

```Java
@Override
public AggregationBuffer getNewAggregationBuffer() throws HiveException {
  return new Aggregation(inputScale, temp);
}
```

--------------------------------

### Metastore Benchmark with Tab-Separated Output and Data Saving

Source: https://github.com/apache/hive/blob/master/standalone-metastore/metastore-tools/metastore-benchmarks/README.md

Run benchmark tests and configure the output to be tab-separated (`--csv`). Individual data points can be saved to a specified directory (`--savedata`).
```bash
java -jar hmsbench.jar -d `metastore_db_name` -H `hostname` -o result.csv --csv --savedata data
```

--------------------------------

### Get Aggregation Buffer Fixed Size

Source: https://github.com/apache/hive/blob/master/ql/src/gen/vectorization/UDAFTemplates/VectorUDAFVar.txt

Calculates and returns the fixed size of the aggregation buffer in bytes, considering memory alignment.

```Java
@Override
public long getAggregationBufferFixedSize() {
  JavaDataModel model = JavaDataModel.get();
  return JavaDataModel.alignUp(
    model.object() +
    model.primitive2() * 3 +
    model.primitive1(),
    model.memoryAlign());
}
```

--------------------------------

### Execute Pig Script with Load/Store

Source: https://github.com/apache/hive/blob/master/hcatalog/src/test/e2e/templeton/newtests/pigtest.txt

Run a Pig script named 'loadstore.pig' that includes load and store operations. Ensure that input and output directories are correctly passed as parameters.

```bash
pig -p INPDIR=:INPDIR: -p OUTDIR=:OUTDIR: loadstore.pig
```

--------------------------------

### Query Kafka Table

Source: https://github.com/apache/hive/blob/master/kafka-handler/README.md

Perform standard SQL queries on the Kafka table to retrieve data. This example shows a simple select with a limit.

```sql
SELECT * FROM kafka_ssl LIMIT 10;
```

--------------------------------

### Create HPL/SQL Package

Source: https://github.com/apache/hive/blob/master/hplsql/src/test/results/local/drop_package.out.txt

Use CREATE PACKAGE to define the interface for a package.

```HPLSQL
CREATE PACKAGE
```

--------------------------------

### Get Aggregation Buffer Fixed Size

Source: https://github.com/apache/hive/blob/master/ql/src/gen/vectorization/UDAFTemplates/VectorUDAFMinMaxString.txt

Calculates and returns the fixed size of the aggregation buffer in bytes. This is used for memory management and optimization.
```Java
@Override
public long getAggregationBufferFixedSize() {
  JavaDataModel model = JavaDataModel.get();
  return JavaDataModel.alignUp(
    model.object() +
    model.ref() +
    model.primitive1() * 2,
    model.memoryAlign());
}
```

--------------------------------

### Set and Print Variable in HPL/SQL

Source: https://github.com/apache/hive/blob/master/hplsql/src/test/results/local/decode.out.txt

Shows how to set a variable to a specific value and print it.

```hplsql
SET var1 = 1
PRINT A
```

--------------------------------

### Get Aggregation Buffer Fixed Size

Source: https://github.com/apache/hive/blob/master/ql/src/gen/vectorization/UDAFTemplates/VectorUDAFAvgTimestamp.txt

Calculates and returns the fixed size of the aggregation buffer in bytes, considering object and primitive types.

```Java
@Override
public long getAggregationBufferFixedSize() {
  JavaDataModel model = JavaDataModel.get();
  return JavaDataModel.alignUp(
    model.object() +
    model.primitive2() * 2,
    model.memoryAlign());
}
```

--------------------------------

### Run Benchmarks with Single Jar

Source: https://github.com/apache/hive/blob/master/standalone-metastore/metastore-tools/metastore-benchmarks/README.md

Execute the HmsBench tool using its standalone jar file, providing necessary options and specifying tests.

```bash
java -jar hmsbench.jar [test]...
```

--------------------------------

### Get Current Aggregation Buffer

Source: https://github.com/apache/hive/blob/master/ql/src/gen/vectorization/UDAFTemplates/VectorUDAFAvgTimestamp.txt

Retrieves the specific Aggregation buffer instance for a given row and buffer index from the aggregation buffer sets.
```java
private Aggregation getCurrentAggregationBuffer(
    VectorAggregationBufferRow[] aggregationBufferSets,
    int bufferIndex,
    int row) {
  VectorAggregationBufferRow mySet = aggregationBufferSets[row];
  Aggregation myagg = (Aggregation) mySet.getAggregationBuffer(bufferIndex);
  return myagg;
}
```

--------------------------------

### Declare and Print Variable in HPL/SQL

Source: https://github.com/apache/hive/blob/master/hplsql/src/test/results/local/decode.out.txt

Demonstrates declaring an integer variable and printing its initial value.

```hplsql
DECLARE var1 INT = 3
PRINT C
```

--------------------------------

### Create Partitioned and Bucketed ORC Table with ACID

Source: https://context7.com/apache/hive/llms.txt

Use this for creating transactional tables with specific storage formats, partitioning, and bucketing for optimized performance and ACID compliance. Ensure the 'transactional' property is set to 'true'.

```sql
CREATE TABLE orders (
  order_id    BIGINT,
  customer_id INT,
  product     STRING,
  amount      DECIMAL(10,2),
  order_ts    TIMESTAMP
)
PARTITIONED BY (order_date STRING, region STRING)
CLUSTERED BY (customer_id) INTO 32 BUCKETS
STORED AS ORC
TBLPROPERTIES (
  'transactional' = 'true',
  'orc.compress' = 'SNAPPY',
  'orc.bloom.filter.columns' = 'customer_id,product',
  'orc.bloom.filter.fpp' = '0.01'
);
```

--------------------------------

### Get Vectorized Column Parameters

Source: https://github.com/apache/hive/blob/master/ql/src/gen/vectorization/ExpressionTemplates/ColumnUnaryMinus.txt

Retrieves the string representation of column parameters for vectorized expression evaluation. This is used internally by the VectorExpression framework.

```Java
@Override
public String vectorExpressionParameters() {
  return getColumnParamString(0, inputColumnNum[0]);
}
```
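--------------------------------

### Sketch: Rounding a Buffer Size Up to an Alignment Boundary

Several `getAggregationBufferFixedSize()` snippets above pass a raw byte count and `model.memoryAlign()` to `JavaDataModel.alignUp`. The following is a minimal, self-contained sketch of what such a round-up step might look like, assuming the alignment is a power of two. The class name `AlignUpSketch`, the bit-mask expression, and the byte sizes are illustrative assumptions; they are not read from Hive's `JavaDataModel` and may differ from the real implementation and the sizes of an actual JVM.

```java
public class AlignUpSketch {

    // Rounds value up to the next multiple of align.
    // Assumption: align is a power of two; the mask trick is not valid otherwise.
    public static int alignUp(int value, int align) {
        return (value + align - 1) & ~(align - 1);
    }

    public static void main(String[] args) {
        // Hypothetical sizes, chosen only for illustration.
        int objectHeader = 16; // stand-in for model.object()
        int primitive2 = 8;    // stand-in for model.primitive2()
        int primitive1 = 4;    // stand-in for model.primitive1()
        int memoryAlign = 8;   // stand-in for model.memoryAlign()

        // Same shape as the VectorUDAFVar snippet:
        // object + 3 * primitive2 + primitive1, then aligned.
        int raw = objectHeader + primitive2 * 3 + primitive1;
        System.out.println(raw + " -> " + alignUp(raw, memoryAlign)); // prints "44 -> 48"
    }
}
```

With these assumed sizes the raw total is 44 bytes, which rounds up to 48 on an 8-byte boundary; a value that is already a multiple of the alignment is returned unchanged.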