### Build Apache Accumulo from Source with Maven

Source: https://github.com/apache/accumulo/blob/main/README.md

This command uses Maven to compile, test, and package the Apache Accumulo source code into a binary tar.gz archive. The resulting file is located at 'assemble/target/accumulo--bin.tar.gz'. Append '-DskipTests' to the command to skip test execution during the build.

```Maven
mvn package
```

--------------------------------

### Apache Accumulo Standalone Cluster Configuration Properties

Source: https://github.com/apache/accumulo/blob/main/TESTING.md

Defines the properties required to set up and manage an Apache Accumulo standalone cluster for integration testing. These cover general cluster parameters, paths to client and server configuration, ZooKeeper quorum details, and the principals and credentials of test users, for both Kerberos-enabled and unsecure installations.

```APIDOC
Configuration Properties for Standalone Clusters:
  accumulo.it.cluster.type:
    Required: Yes
    Description: The type of cluster being defined.
    Valid Options: MINI, STANDALONE
  accumulo.it.cluster.clientconf:
    Required: Yes
    Description: Path to accumulo-client.properties.
  accumulo.it.cluster.standalone.admin.principal:
    Required: Yes
    Description: Standalone cluster principal (user) with all System permissions.
  accumulo.it.cluster.standalone.admin.password:
    Required: Yes (only valid w/o Kerberos)
    Description: Password for the principal.
  accumulo.it.cluster.standalone.admin.keytab:
    Required: Yes (only valid w/ Kerberos)
    Description: Keytab for the principal.
  accumulo.it.cluster.standalone.zookeepers:
    Required: Yes
    Description: ZooKeeper quorum used by the standalone cluster.
  accumulo.it.cluster.standalone.instance.name:
    Required: Yes
    Description: Accumulo instance name for the cluster.
  accumulo.it.cluster.standalone.hadoop.conf:
    Required: Yes
    Description: Hadoop configuration directory.
  accumulo.it.cluster.standalone.home:
    Required: Yes
    Description: Accumulo installation directory on cluster.
  accumulo.it.cluster.standalone.client.conf:
    Required: Yes
    Description: Accumulo conf directory on client.
  accumulo.it.cluster.standalone.server.conf:
    Required: Yes
    Description: Accumulo conf directory on server.
  accumulo.it.cluster.standalone.client.cmd.prefix:
    Required: No (Optional)
    Description: Prefix that will be added to Accumulo client commands.
  accumulo.it.cluster.standalone.server.cmd.prefix:
    Required: No (Optional)
    Description: Prefix that will be added to Accumulo service commands.

User Credential Properties (for Kerberos or unsecure installations):
  Note: Each property is suffixed with an integer ($x) to group keytab/password with username.
  accumulo.it.cluster.standalone.users.$x:
    Required: Yes (when Kerberos enabled or for unsecure)
    Description: The principal name for user $x.
  accumulo.it.cluster.standalone.passwords.$x:
    Required: Yes (only valid w/o Kerberos)
    Description: The password for user $x.
  accumulo.it.cluster.standalone.keytabs.$x:
    Required: Yes (only valid w/ Kerberos)
    Description: The path to the keytab for user $x.

Setting Properties:
  Command Line: -D= (e.g., -Daccumulo.it.cluster.standalone.principal=root)
  Properties File: Reference a file using "accumulo.it.cluster.properties".
  Precedence: Command line properties override file properties.
```

--------------------------------

### Generate and export Accumulo test data using shell commands

Source: https://github.com/apache/accumulo/blob/main/test/src/main/resources/v2_import_test/README.md

This sequence of Accumulo shell commands creates a table ('tableA'), adds splits for distribution, inserts sample data, compacts the table to write data to files, takes the table offline, and finally exports it to a specified HDFS directory. This process prepares a dataset for potential import into another Accumulo instance.

```Accumulo Shell
createtable tableA
addsplits -t tableA 2 4 6
insert -t tableA 1 1
insert -t tableA 1 cf cq 1
insert -t tableA 2 cf cq 2
insert -t tableA 3 cf cq 3
insert -t tableA 4 cf cq 4
insert -t tableA 5 cf cq 5
insert -t tableA 6 cf cq 6
insert -t tableA 7 cf cq 7
compact -w -t tableA
# to see the current tablet files:
scan -t accumulo.metadata -c file -np
offline -t tableA
exporttable -t tableA /accumulo/export_test
```

--------------------------------

### Run SpotBugs Static Code Analysis with Maven

Source: https://github.com/apache/accumulo/blob/main/TESTING.md

Perform a thorough static code analysis using SpotBugs, including a security plugin. This command runs the `verify` phase with the `sec-bugs` profile enabled and skips regular tests, focusing solely on code quality and potential security vulnerabilities.

```bash
mvn clean verify -Psec-bugs -DskipTests
```

--------------------------------

### Run a Specific Apache Accumulo Integration Test

Source: https://github.com/apache/accumulo/blob/main/TESTING.md

Execute a single integration test, such as `WriteAheadLogIT`, using Maven. This command targets a specific test class while skipping SpotBugs analysis, which is useful for focused debugging or verification. Integration tests require significant memory and disk space (3-4GB RAM, 10GB disk).

```bash
mvn clean verify -Dit.test=WriteAheadLogIT -Dtest=foo -Dspotbugs.skip
```

--------------------------------

### Run Standalone Apache Accumulo Cluster Integration Tests

Source: https://github.com/apache/accumulo/blob/main/TESTING.md

Execute integration tests against a pre-configured standalone Accumulo cluster. This command specifies a properties file for the cluster and targets tests categorized as `StandaloneCapableCluster`. Before running, ensure the `accumulo-test` jar is copied to the cluster's lib folder. Not all ITs are suitable for standalone execution, so the command restricts the run to the `StandaloneCapableCluster` group.

```bash
mvn clean verify -Dtest=foo -Daccumulo.it.properties=/home/user/my_cluster.properties -Dfailsafe.groups=StandaloneCapableCluster -Dspotbugs.skip
```

--------------------------------

### Run Apache Accumulo Unit Tests with Maven

Source: https://github.com/apache/accumulo/blob/main/TESTING.md

Execute all unit tests for Apache Accumulo by invoking the Maven `package` phase. This ensures all modules are built and resolvable, preventing issues with stale artifacts. The `maven-surefire-plugin` is used by default to run JUnit tests.

```bash
mvn clean package
```

--------------------------------

### Run MiniAccumuloCluster Integration Tests

Source: https://github.com/apache/accumulo/blob/main/TESTING.md

Execute integration tests that utilize MiniAccumuloCluster (MAC), a multi-process Accumulo implementation managed via Java APIs. These tests can use the local filesystem or MiniDFSCluster and are run by default during the `integration-test` lifecycle phase, though they incur extra startup/shutdown time.

```bash
mvn clean verify -Dspotbugs.skip
```

--------------------------------

### Create HDFS directory for Accumulo export

Source: https://github.com/apache/accumulo/blob/main/test/src/main/resources/v2_import_test/README.md

This command creates a new directory in the HDFS '/accumulo' namespace, which will serve as the destination for the exported Accumulo table data. This step is a prerequisite before initiating the table export.

```HDFS Shell
hadoop fs -mkdir /accumulo/export_test
```

--------------------------------

### Run SunnyDay Apache Accumulo Integration Tests

Source: https://github.com/apache/accumulo/blob/main/TESTING.md

Execute the `SunnyDay` category of integration tests, which represent a minimal set designed to verify basic Accumulo functionality. These tests are typically run before submitting patches or bug fixes to quickly ensure no core functions were broken by changes.

```bash
mvn clean verify -Psunny
```

--------------------------------

### View contents of Accumulo export distcp file

Source: https://github.com/apache/accumulo/blob/main/test/src/main/resources/v2_import_test/README.md

This command uses the Hadoop HDFS shell to display the contents of the 'distcp.txt' file, which is generated during the Accumulo table export. This file lists the paths to the tablet files and the export metadata zip file that were part of the exported dataset.

```HDFS Shell
hadoop fs -cat /accumulo/export_test/distcp.txt
```

--------------------------------

### Mutation Version 2 Serialization Format

Source: https://github.com/apache/accumulo/blob/main/core/src/main/java/org/apache/accumulo/core/data/doc-files/mutation-serialization.html

Describes the byte-level layout for the Version 2 serialization format of Apache Accumulo's `Mutation` data. This format introduces a control byte, uses variable-length encoding for integers and longs, and details the main data layout and the 'data' block layout for each entry, including fields like row ID, data length, column families, qualifiers, visibilities, timestamps, and values.

```APIDOC
Mutation Version 2 Data Serialization Layout:
  byte 0: control byte (top bit = 1 for version 2; bottom bit = values present flag)
  next integer: length of row ID
  next _n_ bytes: row ID
  next integer: data length
  next _n_ bytes: data (see "Version 2 Data Block Layout")
  next integer: number of entries
  next integer: number of values (if values present flag is set)
  next _n_ sets of integers and byte arrays: _n_ value lengths and value data bytes (if values present flag is set)
  (Note: variable length encoding for integers and longs)

Version 2 "data" Block Layout for Each Entry:
  first long and byte array: column family length and bytes
  next long and byte array: column qualifier length and bytes
  next long and byte array: column visibility length and bytes
  next boolean: has timestamp flag
  next long: timestamp (only present if timestamp flag is set)
  next boolean: deleted flag
  next long: value length (if negative, value bytes are same as already-read value (-length - 1))
  next _n_ bytes: value bytes (if value length is non-negative)
```

--------------------------------

### Mutation Version 1 Serialization Format

Source: https://github.com/apache/accumulo/blob/main/core/src/main/java/org/apache/accumulo/core/data/doc-files/mutation-serialization.html

Describes the byte-level layout for the Version 1 serialization format of Apache Accumulo's `Mutation` data. This format includes a main data layout and a specific 'data' block layout for each entry, detailing fields like row ID, data length, column families, qualifiers, visibilities, timestamps, and values.

```APIDOC
Mutation Version 1 Data Serialization Layout:
  bytes 0-3: 4-byte integer (length of row ID)
  next _n_ bytes: row ID
  next integer: data length
  next _n_ bytes: data (see "Version 1 Data Block Layout")
  next integer: number of entries
  next boolean: values present flag
  next integer: number of values (if values present flag is set)
  next _n_ sets of integers and byte arrays: _n_ value lengths and value data bytes (if values present flag is set)

Version 1 "data" Block Layout for Each Entry:
  first integer and byte array: column family length and bytes
  next integer and byte array: column qualifier length and bytes
  next integer and byte array: column visibility length and bytes
  next boolean: has timestamp flag
  next long: timestamp
  next boolean: deleted flag
  next integer: value length (if negative, value bytes are same as already-read value (-length - 1))
  next _n_ bytes: value bytes (if value length is non-negative)
```
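
--------------------------------

### Sketch: Decoding the Version 1 Layout

The Version 1 layout described above can be illustrated with a small decoder. This is a minimal sketch in Python, not Accumulo's own (Java) implementation. It assumes big-endian 4-byte integers, 8-byte longs, and 1-byte booleans, which the layout does not state explicitly, and it reads "(-length - 1)" as an index into the list of already-read values. The names `Reader`, `ColumnUpdate`, `decode_v1`, and `sized` are hypothetical and exist only for this example.

```python
import struct
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ColumnUpdate:
    family: bytes
    qualifier: bytes
    visibility: bytes
    timestamp: Optional[int]  # None when the has-timestamp flag is unset
    deleted: bool
    value: bytes


class Reader:
    """Cursor over a bytes object. Big-endian byte order is an assumption."""

    def __init__(self, buf: bytes) -> None:
        self.buf = buf
        self.pos = 0

    def take(self, n: int) -> bytes:
        if n < 0 or self.pos + n > len(self.buf):
            raise ValueError("truncated or malformed input")
        chunk = self.buf[self.pos:self.pos + n]
        self.pos += n
        return chunk

    def int32(self) -> int:
        return struct.unpack(">i", self.take(4))[0]

    def int64(self) -> int:
        return struct.unpack(">q", self.take(8))[0]

    def boolean(self) -> bool:
        return self.take(1) != b"\x00"

    def sized_bytes(self) -> bytes:
        # A 4-byte length followed by that many bytes.
        return self.take(self.int32())


def decode_v1(buf: bytes) -> Tuple[bytes, List[ColumnUpdate]]:
    """Decode one Version 1 mutation following the layout listed above."""
    r = Reader(buf)
    row = r.sized_bytes()          # row ID length, then row ID
    data = r.sized_bytes()         # data length, then the "data" block
    n_entries = r.int32()
    values: List[bytes] = []
    if r.boolean():                # values present flag
        values = [r.sized_bytes() for _ in range(r.int32())]

    d = Reader(data)
    updates = []
    for _ in range(n_entries):
        family = d.sized_bytes()
        qualifier = d.sized_bytes()
        visibility = d.sized_bytes()
        has_ts = d.boolean()
        ts = d.int64()             # always present in the Version 1 layout
        deleted = d.boolean()
        vlen = d.int32()
        # A negative length refers to an already-read value at (-length - 1).
        value = values[-vlen - 1] if vlen < 0 else d.take(vlen)
        updates.append(ColumnUpdate(family, qualifier, visibility,
                                    ts if has_ts else None, deleted, value))
    return row, updates


def sized(b: bytes) -> bytes:
    """Helper for building test input: 4-byte length prefix plus bytes."""
    return struct.pack(">i", len(b)) + b


# Usage: hand-build a one-entry mutation and decode it.
entry = (sized(b"cf") + sized(b"cq") + sized(b"") + b"\x01"
         + struct.pack(">q", 42) + b"\x00" + sized(b"v1"))
row, updates = decode_v1(sized(b"row1") + sized(entry)
                         + struct.pack(">i", 1) + b"\x00")
print(row, updates[0].value)  # prints: b'row1' b'v1'
```

A Version 2 decoder would differ mainly in reading the control byte first, using variable-length integers and longs, and reading the timestamp only when the has-timestamp flag is set.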