### Basic HappyBase Usage Example

Source: https://github.com/python-happybase/happybase/blob/master/doc/index.md

Illustrates establishing a connection, interacting with a table, and performing basic data operations like put, get, and delete.

```python
import happybase

connection = happybase.Connection('hostname')
table = connection.table('table-name')

table.put(b'row-key', {b'family:qual1': b'value1',
                       b'family:qual2': b'value2'})

row = table.row(b'row-key')
print(row[b'family:qual1'])  # prints b'value1'

for key, data in table.rows([b'row-key-1', b'row-key-2']):
    print(key, data)  # prints row key and data for each row

for key, data in table.scan(row_prefix=b'row'):
    print(key, data)  # prints row key and data for each matching row

table.delete(b'row-key')
```

--------------------------------

### Install HappyBase using Pip

Source: https://github.com/python-happybase/happybase/blob/master/doc/installation.md

Install the HappyBase package and its Thrift dependencies from the Python Package Index (PyPI) using pip within an activated virtual environment.

```sh
(envname) $ pip install happybase
```

--------------------------------

### Set Up Development Environment

Source: https://github.com/python-happybase/happybase/blob/master/doc/development.md

Follow these steps to install dependencies and set up HappyBase in editable mode within a virtual environment.

```sh
cd /path/to/happybase/
mkvirtualenv happybase
pip install -r test-requirements.txt
pip install -e .
```

--------------------------------

### Test HappyBase Installation

Source: https://github.com/python-happybase/happybase/blob/master/doc/installation.md

Verify that HappyBase has been installed correctly by importing it with the Python interpreter. If the command exits without an error, the installation succeeded.
```sh
python -c 'import happybase'
```

--------------------------------

### Connect and Mutate Row with HBase Thrift API

Source: https://github.com/python-happybase/happybase/blob/master/doc/faq.md

This example demonstrates the verbose code required to connect to HBase and store two values directly using the Thrift API. It involves multiple imports and setup steps for transport and protocol.

```python
from thrift import Thrift
from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol

from hbase import ttypes
from hbase.Hbase import Client, Mutation

sock = TSocket.TSocket('hostname', 9090)
transport = TTransport.TBufferedTransport(sock)
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = Client(protocol)
transport.open()

mutations = [Mutation(column='family:qual1', value='value1'),
             Mutation(column='family:qual2', value='value2')]
client.mutateRow('table-name', 'row-key', mutations)
```

--------------------------------

### Connect and Put Data with HappyBase

Source: https://github.com/python-happybase/happybase/blob/master/doc/faq.md

This example shows the simplified code for connecting to HBase and storing two values using HappyBase. It abstracts away the complexities of the Thrift API.

```python
import happybase

connection = happybase.Connection('hostname')
table = connection.table('table-name')
table.put('row-key', {'family:qual1': 'value1',
                      'family:qual2': 'value2'})
```

--------------------------------

### Scan Rows with a Start Key

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Iterate over rows starting from a specified row key up to the end of the table.

```python
for key, data in table.scan(row_start=b'aaa'):
    print(key, data)
```

--------------------------------

### Clone HappyBase Repository

Source: https://github.com/python-happybase/happybase/blob/master/doc/development.md

Use this command to get a copy of the HappyBase source code from GitHub.
```sh
$ git clone https://github.com/wbolster/happybase.git
```

--------------------------------

### Scan Rows by Prefix

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Iterate over rows that start with a specific prefix.

```python
for key, data in table.scan(row_prefix=b'abc'):
    print(key, data)
```

--------------------------------

### Obtain Table Instance

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Get a Table instance for interacting with a specific HBase table. This does not create a round-trip to the server and assumes the table exists.

```python
table = connection.table('mytable')
```

--------------------------------

### Retrieve Table Column Families and Regions in Python

Source: https://context7.com/python-happybase/happybase/llms.txt

Use `table.families()` to get column family settings and `table.regions()` to retrieve region information including server and key ranges. Ensure a connection to HBase is established first.

```python
import happybase

connection = happybase.Connection('hbase-host')
table = connection.table('users')

# Retrieve column family settings
families = table.families()
print(families)
# => {
#   b'info':   {'maxVersions': 5, 'compression': 'NONE', 'blockCacheEnabled': True, ...},
#   b'events': {'maxVersions': 1, 'blockCacheEnabled': False, ...},
# }

# Retrieve region information (server + start/stop keys)
regions = table.regions()
for region in regions:
    print(region['name'], region['startKey'], region['endKey'])
```

--------------------------------

### Obtain a Table Handle in HappyBase

Source: https://context7.com/python-happybase/happybase/llms.txt

Get a `Table` instance for interacting with a specific HBase table. Supports prefix bypass for cross-namespace access.
```python
import happybase

connection = happybase.Connection('hbase-host', table_prefix='myapp')

# Gets a handle to "myapp_users" in HBase, referred to simply as "users"
table = connection.table('users')
print(table)  # prints the Table object's repr

# Bypass the prefix to access a table in a different namespace
other_table = connection.table('partner_data', use_prefix=False)
```

--------------------------------

### Retrieve Multiple Versions with Timestamps

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Get multiple versions of a cell along with their timestamps by setting include_timestamp=True.

```python
cells = table.cells(b'row-key', b'cf1:col1', versions=3, include_timestamp=True)
for value, timestamp in cells:
    print("Cell data at {}: {}".format(timestamp, value))
```

--------------------------------

### Table.regions()

Source: https://context7.com/python-happybase/happybase/llms.txt

Retrieves the region distribution for a table. This provides information about each region, including its name, start key, and end key.

```APIDOC
## `Table.regions()`

### Description
Retrieves the region distribution for a table, showing server and key ranges for each region.

### Method
`Table.regions()`

### Parameters
None

### Response
- **regions** (list) - A list of dictionaries, where each dictionary represents a region and contains keys like 'name', 'startKey', and 'endKey'.
```

--------------------------------

### Scan Rows within a Range

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Iterate over rows between a start key (inclusive) and a stop key (exclusive).

```python
for key, data in table.scan(row_start=b'aaa', row_stop=b'xyz'):
    print(key, data)
```

--------------------------------

### Get and Set Atomic Counter

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Retrieve or set a counter value directly using `counter_get` and `counter_set`.
Avoid manual get-modify-set operations; use atomic increment/decrement instead.

```python
print(table.counter_get(b'row-key', b'cf1:counter'))  # prints 5

table.counter_set(b'row-key', b'cf1:counter', 12)
```

--------------------------------

### Atomic Counter Operations

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Details the methods for atomically incrementing, decrementing, getting, and setting counter values associated with a row key and column.

```APIDOC
## Atomic Counter Operations

### Description
HappyBase provides atomic operations for managing 8-byte wide counters stored as big-endian signed integers in HBase. Counters are initialized to 0 on first use. The `counter_inc()` and `counter_dec()` methods return the value after the operation. Direct `counter_get()` and `counter_set()` are also available, but a get followed by a set is not atomic, so they should not be combined to modify a value that other clients may be updating.

### Methods
- `table.counter_inc(row, column, value=1)`
- `table.counter_dec(row, column, value=1)`
- `table.counter_get(row, column)`
- `table.counter_set(row, column, value)`

### Parameters
#### `counter_inc`/`counter_dec` Parameters
- **row** (bytes) - Required - The row key.
- **column** (bytes) - Required - The column name in `family:qualifier` form (e.g., b'cf:counter').
- **value** (int) - Optional - The amount to increment or decrement by (defaults to 1).

#### `counter_get`/`counter_set` Parameters
- **row** (bytes) - Required - The row key.
- **column** (bytes) - Required - The column name in `family:qualifier` form (e.g., b'cf:counter').
- **value** (int) - Required for `counter_set` - The value to set the counter to.
### Request Example
```python
# Incrementing and decrementing
print(table.counter_inc(b'row-key', b'cf1:counter'))           # Output: 1
print(table.counter_inc(b'row-key', b'cf1:counter'))           # Output: 2
print(table.counter_inc(b'row-key', b'cf1:counter', value=3))  # Output: 5
print(table.counter_dec(b'row-key', b'cf1:counter'))           # Output: 4

# Getting and setting directly (use with caution)
print(table.counter_get(b'row-key', b'cf1:counter'))           # Output: 4
table.counter_set(b'row-key', b'cf1:counter', 10)
print(table.counter_get(b'row-key', b'cf1:counter'))           # Output: 10
```

### Response
- **`counter_inc`/`counter_dec`**: Returns the integer value of the counter after the operation.
- **`counter_get`**: Returns the current integer value of the counter.
- **`counter_set`**: Returns `None`.
```

--------------------------------

### Instantiate Connection Pool

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Create a connection pool with a specified size and pass arguments to the underlying `Connection` constructor. The pool establishes one connection immediately and opens others lazily.

```python
pool = happybase.ConnectionPool(size=3, host='...', table_prefix='myproject')
```

--------------------------------

### Run HappyBase Tests

Source: https://github.com/python-happybase/happybase/blob/master/doc/development.md

Execute the test suite using the `make test` command. Test coverage reports are generated in coverage/index.html.

```sh
make test
```

--------------------------------

### Create and Activate Virtual Environment

Source: https://github.com/python-happybase/happybase/blob/master/doc/installation.md

Use virtualenv to set up and activate a new Python virtual environment for isolating project dependencies.
```sh
virtualenv envname
source envname/bin/activate
```

```sh
mkvirtualenv envname
```

--------------------------------

### Connect to HBase Thrift Server

Source: https://context7.com/python-happybase/happybase/llms.txt

Establish a connection to the HBase Thrift server using various configuration options. Manual connect/disconnect is also supported.

```python
import happybase

# Basic connection (auto-connects on construction)
connection = happybase.Connection('hbase-host')

# Full options: custom port, timeout, HBase compat level, transport/protocol
connection = happybase.Connection(
    host='hbase-host',
    port=9090,
    timeout=5000,                  # socket timeout in ms
    autoconnect=True,
    table_prefix='myapp',          # prepends "myapp_" to all table names
    table_prefix_separator=b'_',
    compat='0.98',                 # HBase compatibility mode
    transport='buffered',          # 'buffered' or 'framed'
    protocol='binary',             # 'binary' or 'compact'
)

# Manual connect/disconnect
connection2 = happybase.Connection('hbase-host', autoconnect=False)
connection2.open()
# ... use connection2 ...
connection2.close()

# List tables visible under the current prefix
print(connection.tables())
# => [b'users', b'events', b'counters']
```

--------------------------------

### Create an HBase Table

Source: https://context7.com/python-happybase/happybase/llms.txt

Create a new HBase table with specified column families and their configurations, such as max versions, TTL, and compression. Verify creation and inspect families.
```python
import happybase

connection = happybase.Connection('hbase-host')

connection.create_table(
    'users',
    {
        'info': dict(max_versions=5),                             # keep 5 versions
        'events': dict(max_versions=1, block_cache_enabled=False),
        'meta': dict(time_to_live=86400, compression='SNAPPY'),   # TTL in seconds
        'cf_defaults': dict(),                                    # use HBase defaults
    }
)

# Verify the table was created
print(connection.tables())
# => [b'users']

# Inspect the column families
table = connection.table('users')
print(table.families())
# => {b'info': {'maxVersions': 5, ...}, b'events': {...}, ...}
```

--------------------------------

### Obtain Connection using Context Manager

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Acquire a connection from the pool using a `with` statement to ensure it is returned automatically. Avoid using the connection object after the `with` block exits.

```python
pool = happybase.ConnectionPool(size=3, host='...')

with pool.connection() as connection:
    print(connection.tables())
```

--------------------------------

### Batch Operations with Timestamp

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Demonstrates how to use the Batch class to perform multiple mutations with a single timestamp. The Batch instance can be used as a context manager.

```APIDOC
## Batch Operations with Timestamp

### Description
Use the `Batch` class to group multiple mutations (puts and deletes) and apply them with a single timestamp. This can be managed using Python's `with` statement for automatic sending upon block exit.

### Method
`table.batch(timestamp=None)`

### Parameters
#### Query Parameters
- **timestamp** (int) - Optional - A single timestamp to apply to all mutations in the batch.
### Request Example
```python
# Using a specific timestamp
b = table.batch(timestamp=123456789)
b.put(b'row-key-1', {b'cf:col': b'value'})
b.delete(b'row-key-2')  # a different row: mixing put and delete on one row key is unpredictable
b.send()

# Using a context manager (mutations are sent automatically on exit)
with table.batch() as b:
    b.put(b'row-key-1', {b'cf:col1': b'value1'})
    b.delete(b'row-key-2')
```

### Response
`table.batch()` returns a `Batch` instance; the mutations are applied to HBase when the batch is sent.
```

--------------------------------

### Establish HBase Connection

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Connect to an HBase cluster using the default Thrift server. The connection is automatically opened upon instantiation.

```python
import happybase

connection = happybase.Connection('somehost')
```

--------------------------------

### Connection.create_table()

Source: https://context7.com/python-happybase/happybase/llms.txt

Creates a new HBase table with specified column families and their configurations.

```APIDOC
## Connection.create_table()

### Description
Creates a new HBase table with one or more column families and their configuration options.

### Method
`create_table(name, families)`

### Parameters
- **name** (str): The name of the table to create.
- **families** (dict): A dictionary where keys are column family names and values are dictionaries of column family configurations (e.g., `max_versions`, `time_to_live`, `compression`). Pass an empty `dict()` for a family to use the HBase defaults.

### Request Example
```python
import happybase

connection = happybase.Connection('hbase-host')

connection.create_table(
    'users',
    {
        'info': dict(max_versions=5),
        'events': dict(max_versions=1, block_cache_enabled=False),
        'meta': dict(time_to_live=86400, compression='SNAPPY'),
    }
)
```

### Response
- **Success**: Table is created in HBase.
- **Error**: Raises an exception if the table already exists or if there's an issue during creation.
```

--------------------------------

### Batch Operations using Context Manager

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Simplify batch operations using Python's `with` statement, which sends the batch automatically when the block ends, even if an error occurred inside it.

```python
with table.batch() as b:
    b.put(b'row-key-1', {b'cf:col1': b'value1', b'cf:col2': b'value2'})
    b.put(b'row-key-2', {b'cf:col2': b'value2', b'cf:col3': b'value3'})
    b.put(b'row-key-3', {b'cf:col3': b'value3', b'cf:col4': b'value4'})
    b.delete(b'row-key-4')
```

--------------------------------

### Get current value of a counter

Source: https://context7.com/python-happybase/happybase/llms.txt

Retrieve the current value of a counter column without modifying it using `counter_get()`. This is useful for monitoring or conditional logic.

```python
print(table.counter_get(b'page:/home', b'hits:count'))
# => 7
```

--------------------------------

### Connection Class

Source: https://context7.com/python-happybase/happybase/llms.txt

The `Connection` class is the primary entry point for establishing a session with the HBase Thrift server. It allows for configuration of host, port, timeouts, and transport/protocol settings.

```APIDOC
## Connection

### Description
Establishes a connection to the HBase Thrift server and provides methods for table administration.
### Usage
```python
import happybase

# Basic connection
connection = happybase.Connection('hbase-host')

# Full options
connection = happybase.Connection(
    host='hbase-host',
    port=9090,
    timeout=5000,
    autoconnect=True,
    table_prefix='myapp',
    table_prefix_separator=b'_',
    compat='0.98',
    transport='buffered',
    protocol='binary'
)

# Manual connect/disconnect
connection2 = happybase.Connection('hbase-host', autoconnect=False)
connection2.open()
connection2.close()

# List tables
print(connection.tables())
```
```

--------------------------------

### Configure Test Environment Variables

Source: https://github.com/python-happybase/happybase/blob/master/doc/development.md

Set these environment variables before running tests if your Thrift server is not on localhost or if you need to specify a different port.

```sh
export HAPPYBASE_HOST=host.example.org
export HAPPYBASE_PORT=9091
```

--------------------------------

### Connection.table()

Source: https://context7.com/python-happybase/happybase/llms.txt

Obtains a `Table` instance for interacting with a specific HBase table.

```APIDOC
## Connection.table()

### Description
Returns a `Table` instance for interacting with a specific HBase table. Supports optional prefix bypass for cross-namespace access.

### Method
`table(name, use_prefix=True)`

### Parameters
- **name** (str): The name of the table.
- **use_prefix** (bool, optional): If True, the `table_prefix` configured in the `Connection` is prepended to the table name. Defaults to True.

### Request Example
```python
import happybase

connection = happybase.Connection('hbase-host', table_prefix='myapp')

# Gets a handle to "myapp_users" in HBase
table = connection.table('users')
print(table)

# Bypass prefix to access a table in a different namespace
other_table = connection.table('partner_data', use_prefix=False)
```

### Response
- **Success**: Returns a `happybase.table.Table` object.
- **Error**: None at this point. Obtaining a `Table` makes no round-trip to the server, so a missing table only surfaces as an error when the table is first used.
```

--------------------------------

### Batch Operations with Batch Size

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Explains how to use the `batch_size` argument to automatically send mutations in chunks when a certain number of operations are pending, useful for large datasets.

```APIDOC
## Batch Operations with Batch Size

### Description
Specify `batch_size` when creating a `Batch` instance to automatically send pending mutations to the server once their number reaches this threshold. Each cell written counts as one mutation, so a `put()` with two columns adds two. This is useful for handling large datasets without consuming excessive memory or making excessively large round-trips.

### Method
`table.batch(batch_size=None)`

### Parameters
#### Query Parameters
- **batch_size** (int) - Optional - The maximum number of pending mutations before the batch is sent automatically.

### Request Example
```python
# Automatically sends the batch whenever 1000 mutations are pending
with table.batch(batch_size=1000) as b:
    for i in range(1200):
        # each put() adds two mutations (two cells)
        b.put(b'row-%04d' % i, {
            b'cf1:col1': b'v1',
            b'cf1:col2': b'v2',
        })
# 1200 puts of two cells each are 2400 mutations: two automatic sends of
# 1000 mutations, then a final send of the remaining 400 when the block exits.
```

### Response
No direct response is documented for the batch creation itself, but mutations are applied to HBase in chunks as the `batch_size` is reached.
```

--------------------------------

### Configure Connection with Table Prefix

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Use the `table_prefix` argument in the `Connection` constructor to automatically prepend a namespace to all table names. This simplifies managing tables across different applications sharing a single HBase instance.
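To make the naming rule concrete, here is a small standalone sketch of how a prefixed table name could be derived. `full_table_name` is a hypothetical helper written for illustration, not a HappyBase function; it assumes the `_` separator shown elsewhere in this document and imitates the `use_prefix=False` bypass described for `Connection.table()`.

```python
def full_table_name(name, table_prefix=None, separator='_', use_prefix=True):
    """Return the name HBase would see for a logical table name.

    Hypothetical helper for illustration only; HappyBase performs an
    equivalent step internally when `table_prefix` is set.
    """
    if table_prefix is None or not use_prefix:
        return name
    return table_prefix + separator + name


# Logical name 'users' under the 'myproject' prefix
print(full_table_name('users', table_prefix='myproject'))

# Bypassing the prefix leaves the name untouched
print(full_table_name('partner_data', table_prefix='myproject', use_prefix=False))
```

In HappyBase itself the prefix is configured once on the connection: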
```python
connection = happybase.Connection('somehost', table_prefix='myproject')
```

--------------------------------

### Minimize Work within Connection Context

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Keep operations within the `with` block to an absolute minimum, typically only for data loading. Process the loaded data outside the `with` block to return the connection to the pool quickly.

```python
with pool.connection() as connection:
    table = connection.table('table-name')
    row = table.row(b'row-key')

# The connection is already back in the pool at this point
process_data(row)
```

--------------------------------

### Table.families()

Source: https://context7.com/python-happybase/happybase/llms.txt

Retrieves the column family configuration for a table. This includes settings like maxVersions, compression, and blockCacheEnabled for each family.

```APIDOC
## `Table.families()`

### Description
Retrieves the column family configuration for a table.

### Method
`Table.families()`

### Parameters
None

### Response
- **families** (dict) - A dictionary where keys are column family names (bytes) and values are dictionaries of family configurations.
```

--------------------------------

### Basic batch mutation with explicit send

Source: https://context7.com/python-happybase/happybase/llms.txt

Aggregate multiple puts and deletes into a single Thrift round-trip using `table.batch()`. Explicitly call `send()` to execute the mutations.

```python
import happybase

connection = happybase.Connection('hbase-host')
table = connection.table('users')

b = table.batch()
b.put(b'user-010', {b'info:name': b'Carol', b'info:email': b'carol@example.com'})
b.put(b'user-011', {b'info:name': b'Dave', b'info:email': b'dave@example.com'})
b.delete(b'user-old')
b.send()
```

--------------------------------

### Table.batch() / Batch

Source: https://context7.com/python-happybase/happybase/llms.txt

Aggregates multiple puts and deletes into a single Thrift round-trip.
Supports context manager use, transactional mode, and automatic flushing.

```APIDOC
## Table.batch() / Batch

### Description
Aggregates multiple puts and deletes into a single Thrift round-trip. Supports context manager use, transactional mode (nothing is sent if the `with` block raises), and automatic flushing at a configurable `batch_size`.

### Method
`batch(batch_size=None, transaction=False, timestamp=None)`

### Parameters
#### Path Parameters
- **batch_size** (int, optional) - If set, the batch will automatically flush every `batch_size` mutations.
- **transaction** (bool, optional) - If True, the batch will only be sent if no exception is raised within the `with` block.
- **timestamp** (int, optional) - If set, all mutations in the batch will be associated with this timestamp.

### Batch Methods
- `put(key, mapping)`: Adds a put operation to the batch.
- `delete(key, columns=None)`: Adds a delete operation to the batch.
- `send()`: Explicitly sends the mutations in the batch.
```

--------------------------------

### Test HBase 0.90 Compatibility

Source: https://github.com/python-happybase/happybase/blob/master/doc/development.md

To test against HBase 0.90 compatibility mode, set the HAPPYBASE_COMPAT environment variable.

```sh
export HAPPYBASE_COMPAT=0.90
```

--------------------------------

### Create HBase Table with Column Families

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Create a new table in HBase with specified column families and their configurations. Use the Connection object to perform this operation.

```python
connection.create_table(
    'mytable',
    {'cf1': dict(max_versions=10),
     'cf2': dict(max_versions=1, block_cache_enabled=False),
     'cf3': dict(),  # use defaults
    }
)
```

--------------------------------

### Fetch Multiple Rows with HappyBase

Source: https://context7.com/python-happybase/happybase/llms.txt

Efficiently retrieve a batch of rows by their keys in a single Thrift call. Results are returned as (key, data) tuples.
The result can be converted to a dictionary for lookup by key, or to an OrderedDict when the returned order must be preserved.

```python
import happybase
from collections import OrderedDict

connection = happybase.Connection('hbase-host')
table = connection.table('users')

# Fetch multiple rows at once
rows = table.rows([b'user-001', b'user-002', b'user-003'])
for key, data in rows:
    print(key, data[b'info:name'])

# Convert to a plain dict (insertion order is only guaranteed on Python 3.7+)
rows_dict = dict(table.rows([b'user-001', b'user-002']))

# Convert to OrderedDict to preserve server-returned order
rows_ordered = OrderedDict(table.rows([b'user-001', b'user-002']))

# With column filter and timestamps
rows = table.rows(
    [b'user-001', b'user-002'],
    columns=[b'info:name', b'info:email'],
    include_timestamp=True,
)
for key, data in rows:
    name_val, name_ts = data[b'info:name']
    print(f"{key}: {name_val} @ {name_ts}")
```

--------------------------------

### Table.rows()

Source: https://context7.com/python-happybase/happybase/llms.txt

Fetches a batch of rows by key in a single Thrift call. Supports column filtering and including timestamps.

```APIDOC
## `Table.rows()`: Retrieve Multiple Rows

Fetches a batch of rows by key in a single Thrift call, returning `(key, data)` tuples. Supports column filtering and including timestamps.

### Parameters
- **keys** (list of bytes): A list of row keys to fetch.
- **columns** (list of bytes, optional): A list of column names (e.g., b'family:qualifier') to retrieve for each row.
- **include_timestamp** (bool, optional): If True, include the timestamp with each value. The return format for values will be a tuple `(value, timestamp)`.

### Return Value
A list of `(key, data)` tuples, where `data` is a dictionary of column values for that row. If `include_timestamp` is True, the values in the `data` dictionary are tuples of `(value, timestamp)`.
### Example
```python
from collections import OrderedDict

# Fetch multiple rows at once
rows = table.rows([b'user-001', b'user-002', b'user-003'])
for key, data in rows:
    print(key, data[b'info:name'])

# Convert to a plain dict (insertion order is only guaranteed on Python 3.7+)
rows_dict = dict(table.rows([b'user-001', b'user-002']))

# Convert to OrderedDict to preserve server-returned order
rows_ordered = OrderedDict(table.rows([b'user-001', b'user-002']))

# With column filter and timestamps
rows = table.rows(
    [b'user-001', b'user-002'],
    columns=[b'info:name', b'info:email'],
    include_timestamp=True,
)
for key, data in rows:
    name_val, name_ts = data[b'info:name']
    print(f"{key}: {name_val} @ {name_ts}")
```
```

--------------------------------

### Connection.compact_table()

Source: https://context7.com/python-happybase/happybase/llms.txt

Triggers a compaction for a given HBase table.

```APIDOC
## Connection.compact_table()

### Description
Triggers a compaction for a given HBase table. Compaction is an internal HBase process to merge data files.

### Method
`compact_table(name, major=False)`

### Parameters
- **name** (str): The name of the table to compact.
- **major** (bool, optional): If True, performs a major compaction; otherwise, performs a minor compaction. Defaults to False.

### Request Example
```python
import happybase

connection = happybase.Connection('hbase-host')

# Minor compaction
connection.compact_table('users')

# Major compaction
connection.compact_table('users', major=True)
```

### Response
- **Success**: Compaction process is initiated.
- **Error**: Raises an exception if the table does not exist or if compaction fails.
```

--------------------------------

### Table.counter_inc() / counter_dec() / counter_get() / counter_set()

Source: https://context7.com/python-happybase/happybase/llms.txt

Provides atomic, server-side increment and decrement operations on 64-bit signed integer counter columns. Counters are auto-initialized to 0 on first use.
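As a rough illustration of what a 64-bit signed counter looks like as raw bytes, the sketch below packs and unpacks values with Python's `struct` module. It assumes the 8-byte big-endian layout described in the user guide section on atomic counters; `encode_counter` and `decode_counter` are hypothetical helper names, not part of HappyBase.

```python
import struct


def encode_counter(value):
    """Pack an integer as an 8-byte big-endian signed value."""
    return struct.pack('>q', value)


def decode_counter(raw):
    """Unpack an 8-byte big-endian signed value into an int."""
    return struct.unpack('>q', raw)[0]


raw = encode_counter(5)
print(raw)                                 # b'\x00\x00\x00\x00\x00\x00\x00\x05'
print(decode_counter(raw))                 # 5
print(decode_counter(encode_counter(-1)))  # -1
```

A decoder like this may be handy when a counter cell turns up as raw bytes, for example in the result of `table.row()`, instead of being read through `counter_get()`.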
```APIDOC
## Table.counter_inc() / counter_dec() / counter_get() / counter_set()

### Description
Provides atomic, server-side increment and decrement operations on 64-bit signed integer counter columns. Counters are auto-initialized to 0 on first use.

### Methods
- `counter_inc(key, column, value=1)`: Atomically increments the counter by `value` (default is 1) and returns the new value.
- `counter_dec(key, column, value=1)`: Atomically decrements the counter by `value` (default is 1) and returns the new value.
- `counter_get(key, column)`: Reads the current value of the counter without modifying it.
- `counter_set(key, column, value)`: Directly sets the counter to a specific `value`. Use with caution in concurrent contexts.
```

--------------------------------

### Establish HBase Connection Manually

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Disable automatic connection opening and open the connection to HBase manually. This is useful when the connection should be created up front but only opened right before its first use.

```python
connection = happybase.Connection('somehost', autoconnect=False)

# before first use:
connection.open()
```

--------------------------------

### Batch Operations with Size Limit

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Handle large datasets by specifying `batch_size` to automatically send mutations when a threshold is reached, preventing memory issues and large round-trips.

```python
with table.batch(batch_size=1000) as b:
    for i in range(1200):
        # this put() will result in two mutations (two cells)
        b.put(b'row-%04d' % i, {
            b'cf1:col1': b'v1',
            b'cf1:col2': b'v2',
        })
```

--------------------------------

### Batch mutations using context manager

Source: https://context7.com/python-happybase/happybase/llms.txt

Utilize `table.batch()` as a context manager for automatic sending of mutations upon exiting the `with` block. This ensures all operations are sent, even if errors occur.
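The sending rules described for batches can be imitated without an HBase server. The toy class below is a hypothetical in-memory stand-in, not HappyBase code: it records how many mutations each simulated round-trip would carry. It assumes that every cell counts as one mutation and that a batch is flushed as soon as the pending count reaches `batch_size`.

```python
class ToyBatch:
    """In-memory stand-in for a batch; hypothetical, not HappyBase code."""

    def __init__(self, batch_size=None, transaction=False):
        self.batch_size = batch_size
        self.transaction = transaction
        self.pending = []      # mutations waiting to be sent
        self.sent_sizes = []   # size of each simulated round-trip

    def put(self, row, data):
        # Assumption: each cell counts as one mutation.
        self.pending.extend((row, column, value) for column, value in data.items())
        if self.batch_size and len(self.pending) >= self.batch_size:
            self.send()

    def send(self):
        if self.pending:
            self.sent_sizes.append(len(self.pending))
            self.pending = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if self.transaction and exc_type is not None:
            self.pending = []   # transactional: discard everything on error
        else:
            self.send()         # otherwise send whatever is still pending
        return False            # never swallow the exception


def run(batch, fail=False):
    """Put one cell, optionally fail, and report the simulated round-trips."""
    try:
        with batch as b:
            b.put(b'row-1', {b'cf:col1': b'v1'})
            if fail:
                raise ValueError('simulated failure')
    except ValueError:
        pass
    return batch.sent_sizes


print(run(ToyBatch(), fail=True))                  # [1]
print(run(ToyBatch(transaction=True), fail=True))  # []

sized = ToyBatch(batch_size=1000)
with sized as b:
    for i in range(1200):
        b.put(b'row-%04d' % i, {b'cf1:col1': b'v1', b'cf1:col2': b'v2'})
print(sized.sent_sizes)                            # [1000, 1000, 400]
```

With a real table, the plain context-manager form looks like this: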
```python
with table.batch() as b:
    for i in range(100):
        b.put(b'row-%04d' % i, {b'cf:val': b'data-%d' % i})
```

--------------------------------

### Connection.enable_table()

Source: https://context7.com/python-happybase/happybase/llms.txt

Enables a disabled HBase table.

```APIDOC
## Connection.enable_table()

### Description
Enables a disabled HBase table, making it available for read and write operations.

### Method
`enable_table(name)`

### Parameters
- **name** (str): The name of the table to enable.

### Request Example
```python
import happybase

connection = happybase.Connection('hbase-host')
connection.enable_table('archive')
```

### Response
- **Success**: Table is enabled.
- **Error**: Raises an exception if the table does not exist or is already enabled.
```

--------------------------------

### Transactional Batch Operations

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Shows how to use the `transaction=True` argument with `Table.batch()` to ensure that all mutations are applied only if no exceptions occur within the `with` block.

```APIDOC
## Transactional Batch Operations

### Description
When `transaction=True` is set for a `Batch` instance used as a context manager, the batch will only be sent to HBase if the `with` block completes without raising any exceptions. If an exception occurs, no mutations are sent.

### Method
`table.batch(transaction=True)`

### Parameters
#### Query Parameters
- **transaction** (bool) - Optional (defaults to `False`) - If `True`, enables transactional behavior for the batch.

### Request Example
```python
try:
    with table.batch(transaction=True) as b:
        b.put(b'row-key-1', {b'cf:col1': b'value1'})
        # Simulate an error
        raise ValueError("Something went wrong!")
except ValueError:
    # Error handling: no data is sent to HBase
    pass

# If no error occurred, the transaction would succeed and data would be sent.
```

### Response
No direct response is documented for the batch creation itself, but mutations are applied to HBase upon successful transaction completion.
```

--------------------------------

### Batch Operations with Timestamp

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Use Batch for multiple mutations with an optional timestamp for the entire batch. Storing and deleting data for the same row key in a single batch leads to unpredictable results.

```python
b = table.batch(timestamp=123456789)
b.put(...)
b.delete(...)
b.send()
```

--------------------------------

### Manage HBase Table Lifecycle

Source: https://context7.com/python-happybase/happybase/llms.txt

Control the state of HBase tables, including checking if enabled, disabling, deleting, and re-enabling. Compaction can also be triggered.

```python
import happybase

connection = happybase.Connection('hbase-host')

# Check if enabled
print(connection.is_table_enabled('users'))  # => True

# Disable, then delete manually
connection.disable_table('users')
connection.delete_table('users')

# Or use disable=True shortcut
connection.delete_table('old_table', disable=True)

# Re-enable a disabled table
connection.enable_table('archive')

# Trigger compaction
connection.compact_table('users')              # minor compaction
connection.compact_table('users', major=True)  # major compaction
```

--------------------------------

### List Available HBase Tables

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Retrieve a list of all tables available in the connected HBase cluster. This method is part of the Connection API.

```python
print(connection.tables())
```

--------------------------------

### Scan All Rows in a Table

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Iterate over all rows in a table using a basic scan. Be cautious as full table scans can be expensive.
```python
for key, data in table.scan():
    print(key, data)
```

--------------------------------

### Retrieve Specific Columns from a Row

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Specify exact columns to retrieve from a row for performance. Use a list of column names (family:qualifier).

```python
row = table.row(b'row-key', columns=[b'cf1:col1', b'cf1:col2'])
print(row[b'cf1:col1'])
print(row[b'cf1:col2'])
```

--------------------------------

### Batch mutations with timestamp

Source: https://context7.com/python-happybase/happybase/llms.txt

Apply a specific timestamp to all mutations within a batch by providing the `timestamp` argument to `table.batch()`. This allows for versioning of data.

```python
with table.batch(timestamp=1700000000000) as b:
    b.put(b'user-030', {b'info:name': b'Eve'})
    b.delete(b'user-029', columns=[b'info:temp'])
```

--------------------------------

### Table.row()

Source: https://context7.com/python-happybase/happybase/llms.txt

Fetches all columns for a single row key from an HBase table.

```APIDOC
## Table.row()

### Description
Fetches all columns for a single row key from an HBase table. Optionally filters by timestamp and returns `(value, timestamp)` tuples.

### Method
`row(key, columns=None, timestamp=None, include_timestamp=False)`

### Parameters
- **key** (bytes): The row key to retrieve.
- **columns** (list of bytes, optional): A list of column names (family:qualifier) to fetch. If None, all columns are fetched.
- **timestamp** (int, optional): Retrieves data from a specific timestamp.
- **include_timestamp** (bool, optional): If True, returns `(value, timestamp)` tuples for each column. Defaults to False.

### Request Example
```python
import happybase

connection = happybase.Connection('hbase-host')
table = connection.table('users')

# Fetch all columns for a row
row = table.row(b'user-001')
print(row)

# Fetch specific columns
user_info = table.row(b'user-001', columns=[b'info:name', b'events:login'])

# Include each cell's timestamp in the result
user_data_with_ts = table.row(b'user-001', include_timestamp=True)
```

### Response
- **Success**: Returns a dictionary where keys are column names (bytes) and values are the cell values (bytes). If `include_timestamp` is True, values are `(value, timestamp)` tuples. A row key that does not exist is not an error; it yields an empty dictionary.
- **Error**: Raises an exception if there's an issue accessing the table.
```

--------------------------------

### List Tables with Table Prefix

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

When a `table_prefix` is set, `Connection.tables()` only returns tables matching the prefix and removes the prefix from the results. This ensures you only see tables relevant to your application's namespace.

```python
print(connection.tables())  # Table "myproject_XYZ" in HBase will be
                            # returned as simply "XYZ"
```

--------------------------------

### Auto-flushing batch mutations

Source: https://context7.com/python-happybase/happybase/llms.txt

Configure `table.batch()` with `batch_size` to automatically flush mutations when the specified size is reached. This is useful for very large numbers of mutations.

```python
with table.batch(batch_size=1000) as b:
    for i in range(5000):
        b.put(b'bulk-%06d' % i, {
            b'data:col1': b'value1',
            b'data:col2': b'value2',
        })
# => 5000 puts x 2 columns = 10000 mutations, so this results in
#    10 automatic flushes of 1000 mutations each
```

--------------------------------

### ConnectionPool

Source: https://context7.com/python-happybase/happybase/llms.txt

Manages a fixed pool of Connection objects for use in multi-threaded applications.
Connections are checked out via a context manager and automatically returned.

```APIDOC
## ConnectionPool

### Description
Manages a fixed pool of `Connection` objects for use in multi-threaded applications. Connections are checked out via a context manager and automatically returned (and replaced if broken) after use.

### Initialization
`ConnectionPool(size, host='localhost', port=9090, timeout=None, autoconnect=True, table_prefix=None, compat='0.98', transport='buffered', protocol='binary', **connection_kwargs)`

### Methods
- `connection(timeout=None)`: Returns a connection from the pool using a context manager. Raises `NoConnectionsAvailable` if no connection is available within the specified `timeout`.
```

--------------------------------

### Nested Connection Acquisition

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

A thread holds at most one connection from the pool at a time; requesting a connection while one is already held returns the same instance. This simplifies nested calls that require database access.

```python
pool = happybase.ConnectionPool(size=3, host='...')

def do_something_else():
    with pool.connection() as connection:
        pass  # use the connection here

with pool.connection() as connection:
    # use the connection here, e.g.
    print(connection.tables())

    # call another function that uses a connection
    do_something_else()
```

--------------------------------

### Store Data with HappyBase

Source: https://context7.com/python-happybase/happybase/llms.txt

Write one or more column values for a specific row. Supports explicit timestamps and disabling the Write-Ahead Log (WAL) for performance at the risk of data loss. Suitable for storing binary data.
```python
import happybase
import struct

connection = happybase.Connection('hbase-host')
table = connection.table('users')

# Write multiple columns for a row
table.put(b'user-001', {
    b'info:name': b'Alice',
    b'info:email': b'alice@example.com',
    b'info:age': b'30',
})

# Write with an explicit timestamp (milliseconds)
table.put(b'user-001', {b'info:name': b'Alice Updated'}, timestamp=1700123456789)

# Disable Write-Ahead Log for higher throughput (risk: data loss on crash)
table.put(b'user-002', {b'info:name': b'Bob'}, wal=False)

# Store binary data (e.g., a packed integer)
table.put(b'metrics-001', {
    b'data:count': struct.pack('>q', 42),
})
```

--------------------------------

### Table.row()

Source: https://context7.com/python-happybase/happybase/llms.txt

Fetches a single row by its key. Supports fetching specific columns, filtering by timestamp, and including timestamps in the result.

```APIDOC
## `Table.row()` - Fetch a Single Row

Fetches a single row by its key. Supports fetching specific columns, filtering by timestamp, and including timestamps in the result.

### Parameters
- **key** (bytes): The row key to fetch.
- **columns** (list of bytes, optional): A list of column names (e.g., b'family:qualifier') to retrieve. If not specified, all columns for the row are returned.
- **timestamp** (int, optional): Fetch data at or before this timestamp (in milliseconds).
- **include_timestamp** (bool, optional): If True, include the timestamp with each value. The return format for values will be a tuple `(value, timestamp)`.

### Return Value
A dictionary where keys are column names and values are the corresponding cell data. If `include_timestamp` is True, values are tuples of `(value, timestamp)`.

### Example
```python
# Fetch specific columns only
row = table.row(b'user-001', columns=[b'info:name', b'info:email'])
print(row[b'info:name'])  # => b'Alice'

# Fetch an entire column family
row = table.row(b'user-001', columns=[b'info'])

# Restrict to data at or before a given timestamp
row = table.row(b'user-001', timestamp=1700000000000)

# Include timestamps in result
row = table.row(b'user-001', columns=[b'info:name'], include_timestamp=True)
value, ts = row[b'info:name']
print(value, ts)  # => b'Alice' 1700123456789

# Non-existent row returns empty dict
missing = table.row(b'no-such-key')
print(missing)  # => {}
```
```

--------------------------------

### Table.scan()

Source: https://context7.com/python-happybase/happybase/llms.txt

Returns a generator over matching rows based on various criteria. Supports range scans, prefix scans, column filters, server-side filters, result limits, and reverse scans.

```APIDOC
## `Table.scan()` - Scan Rows with Range / Prefix / Filter

Returns a generator over matching rows. Supports start/stop keys, prefix scans, column filters, server-side filter strings, result limits, and reverse scans.

### Parameters
- **row_start** (bytes, optional): The starting row key (inclusive).
- **row_stop** (bytes, optional): The stopping row key (exclusive).
- **row_prefix** (bytes, optional): Scan rows with this prefix.
- **columns** (list of bytes, optional): A list of column names (e.g., b'family:qualifier') to retrieve.
- **timestamp** (int, optional): Fetch data at or before this timestamp (in milliseconds).
- **limit** (int, optional): The maximum number of rows to return.
- **reverse** (bool, optional): If True, scan in reverse order (requires HBase 0.98+).
- **filter** (bytes, optional): A server-side filter string (HBase 0.92+).
- **batch_size** (int, optional): The number of rows to fetch in each batch from the server.
- **sorted_columns** (bool, optional): If True, columns in the returned data dictionary will be sorted by qualifier.

### Return Value
A generator yielding `(key, data)` tuples, where `data` is a dictionary of column values for that row.

### Example
```python
# Full table scan (expensive; use only for small tables)
for key, data in table.scan():
    print(key, data)

# Range scan: rows from b'2024-01-01' (inclusive) to b'2024-02-01' (exclusive)
for key, data in table.scan(row_start=b'2024-01-01', row_stop=b'2024-02-01'):
    print(key, data[b'cf:type'])

# Prefix scan: all rows starting with b'user-'
for key, data in table.scan(row_prefix=b'user-'):
    print(key)

# Column filter + timestamp + limit
for key, data in table.scan(
    row_prefix=b'order-',
    columns=[b'info:status', b'info:amount'],
    timestamp=1700000000000,
    limit=100,
):
    print(key, data)

# Server-side filter string (HBase 0.92+)
for key, data in table.scan(
    filter=b"SingleColumnValueFilter('info', 'status', =, 'binary:ACTIVE')",
):
    print(key)

# Reverse scan (HBase 0.98+): row_start must be lexicographically after row_stop
for key, data in table.scan(row_start=b'user-z', row_stop=b'user-a', reverse=True):
    print(key)

# Sorted columns with controlled batch size
from collections import OrderedDict
for key, data in table.scan(row_prefix=b'prod-', sorted_columns=True, batch_size=500):
    assert isinstance(data, OrderedDict)
```
```

--------------------------------

### Access Table with Table Prefix

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

The `Connection.table()` method also respects the `table_prefix`. When you request a table, HappyBase automatically prepends the prefix to the name before interacting with HBase.

```python
table = connection.table('XYZ')  # Operates on myproject_XYZ in HBase
```

--------------------------------

### Test Framed Thrift Transport

Source: https://github.com/python-happybase/happybase/blob/master/doc/development.md

To test the framed Thrift transport mode, set the HAPPYBASE_TRANSPORT environment variable.
```sh
export HAPPYBASE_TRANSPORT=framed
```

--------------------------------

### Storing Data

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Stores one or more cells in a single row of a table. The data should be a dictionary mapping column names to values. Optionally, a timestamp can be provided.

```APIDOC
## Table.put()

### Description
Stores one or more cells in a single row of a table. The data should be a dictionary mapping column names to values. Optionally, a timestamp can be provided.

### Method
`put(row, data, timestamp=None)`

### Parameters
#### Path Parameters
- **row** (bytes) - Required - The row key to store data in.
- **data** (dict) - Required - A dictionary mapping column names (bytes) to values (bytes).
- **timestamp** (int) - Optional - The timestamp to associate with the data. If omitted, HBase defaults to the current system time.

### Request Example
```python
table.put(b'row-key', {b'cf:col1': b'value1', b'cf:col2': b'value2'})
table.put(b'row-key', {b'cf:col1': b'value1'}, timestamp=123456789)
```
```

--------------------------------

### Scan Rows with a Stop Key

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Iterate over rows from the beginning of the table up to, but not including, a specified stop key.

```python
for key, data in table.scan(row_stop=b'xyz'):
    print(key, data)
```

--------------------------------

### Transactional Batch Operations

Source: https://github.com/python-happybase/happybase/blob/master/doc/user.md

Ensure transactional behavior for batches by using the 'transaction=True' argument with a context manager. The batch is only applied if no exceptions occur.
```python
try:
    with table.batch(transaction=True) as b:
        b.put(b'row-key-1', {b'cf:col1': b'value1', b'cf:col2': b'value2'})
        b.put(b'row-key-2', {b'cf:col2': b'value2', b'cf:col3': b'value3'})
        b.put(b'row-key-3', {b'cf:col3': b'value3', b'cf:col4': b'value4'})
        b.delete(b'row-key-4')
        raise ValueError("Something went wrong!")
except ValueError:
    # error handling goes here; nothing is sent to HBase
    pass

# when no error occurred, the transaction succeeded
```
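
The difference between a transactional and a plain batch context manager can be tried without an HBase server. The sketch below is only an illustration of the behavior described above: `ToyBatch`, its `sink` list, and the `run()` helper are hypothetical names invented for this example and are not part of HappyBase, nor a copy of its implementation.

```python
class ToyBatch:
    """Hypothetical stand-in for a batch; this is NOT HappyBase's Batch class."""

    def __init__(self, sink, transaction=False):
        self._sink = sink                # plain list that plays the role of HBase
        self._transaction = transaction
        self._pending = []               # mutations queued but not yet sent

    def put(self, row, data):
        self._pending.append(('put', row, data))

    def delete(self, row):
        self._pending.append(('delete', row))

    def send(self):
        # Hand all queued mutations to the sink and clear the queue.
        self._sink.extend(self._pending)
        self._pending = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Transactional mode: discard the queue if the block raised.
        if self._transaction and exc_type is not None:
            return False                 # nothing sent; exception propagates
        self.send()
        return False                     # never swallow exceptions


def run(transaction):
    """Queue one put, raise inside the block, return what reached the sink."""
    sink = []
    try:
        with ToyBatch(sink, transaction=transaction) as b:
            b.put(b'row-key-1', {b'cf:col1': b'value1'})
            raise ValueError("Something went wrong!")
    except ValueError:
        pass
    return sink


print(run(transaction=True))   # → []
print(run(transaction=False))  # → [('put', b'row-key-1', {b'cf:col1': b'value1'})]
```

With `transaction=True` the queued put never reaches the sink, while the plain batch still sends what was queued before the exception, which is why mixing partial writes and error handling calls for the transactional form.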