### Run Example Tests with Pytest

Source: https://bionumpy.github.io/bionumpy/developer_guide/making_examples.html

Use this command to run your example tests locally during development. Ensure your example file ends with '_example.py' and contains functions starting with 'test_'.

```bash
pytest scripts/your_example.py
```
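
As a sketch of what such a file might contain, the hypothetical `scripts/reverse_complement_example.py` below defines one helper and one `test_` function that pytest would collect. The file name and the helper are illustrative assumptions, not files from the BioNumPy repository.

```python
# scripts/reverse_complement_example.py  (hypothetical file name)

def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA string made of A, C, G and T."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(sequence))


def test_reverse_complement():
    # pytest collects this function because its name starts with "test_"
    assert reverse_complement("ACTG") == "CAGT"
    assert reverse_complement("") == ""
```

Running `pytest` on that file would execute `test_reverse_complement` and report a failure if either assertion did not hold.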

--------------------------------

### Example Script for Doctesting

Source: https://bionumpy.github.io/bionumpy/tutorials/example.html

A complete example script intended for use with pytest and doctesting. It includes a docstring and must be self-contained without auto-imports or test setup.

```python
"""
Example script used for documenting doctesting and other stuff
"""

a = 5
print(a)
```
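
For readers new to doctesting, here is a minimal sketch using Python's standard `doctest` module: the expected output is written in the docstring and checked by running the example. The function `gc_content` is hypothetical, and BioNumPy's documentation build may wire doctests up differently.

```python
import doctest


def gc_content(sequence: str) -> float:
    """Return the fraction of G and C bases in a sequence.

    >>> gc_content("ACGT")
    0.5
    """
    return sum(base in "GC" for base in sequence) / len(sequence)


# Run only this function's docstring examples; prints nothing when they pass
doctest.run_docstring_examples(gc_content, {"gc_content": gc_content})
```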

--------------------------------

### Run tests in a single example file

Source: https://bionumpy.github.io/bionumpy/_sources/developer_guide/testing.rst.txt

Execute tests for a specific example file. This is useful for isolating and testing changes in individual example scripts.

```bash
pytest example/our_example.py
```

--------------------------------

### Download example data

Source: https://bionumpy.github.io/bionumpy/_sources/introduction.rst.txt

Download a sample FASTQ file from the BioNumPy GitHub repository for use in examples.

```bash
wget https://github.com/bionumpy/bionumpy/raw/main/example_data/big.fq.gz
```

--------------------------------

### Install BioNumPy

Source: https://bionumpy.github.io/bionumpy/_sources/index.rst.txt

Use pip to install the BioNumPy library.

```bash
pip install bionumpy
```

--------------------------------

### Run Example Tests with Pytest

Source: https://bionumpy.github.io/bionumpy/_sources/developer_guide/making_examples.rst.txt

Execute your BioNumPy example tests using the pytest command-line tool. Ensure your example file is named with the `_example.py` suffix and contains functions prefixed with `test_`.

```bash
pytest scripts/your_example.py
```

--------------------------------

### Run and Print Random Integer

Source: https://bionumpy.github.io/bionumpy/_sources/tutorials/example.rst.txt

This code snippet generates a random integer and prints it. It is part of a test setup for documentation examples.

```python
import numpy as np

a = np.random.randint(3, 6)
print(a)
```

--------------------------------

### Install Development Dependencies

Source: https://bionumpy.github.io/bionumpy/_sources/developer_guide/setting_up_development_environment.rst.txt

Install additional development dependencies required for testing and other development tasks using the provided requirements file.

```bash
pip install -r requirements_dev.txt
```

--------------------------------

### Install BioNumPy Locally

Source: https://bionumpy.github.io/bionumpy/_sources/developer_guide/setting_up_development_environment.rst.txt

Clone the BioNumPy repository and install it locally in editable mode using pip. This ensures that changes made to the BioNumPy code are immediately reflected. It is recommended to use a virtual environment.

```bash
git clone git@github.com:bionumpy/bionumpy.git
cd bionumpy
pip install -e .
```

--------------------------------

### Example Script for Pytest

Source: https://bionumpy.github.io/bionumpy/_sources/tutorials/example.rst.txt

This script is intended to be included and tested by pytest. It requires complete code without auto-imports or test setup.

```python
import bionumpy as bnp

def main():
    # Example usage of bionumpy: read all entries, then print their sequences
    entries = bnp.open(
        "/Users/runner/work/bionumpy/bionumpy/../scripts/example.fastq.gz"
    ).read()
    print(entries.sequence)


if __name__ == "__main__":
    main()
```

--------------------------------

### Run BioNumPy Tests

Source: https://bionumpy.github.io/bionumpy/_sources/developer_guide/setting_up_development_environment.rst.txt

Execute the BioNumPy test suite to verify the correct setup of your development environment. This includes unit tests, property testing, example testing, and doctesting.

```bash
./run_tests
```

--------------------------------

### Install Development Version of NpStructures

Source: https://bionumpy.github.io/bionumpy/_sources/developer_guide/setting_up_development_environment.rst.txt

Clone the npstructures repository and install its development branch using pip in editable mode. This is useful when BioNumPy's new features depend on unpublished changes in npstructures.

```bash
git clone git@github.com:knutdrand/npstructures.git
cd npstructures
git checkout dev
pip install -e .
```

--------------------------------

### Get kmers for sequences

Source: https://bionumpy.github.io/bionumpy/modules/sequences.html

Generates kmers from sequences encoded with an AlphabetEncoding. Use bnp.change_encoding if your sequences lack a suitable encoding. This example shows kmer extraction from a small set of sequences.

```python
import bionumpy as bnp
sequences = bnp.encoded_array.as_encoded_array(["ACTG", "AAA", "TTGGC"], bnp.DNAEncoding)
bnp.sequence.get_kmers(sequences, 3)
```
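
To make the expected result concrete, here is a plain-Python sketch of what extracting 3-mers means for the same three sequences. It does not call BioNumPy and says nothing about how `get_kmers` is implemented or how it encodes its output; the helper `kmers_of` is hypothetical.

```python
def kmers_of(sequence: str, k: int) -> list:
    """Return every substring of length k, in order of starting position."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


sequences = ["ACTG", "AAA", "TTGGC"]
all_kmers = [kmers_of(sequence, 3) for sequence in sequences]
print(all_kmers)
```

A sequence of length n yields n - k + 1 k-mers, so the three sequences contribute 2, 1 and 3 k-mers respectively, which is why the result is ragged.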

--------------------------------

### Create and Print Intervals

Source: https://bionumpy.github.io/bionumpy/modules/genome_arithmetics.html

Demonstrates how to create an Interval object and print its contents. Intervals are defined by chromosome, start, and stop positions.

```python
>>> from bionumpy.datatypes import Interval
>>> intervals = Interval(["chr1", "chr1", "chr1"], [3, 5, 10], [8, 7, 12])
>>> print(intervals)
Interval with 3 entries
               chromosome                    start                     stop
                     chr1                        3                        8
                     chr1                        5                        7
                     chr1                       10                       12
```

--------------------------------

### Get Interval Sequences (General Path)

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/io/indexed_fasta.html

Retrieves sequences for genomic intervals by iterating through each interval and calculating offsets within the indexed FASTA file. This is used when the fast path is not applicable. It handles calculating start and stop positions, reading raw bytes, and deleting unwanted characters based on line length and modifications.

```python
lengths = []
cur_offset = 0
pre_alloc = np.empty((intervals.stop-intervals.start).sum(), dtype=np.uint8)
alloc_offset = 0

for interval in intervals:
    chromosome = interval.chromosome.to_string()
    idx = self._index[chromosome]
    lenb, rlen, lenc = (idx["lenb"], idx["rlen"], idx["lenc"])
    start_row = interval.start//lenc
    start_mod = interval.start % lenc
    start_offset = start_row*lenb+start_mod
    stop_row = interval.stop // lenc
    stop_offset = stop_row*lenb+interval.stop % lenc
    self._f_obj.seek(idx["offset"] + start_offset)
    lengths.append(stop_offset-start_offset-(stop_row-start_row))
    D = stop_offset-start_offset
    tmp = np.frombuffer(self._f_obj.read(stop_offset-start_offset),
                        dtype=np.uint8)
    tmp = np.delete(tmp, [lenb*(j+1)-1-start_mod
                          for j in range(stop_row-start_row)])
    pre_alloc[alloc_offset:alloc_offset+tmp.size] = tmp
    alloc_offset += tmp.size
    cur_offset += stop_offset-start_offset
assert alloc_offset == pre_alloc.size, (alloc_offset, pre_alloc.size)
assert np.all(pre_alloc > 0), np.sum(pre_alloc == 0)
a = EncodedArray(pre_alloc, BaseEncoding)
return EncodedRaggedArray(a, lengths)
```
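
The offset arithmetic can be tried on a tiny in-memory FASTA. The sketch below is a simplified stand-in, not the library's code: it assumes Unix newlines (so `lenb == lenc + 1`) and strips newlines with `bytes.replace` instead of deleting them by computed index. The function name `read_interval` is hypothetical.

```python
import io


def read_interval(f_obj, offset, lenc, lenb, start, stop):
    """Read bases [start, stop) of a record whose sequence begins at byte `offset`.

    lenc is the number of bases per line, lenb the number of bytes per line
    (bases plus the newline), as in a .fai index.
    """
    # Each full line before a position costs lenb bytes, not just lenc bases
    start_byte = (start // lenc) * lenb + start % lenc
    stop_byte = (stop // lenc) * lenb + stop % lenc
    f_obj.seek(offset + start_byte)
    raw = f_obj.read(stop_byte - start_byte)
    return raw.replace(b"\n", b"").decode("ascii")


fasta = b">chr1\nACGT\nTTGA\nCC\n"
handle = io.BytesIO(fasta)
# The sequence starts after the 6-byte header line; 4 bases and 5 bytes per line
subsequence = read_interval(handle, offset=6, lenc=4, lenb=5, start=2, stop=7)
print(subsequence)
```

The full sequence is `ACGTTTGACC`, so bases 2 up to (not including) 7 span one line break, which is why one more byte than the interval length has to be read.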

--------------------------------

### Access Start Positions of Streamed Intervals

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/genomic_data/genomic_intervals.html

Provides access to the start positions of streamed genomic intervals.

```python
@property
def start(self):
    return self._start
```

--------------------------------

### Include External Code Example

Source: https://bionumpy.github.io/bionumpy/_sources/developer_guide/writing_documentation.rst.txt

This directive includes code from an external Python file directly into the documentation. Ensure the path is correct relative to the documentation source.

```rst
.. literalinclude:: /../scripts/your_example.py
```

--------------------------------

### Generate and Print Random Integer

Source: https://bionumpy.github.io/bionumpy/tutorials/example.html

Generates a random integer between 3 and 5 (inclusive) and prints it. This is a basic example for demonstrating code execution.

```python
import numpy as np

a = np.random.randint(3, 6)
print(a)
```

--------------------------------

### Get Minimizers from DNA Sequences

Source: https://bionumpy.github.io/bionumpy/_sources/topics/kmers.rst.txt

Shows how to extract minimizers from DNA sequences. The kmer size and window size can be specified.

```python
bnp.sequence.get_minimizers(sequences, k=2, window_size=4)
```

--------------------------------

### Custom Encoding Output Example

Source: https://bionumpy.github.io/bionumpy/developer_guide/encodings.html

Demonstrates the output of encoding and decoding a sequence using a custom OneToOneEncoding. Shows the representation of the encoded object and its raw numpy array.

```text
ACT
encoded_array('ACT', MyCustomEncoding())
array([66, 68, 85], dtype=uint8)
ACT
encoded_array('ACT')
array([65, 67, 84], dtype=uint8)
```

--------------------------------

### Subsample Fasta/Fastq Reads

Source: https://bionumpy.github.io/bionumpy/_sources/tutorials/benchmarking_examples.rst.txt

Writes exactly half of the sequences from a FASTA or FASTQ file by taking entries from the start of the file until the quota is reached. Keeping a running count across chunks is what makes the total exact when a large file is processed chunk by chunk.

```python
import bionumpy as bnp


input_file = "example.fasta"
output_file = "output.fasta"


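# Count the entries chunk by chunk so the whole file is never held in memory
total_sequences = sum(len(chunk) for chunk in bnp.open(input_file).read_chunks())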
sequences_to_subsample = total_sequences // 2


with bnp.open(input_file) as f_in, bnp.open(output_file, "w") as f_out:
    subsampled_count = 0
    for chunk in f_in.read_chunks():
        num_to_take = min(sequences_to_subsample - subsampled_count, len(chunk))
        if num_to_take > 0:
            f_out.write(chunk[:num_to_take])
            subsampled_count += num_to_take
        if subsampled_count >= sequences_to_subsample:
            break
```
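
If a random subsample of exact size is wanted instead, one option is to choose the global entry indices up front and then filter each chunk against them. The sketch below uses plain Python lists as stand-ins for chunks; `subsample_exact` and its parameters are hypothetical and not part of BioNumPy.

```python
import random


def subsample_exact(chunks, n_total, n_keep, seed=0):
    """Yield filtered chunks that together contain exactly n_keep random entries."""
    rng = random.Random(seed)
    # Decide once, over the whole file, which global indices are kept
    keep = set(rng.sample(range(n_total), n_keep))
    offset = 0
    for chunk in chunks:
        yield [entry for i, entry in enumerate(chunk, start=offset) if i in keep]
        offset += len(chunk)


chunks = [["r0", "r1", "r2"], ["r3", "r4"], ["r5", "r6", "r7"]]
kept = [entry for part in subsample_exact(chunks, n_total=8, n_keep=4) for entry in part]
print(kept)
```

The number of entries kept per chunk varies, but the total is always `n_keep` and the original order is preserved.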

--------------------------------

### Build Documentation Locally

Source: https://bionumpy.github.io/bionumpy/_sources/developer_guide/writing_documentation.rst.txt

Run this command to build and test the documentation locally. It creates HTML files and opens them in your browser.

```bash
make docs
```

--------------------------------

### Extend and Shift Intervals

Source: https://bionumpy.github.io/bionumpy/_sources/source/intervals.rst.txt

Demonstrates extending the stop position of intervals and shifting both start and stop positions. Filters intervals based on their length.

```python
>>> import bionumpy as bnp
>>> intervals = bnp.open("example_data/small_interval.bed").read()
>>> extended_right = bnp.replace(intervals, stop=intervals.stop+10)
>>> shifted = bnp.replace(intervals, start=intervals.start+5, stop=intervals.stop+5)
>>> small = intervals[(intervals.stop-intervals.start)<50]
```

--------------------------------

### Get Genomic Location (Start, Stop, Center)

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/genomic_data/genomic_intervals.html

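Retrieves the genomic location for the 'start', 'stop', or 'center' of the intervals. For stranded intervals the result is strand-aware: the 'start' of a '-' strand interval is `stop - 1`, and the 'stop' of a '-' strand interval is `start`. The 'center' is `(start + stop) // 2` regardless of strand.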

```python
if where in ('start', 'stop'):
    if not self.is_stranded():
        data = self._intervals
    else:
        location = np.where(self.strand == ('+' if where == 'start' else '-'),
                            self.start,
                            self.stop - 1)
        data = replace(self._intervals, start=location)
else:
    assert where == 'center'
    location = (self.start + self.stop) // 2
    data = replace(self._intervals, start=location)
return GenomicLocationGlobal.from_data(
    data, self._genome_context, is_stranded=self.is_stranded(),
    position_name='start')
```
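
The strand logic in the `np.where` call can be checked on single intervals. This scalar sketch mirrors only the stranded branch and the center case shown above; the function name `stranded_location` is hypothetical.

```python
def stranded_location(start: int, stop: int, strand: str, where: str) -> int:
    """Return the position of 'start', 'stop' or 'center' for one stranded interval."""
    if where == "center":
        return (start + stop) // 2
    # The interval's own start is used when the strand matches the requested end:
    # '+' for where == 'start', '-' for where == 'stop'
    uses_interval_start = strand == ("+" if where == "start" else "-")
    return start if uses_interval_start else stop - 1


for strand in "+-":
    for where in ("start", "stop", "center"):
        print(strand, where, stranded_location(10, 20, strand, where))
```

Because intervals are half-open, the last covered position is `stop - 1`, which is why that value appears instead of `stop`.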

--------------------------------

### Build Documentation Locally

Source: https://bionumpy.github.io/bionumpy/developer_guide/writing_documentation.html

Run this command to build and test the documentation locally. It creates HTML files in the docs_source/_build directory.

```bash
make docs
```

--------------------------------

### Slice EncodedRaggedArray (last N)

Source: https://bionumpy.github.io/bionumpy/source/sequences.html

Slice an EncodedRaggedArray to get the last N sequences. This example retrieves the last four sequences.

```python
>>> my_seqs[-4:] # last 4 sequences
encoded_ragged_array(['TGIVPMRM*S',
                      'CENVC',
                      'RSTWF',
                      'NTIFMC'], AlphabetEncoding('ACDEFGHIKLMNPQRSTVWY*'))
```

--------------------------------

### Slice EncodedRaggedArray (first N)

Source: https://bionumpy.github.io/bionumpy/source/sequences.html

Slice an EncodedRaggedArray to get the first N sequences. This example retrieves the first two sequences.

```python
>>> my_seqs[0:2] # first 2 sequences
encoded_ragged_array(['LMSYAEVYGH',
                      'WKGVGKQNCAWSVNVH'], AlphabetEncoding('ACDEFGHIKLMNPQRSTVWY*'))
```

--------------------------------

### Slice EncodedRaggedArray (first N sequences)

Source: https://bionumpy.github.io/bionumpy/_sources/source/sequences.rst.txt

Slice an EncodedRaggedArray to retrieve a subset of sequences. This example gets the first two sequences.

```python
print(my_seqs[0:2])
```

--------------------------------

### Clean Documentation Build

Source: https://bionumpy.github.io/bionumpy/_sources/developer_guide/writing_documentation.rst.txt

Use this command to clean the existing documentation build files. This is useful before rebuilding.

```bash
make clean
```

--------------------------------

### Initialize MultiStream with Data Sources

Source: https://bionumpy.github.io/bionumpy/_sources/source/multiple_data_sources.rst.txt

Demonstrates how to initialize a MultiStream object by providing sequence lengths, an indexed reference genome, variant data, and interval data. This synchronizes the streams for aligned processing.

```python
import bionumpy as bnp
variants = bnp.open("example_data/few_variants.vcf").read_chunks()
intervals = bnp.open("example_data/small_interval.bed").read_chunks()
reference = bnp.open_indexed("example_data/small_genome.fa")

multistream = bnp.MultiStream(reference.get_contig_lengths(),
                               sequence=reference,
                               variants=variants,
                               intervals=intervals)
```

--------------------------------

### Analyze Pileup Data within Peaks

Source: https://bionumpy.github.io/bionumpy/_sources/topics/genomic_data.rst.txt

Extract and analyze pileup data within the regions defined by genomic intervals. This example shows how to get the maximum and mean pileup values for each peak.

```python
peak_pileups = pileup[intervals]
print(peak_pileups.max(axis=-1))
print(peak_pileups.mean(axis=-1))
```
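
What the per-peak reduction computes can be sketched with plain lists: each interval selects a slice of the pileup, and the maximum and mean are taken within each slice. The values below are made up for illustration, and the sketch covers a single chromosome only.

```python
from statistics import mean

# Toy pileup over ten positions and two half-open peaks (start, stop)
pileup = [0, 1, 3, 3, 2, 0, 0, 4, 4, 1]
peaks = [(1, 5), (7, 9)]

# One row of pileup values per peak; rows can differ in length
peak_pileups = [pileup[start:stop] for start, stop in peaks]
peak_max = [max(values) for values in peak_pileups]
peak_mean = [mean(values) for values in peak_pileups]
print(peak_max)
print(peak_mean)
```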

--------------------------------

### Initialize IndexedFasta

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/io/indexed_fasta.html

Initializes an `IndexedFasta` object. It reads the FASTA index file and opens the FASTA file for binary reading. Requires a FASTA file and its corresponding .fai index file.

```python
class IndexedFasta:
    """
    Class representing an indexed fasta file.
    Behaves like dict of chrom names to sequences
    """

    def __init__(self, filename: Union[str, Path]):
        if isinstance(filename, str):
            filename = Path(filename)
        self._filename = filename
        self._index = read_index(filename.with_suffix(filename.suffix + ".fai"))
        self._f_obj = open(filename, "rb")
        self._index_table = FastaIdx.from_entry_tuples(
            [
                (name, var['rlen'], var['offset'], var['lenc'], var['lenb'])
                for name, var in self._index.items()
            ]  # if '_' not in name])
        )
```

--------------------------------

### Instantiate and print a bnpdataclass

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/bnpdataclass/bnpdataclass.html

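Instantiate the decorated class with one array-like value per field. Here `Person` is assumed to be a class decorated with `@bnpdataclass` whose two fields hold names and ages. The printed output displays the data in a structured, table-like format.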

```python
data = Person(["Knut", "Ivar", "Geir"], [35, 30, 40])
print(data)
```

--------------------------------

### Get minimizers for sequences

Source: https://bionumpy.github.io/bionumpy/modules/sequences.html

Computes minimizers for sequences encoded with an AlphabetEncoding. Specify the kmer size and window size for minimizer extraction. This example uses DNA sequences and extracts 2-mers within a window of 4.

```python
import bionumpy as bnp
sequences = bnp.encoded_array.as_encoded_array(["ACTG", "AAA", "TTGGC"], bnp.DNAEncoding)
bnp.sequence.get_minimizers(sequences, 2, 4)
```
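
For intuition, a minimizer is the smallest k-mer within each sliding window. The plain-Python sketch below rests on two assumptions that may not match BioNumPy's conventions: the window size is counted in bases (so each window holds `window_size - k + 1` k-mers), and k-mers are compared lexicographically. The helper `minimizers` is hypothetical.

```python
def minimizers(sequence: str, k: int, window_size: int) -> list:
    """Return the smallest k-mer in each window of window_size bases."""
    kmers = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    kmers_per_window = window_size - k + 1
    return [
        min(kmers[i:i + kmers_per_window])
        for i in range(len(kmers) - kmers_per_window + 1)
    ]


for sequence in ["ACTG", "AAA", "TTGGC"]:
    print(sequence, minimizers(sequence, k=2, window_size=4))
```

Sequences shorter than the window, such as `AAA` here, produce no minimizers under these assumptions.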

--------------------------------

### Test Documentation Code

Source: https://bionumpy.github.io/bionumpy/_sources/developer_guide/writing_documentation.rst.txt

Run this command within the docs_source directory to automatically test all code examples embedded in the documentation using doctest. It verifies that the code output matches the expected output.

```bash
make doctest
```

--------------------------------

### Get kmers from FASTQ sequences

Source: https://bionumpy.github.io/bionumpy/modules/sequences.html

Extracts kmers of a specified size from sequences read from a FASTQ file. Sequences are converted to DNAEncoding before kmer extraction. This example retrieves the first three kmers of the first sequence.

```python
import bionumpy as bnp
sequences = bnp.open("example_data/big.fq.gz").read().sequence
sequences = bnp.change_encoding(sequences, bnp.DNAEncoding)
bnp.sequence.get_kmers(sequences, 31)[0, 0:3]  # first three kmers of first sequence
```

--------------------------------

### Convert FASTQ to FASTA using Bash

Source: https://bionumpy.github.io/bionumpy/_sources/manuscript/index.rst.txt

This bash command demonstrates a common bioinformatics task of converting FASTQ to FASTA format using a pipeline of standard Unix utilities.

```bash
zcat file.fastq.gz | paste - - - - | perl -ane 'print ">$F[0]\n$F[2]\n";' | gzip -c > file.fasta.gz
```
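
For comparison, the same conversion can be sketched with the Python standard library alone. This is an illustration rather than a BioNumPy recipe: it assumes every FASTQ record occupies exactly four lines, and the helper `fastq_to_fasta` is hypothetical.

```python
import gzip
import io


def fastq_to_fasta(fastq_lines):
    """Yield one FASTA record (header line plus sequence line) per FASTQ record."""
    lines = iter(fastq_lines)
    # Zipping the same iterator four times groups the lines in blocks of four
    for header, sequence, _plus, _quality in zip(lines, lines, lines, lines):
        # Drop the leading "@" from the FASTQ header and use ">" instead
        yield f">{header[1:].rstrip()}\n{sequence.rstrip()}\n"


# Gzipped bytes in memory stand in for file.fastq.gz
fastq_bytes = gzip.compress(b"@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nIIII\n")
with gzip.open(io.BytesIO(fastq_bytes), "rt") as fastq:
    fasta_text = "".join(fastq_to_fasta(fastq))
print(fasta_text)
```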

--------------------------------

### DelimitedBufferWithInernalComments: Calculate Column Starts and Ends

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/io/delimited_buffers.html

Calculates column start and end positions, specifically handling lines that start with a comment character and are followed by a newline.

```python
    @classmethod
    def _calculate_col_starts_and_ends(cls, data, delimiters):
        comment_mask = (data[delimiters[:-1]] == '\n') & (data[delimiters[:-1] + 1] == cls.COMMENT)
        comment_mask = np.flatnonzero(comment_mask)
        start_delimiters = np.delete(delimiters, comment_mask)[:-1]
        end_delimiters = np.delete(delimiters, comment_mask + 1)
        if data[0] != cls.COMMENT:
            start_delimiters = np.insert(start_delimiters, 0, -1)
        else:
            end_delimiters = end_delimiters[1:]
        return start_delimiters + 1, end_delimiters
```

--------------------------------

### Create an Interval Dictionary from a Small BED File

Source: https://bionumpy.github.io/bionumpy/source/multiple_data_sources.html

Load a small BED file into memory and group its intervals by chromosome to create a dictionary. This dictionary can then be used with MultiStream, regardless of the original file's sort order.

```python
>>> intervals = bnp.open("example_data/small_interval.bed").read()
>>> interval_dict = dict(bnp.groupby(intervals, "chromosome"))
>>> interval_dict
{'0': Interval with 5 entries
               chromosome                    start                     stop
                        0                       13                       18
                        0                       37                       46
                        0                       62                       83
                        0                      105                      126
                        0                      129                      130, '1': Interval with 10 entries
               chromosome                    start                     stop
                        1                        3                       21
                        1                       41                       65
                        1                       91                      114
                        1                      131                      153
                        1                      157                      168
                        1                      174                      201
                        1                      213                      230
                        1                      240                      268
                        1                      290                      315
                        1                      319                      339, '2': Interval with 15 entries
               chromosome                    start                     stop
                        2                        2                       16
                        2                       44                       49
                        2                       77                      101
                        2                      108                      127
                        2                      135                      154
                        2                      163                      165
                        2                      173                      177
                        2                      201                      214
                        2                      242                      268
                        2                      292                      320, '3': Interval with 20 entries
               chromosome                    start                     stop
                        3                        7                       34
                        3                       58                       82
                        3                       95                      101
                        3                      130                      138
                        3                      150                      170
                        3                      188                      211
                        3                      234                      261
                        3                      283                      302
                        3                      325                      352
                        3                      353                      362}
```
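
The shape of the resulting dictionary (chromosome name mapped to that chromosome's intervals) can be sketched with plain tuples. The sketch groups entries in any order and makes no claim about what `bnp.groupby` requires of its input; the four intervals are taken from the output above.

```python
from collections import defaultdict

# (chromosome, start, stop) tuples, deliberately not sorted by chromosome
intervals = [("1", 3, 21), ("0", 13, 18), ("1", 41, 65), ("0", 37, 46)]

grouped = defaultdict(list)
for chromosome, start, stop in intervals:
    grouped[chromosome].append((start, stop))
interval_dict = dict(grouped)
print(interval_dict)
```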

--------------------------------

### Open and Read a Gzipped FASTQ File

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/io/files.html

Opens a gzipped FASTQ file and reads all its content into a `SequenceEntryWithQuality` object. This is useful for processing the entire file at once.

```python
import bionumpy as bnp
all_data = bnp.open("example_data/big.fq.gz").read()
print(all_data)
```

--------------------------------

### Create Genome from File

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/genomic_data/genome.html

Read genome information from a 'chrom.sizes' or 'fa.fai' file. If a FASTA file is provided, an index will be created if it doesn't exist, enabling sequence reading.

```python
>>> import bionumpy as bnp
>>> bnp.Genome.from_file('example_data/hg38.chrom.sizes')
Genome(['chr1', 'chr2', 'chr3', 'chr4', 'chr5', 'chr6', 'chr7', 'chr8', 'chr9', 'chr10', '...'])
```
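
A `chrom.sizes` file is tab-separated text with one chromosome name and its length per line. The sketch below parses such content into a dictionary using only the standard library; the helper `read_chrom_sizes` is hypothetical and the sizes are toy values, not real genome lengths.

```python
import io


def read_chrom_sizes(handle) -> dict:
    """Map chromosome name to length from chrom.sizes-style lines."""
    sizes = {}
    for line in handle:
        if not line.strip():
            continue  # skip blank lines
        name, size = line.split()[:2]
        sizes[name] = int(size)
    return sizes


# In-memory text stands in for a chrom.sizes file
toy_file = io.StringIO("chr1\t1000\nchr2\t800\nchr3\t650\n")
chrom_sizes = read_chrom_sizes(toy_file)
print(chrom_sizes)
```

Taking only the first two columns means the same function also reads the name and length columns of a `.fai` index.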

--------------------------------

### Get EncodedRaggedArray shape

Source: https://bionumpy.github.io/bionumpy/_sources/source/sequences.rst.txt

Access the `.shape` property of an EncodedRaggedArray to get the number of sequences and the lengths of each sequence.

```python
print(my_seqs.shape)
```

--------------------------------

### Import BioNumPy and read FASTQ data

Source: https://bionumpy.github.io/bionumpy/_sources/introduction.rst.txt

Import NumPy and BioNumPy, then open and read a FASTQ file into memory. The data is loaded as a SequenceEntryWithQuality object.

```python
import numpy as np
import bionumpy as bnp

# open the file
f = bnp.open("example_data/big.fq.gz")
data = f.read()  # reads the whole file into memory
print(data)
```

--------------------------------

### Clean Documentation Build Artifacts

Source: https://bionumpy.github.io/bionumpy/developer_guide/writing_documentation.html

Use this command to clean up previous build artifacts in the documentation directory.

```bash
make clean
```

--------------------------------

### GenomicIntervalsStreamed.start

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/genomic_data/genomic_intervals.html

Property to access the start of the intervals.

```APIDOC
## start

### Description
Property to access the start of the intervals.
```

--------------------------------

### get_location

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/genomic_data/genomic_intervals.html

Retrieves the genomic location (start, stop, or center) of the intervals.

```APIDOC
## get_location

### Description
Get the genomic location of either 'start', 'stop' or 'center' of the intervals.

### Parameters
- **where** (str): 'start', 'stop' or 'center'. Defaults to 'start'.

### Returns
- **GenomicLocation**: The genomic location.
```

--------------------------------

### Read Sequences from FASTQ File

Source: https://bionumpy.github.io/bionumpy/_sources/source/sequences.rst.txt

Use `bnp.open` to read sequence entries from a FASTQ file. The `read()` method returns all entries with their associated quality scores.

```python
import bionumpy as bnp

entries = bnp.open("example_data/reads.fq").read()
```

--------------------------------

### EncodedCounts Initialization and Basic Operations

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/sequence/count_encoded.html

Demonstrates the initialization of EncodedCounts and basic operations like string representation, equality checks, and element access by label.

```python
from typing import List, Dict, Optional

import numpy as np
from numpy.typing import ArrayLike
from numbers import Number
from ..io.matrix_dump import Matrix
from ..util.typing import EncodedArrayLike
from ..encoded_array import EncodedArray


class EncodedCounts:
    """
    Class for storing counts of encoded data.
    """

    alphabet: list
    counts: np.ndarray
    row_names: list = None

    def __init__(self, alphabet, counts, row_names=None):
        self.counts = counts
        self.alphabet = alphabet
        self.row_names = row_names

    def __str__(self):
        return "\n".join(f"{c}: {n}" for c, n in zip(self.alphabet, self.counts.T))

    def __repr__(self):
        return f'''EncodedCounts(alphabet={repr(self.alphabet)}, counts={repr(self.counts)}, row_names={repr(self.row_names)})'''

    def __eq__(self, other):
        if self.alphabet != other.alphabet:
            return False
        if not np.all(self.counts == other.counts):
            return False
        return True

    def __getitem__(self, idx: str):
        return self.counts[..., self.alphabet.index(idx)]

    def __add__(self, other):
        if isinstance(other, Number):
            o_counts = other
        else:
            assert self.alphabet == other.alphabet
            o_counts = other.counts
        return self.__class__(self.alphabet, self.counts + o_counts)

    def __radd__(self, other):
        if isinstance(other, Number):
            o_counts = other
        else:
            assert self.alphabet == other.alphabet
            o_counts = other.counts
        return self.__class__(self.alphabet, self.counts + o_counts)

    # return dataclasses.replace(self, counts=self.counts+o_counts)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        if method == "__call__":
            assert all(i.alphabet == self.alphabet for i in inputs if isinstance(i, EncodedCounts))
            assert all(i.alphabet == self.alphabet for i in kwargs.values() if isinstance(i, EncodedCounts))
            arrays = [i.counts if isinstance(i, EncodedCounts) else i for i in inputs]
            kwargs = {k: i.counts if isinstance(i, EncodedCounts) else i for k, i in kwargs.items()}
            return self.__class__(self.alphabet, getattr(ufunc, method)(*arrays, **kwargs))
        else:
            return NotImplemented
```

--------------------------------

### from_fields

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/genomic_data/genomic_intervals.html

Creates GenomicIntervals from separate arrays for chromosome, start, stop, and optionally strand.

```APIDOC
## from_fields

### Description
Create genomic intervals from fields.

### Parameters
- **genome_context** (GenomeContextBase) - The genome context.
- **chromosome** (StringArray) - An array of chromosome names.
- **start** (np.ndarray) - An array of start positions.
- **stop** (np.ndarray) - An array of stop positions.
- **strand** (Optional[EncodedArray]) - An optional array of strand information.

### Returns
- **GenomicIntervals** - A GenomicIntervals object created from the provided fields.
```

--------------------------------

### Compute Position Weight Matrix from File

Source: https://bionumpy.github.io/bionumpy/_sources/tutorials/position_weight_matrix.rst.txt

Reads a motif-PWM from a file and creates a PositionWeightMatrix object. Ensure the correct alphabet and counts are provided.

```python
from bionumpy.io.motifs import PositionWeightMatrix

# Read a motif-pwm from file
# The alphabet and counts are inferred from the file
pwm = PositionWeightMatrix.from_file("example.pwm")

# Print the PWM
print(pwm)
```

--------------------------------

### Query EncodedArray

Source: https://bionumpy.github.io/bionumpy/_sources/source/sequences.rst.txt

Perform NumPy-fast queries on EncodedArray objects. This example checks for equality with a character.

```python
print(encoded_array == "g")
```

--------------------------------

### Read Biological Files with bnp.open

Source: https://bionumpy.github.io/bionumpy/_sources/using_bionumpy_in_your_existing_project.rst.txt

Use `bnp.open` to read biological files like VCF. It automatically detects the file format. Iterate over chunks for efficiency, and then over individual entries within each chunk.

```python
import numpy as np
import bionumpy as bnp

# open your file, bnp.open automatically detects the file format
f = bnp.open("example_data/variants.vcf")
# a chunk is an efficient representation of a chunk of many lines
for chunk in f.read_chunks():
    # we can iterate over the entries
    for single_entry in chunk.to_iter():
        print(single_entry)
        # and we can access things like chromosome, position and so on
        position = single_entry.position
```

--------------------------------

### intersect

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/arithmetics/intervals.html

Computes the intersection of two sets of intervals. Assumes intervals are sorted by start position.

```APIDOC
## intersect

### Description
Computes the intersection of two sets of intervals. Assumes intervals are sorted by start position.

### Parameters
* **intervals_a** (Interval) - The first set of intervals.
* **intervals_b** (Interval) - The second set of intervals.

### Returns
* **Interval** - The intervals representing the intersection.
```
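
The sorted-input assumption is what allows a single linear pass. The two-pointer sketch below shows the idea on `(start, stop)` tuples. It is not the library's vectorized implementation, and it assumes half-open intervals on one chromosome, with each input sorted by start and free of internal overlaps. The name `intersect_sorted` is hypothetical.

```python
def intersect_sorted(intervals_a, intervals_b):
    """Return the overlaps between two sorted lists of half-open intervals."""
    result = []
    i = j = 0
    while i < len(intervals_a) and j < len(intervals_b):
        start = max(intervals_a[i][0], intervals_b[j][0])
        stop = min(intervals_a[i][1], intervals_b[j][1])
        if start < stop:
            result.append((start, stop))
        # Advance whichever interval ends first; it cannot overlap anything later
        if intervals_a[i][1] < intervals_b[j][1]:
            i += 1
        else:
            j += 1
    return result


overlaps = intersect_sorted([(3, 8), (10, 12)], [(5, 11)])
print(overlaps)
```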

--------------------------------

### Plotting Read Pileup Around Transcription Start Sites (TSS)

Source: https://bionumpy.github.io/bionumpy/_sources/tutorials/genomic_data.rst.txt

Reads a wig file as a stream and plots the mean read pileup around transcription start sites. The wig file must be sorted alphabetically by chromosome; pass `sort_names=True` when creating the `Genome` object so that its chromosome order matches the file. Computations are lazily evaluated and must be triggered with `bnp.compute`.

```python
import numpy as np
import bionumpy as bnp
import plotly.express as px

def tss_plot(wig_filename: str, chrom_sizes_filename: str, annotation_filename: str):
    # Read genome and transcripts
    genome = bnp.Genome.from_file(chrom_sizes_filename, sort_names=True) # The wig file is alphabetically sorted
    annotation = genome.read_annotation(annotation_filename)
    transcripts = annotation.transcripts

    # Get transcript start locations and make windows around them
    tss = transcripts.get_location('start').sorted() # Make sure the transcripts are sorted alphabetically
    windows = tss.get_windows(flank=500)

    # Get mean read pileup within these windows and plot
    track = genome.read_track(wig_filename, stream=True)
    signals = track[windows]
    mean_signal = signals.mean(axis=0)
    signal = bnp.compute(mean_signal)  # Compute the actual value
    px.line(x=np.arange(-500, 500), y=signal.to_array(),
            title="Read pileup relative to TSS start",
            labels={"x": "Position relative to TSS start", "y": "Mean read pileup"}).show()
```

--------------------------------

### Create FASTA Index

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/io/indexed_fasta.html

Creates a FASTA index for a given FASTA file. This function reads the file using `bnp_open` with `FastaIdxBuffer` and returns the index as a `FastaIdx` object.

```python
def create_index(filename: str) -> FastaIdx:
    """Create a fasta index for a fasta file

    Parameters
    ----------
    filename : str
        Filename of the fasta file

    Returns
    -------
    FastaIdx
        Fasta index as bnpdataclass

    """

    reader = bnp_open(filename, buffer_type=FastaIdxBuffer)
    indice_builders = list(reader.read_chunks())
    offsets = np.cumsum([0] + [idx.byte_size[0] for idx in indice_builders])
    return np.concatenate([
        FastaIdx(
            idx.chromosome,
            idx.length,
            idx.start + offset,
            idx.characters_per_line,
            idx.line_length,
        )
        for idx, offset in zip(indice_builders, offsets)
    ])
```
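To make the five index fields concrete, here is a minimal pure-Python sketch that derives them from a small FASTA byte string. It is not BioNumPy's implementation; `IndexEntry` and `index_fasta_bytes` are hypothetical names, and the sketch assumes well-formed input with Unix line endings.

```python
from typing import List, NamedTuple


class IndexEntry(NamedTuple):  # hypothetical stand-in for one index row
    chromosome: str
    length: int               # number of bases in the record
    start: int                # byte offset of the first base
    characters_per_line: int  # bases per sequence line
    line_length: int          # bytes per sequence line, newline included


def index_fasta_bytes(data: bytes) -> List[IndexEntry]:
    """Build index rows for well-formed FASTA bytes with Unix line endings."""
    entries = []
    offset = 0
    name, start, length, per_line, line_len = None, 0, 0, 0, 0
    for line in data.splitlines(keepends=True):
        if line.startswith(b">"):
            if name is not None:
                entries.append(IndexEntry(name, length, start, per_line, line_len))
            name = line[1:].split()[0].decode()
            start = offset + len(line)  # sequence begins right after the header
            length, per_line, line_len = 0, 0, 0
        else:
            bases = len(line.rstrip(b"\n"))
            if per_line == 0:  # the first sequence line sets the line geometry
                per_line, line_len = bases, len(line)
            length += bases
        offset += len(line)
    if name is not None:
        entries.append(IndexEntry(name, length, start, per_line, line_len))
    return entries


fasta = b">chr1\nACGT\nAC\n>chr2\nGGG\n"
index = index_fasta_bytes(fasta)
for entry in index:
    print(entry)
```

With these fields, the byte position of any base can be computed directly, which is what makes random access into a large FASTA file possible without scanning it.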

--------------------------------

### Slice EncodedArray

Source: https://bionumpy.github.io/bionumpy/_sources/source/sequences.rst.txt

Use NumPy-like indexing to slice EncodedArray objects, for example, to trim sequence ends.

```python
print(encoded_array[2:-2])
```

--------------------------------

### Load Genomic Sequence

Source: https://bionumpy.github.io/bionumpy/_sources/topics/genomic_data.rst.txt

Loads a reference genome sequence from a FASTA file. Ensure the 'example_data/small_sequence.fa' file is accessible.

```python
genome_sequence = genome.read_sequence('example_data/small_sequence.fa')
print(genome_sequence)
```

--------------------------------

### GenomicIntervals._from_fields

Source: https://bionumpy.github.io/bionumpy/modules/genomic_data.html

Creates genomic intervals from provided fields including chromosome, start, stop, and optionally strand.

```APIDOC
## GenomicIntervals._from_fields

### Description
Create genomic intervals from fields.

### Parameters
- **genome_context** (GenomeContextBase) - The genome context.
- **chromosome** (StringArray) - Array of chromosome names.
- **start** (np.ndarray) - Array of start positions.
- **stop** (np.ndarray) - Array of stop positions.
- **strand** (EncodedArray | None, optional) - Array of strand information.

### Returns
GenomicIntervals
```

--------------------------------

### sort_intervals()

Source: https://bionumpy.github.io/bionumpy/modules/genome_arithmetics.html

Sorts intervals based on chromosome, start, and stop positions. Allows for a custom sort order.

```APIDOC
## sort_intervals()

### Description
Sort intervals on “chromosome”, “start”, “stop”.

### Parameters
- **intervals** (Interval) - Unsorted intervals.
- **chromosome_key_function** (callable) - A function to determine the chromosome key (defaults to a lambda function).
- **sort_order** (List[str]) - A list specifying the desired order of chromosomes.

### Returns
- **Interval** - Sorted intervals.
```
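The effect of a custom chromosome order can be sketched in plain Python. This is an illustration only, not the library's code: `sort_rows` and the `(chromosome, start, stop)` tuple layout are assumptions for this example.

```python
from typing import List, Sequence, Tuple

Row = Tuple[str, int, int]  # (chromosome, start, stop); hypothetical representation


def sort_rows(rows: List[Row], sort_order: Sequence[str]) -> List[Row]:
    """Sort by chromosome rank in sort_order, then by start, then by stop."""
    rank = {name: i for i, name in enumerate(sort_order)}
    return sorted(rows, key=lambda r: (rank[r[0]], r[1], r[2]))


rows = [("chr10", 5, 9), ("chr2", 7, 8), ("chr2", 1, 4), ("chr10", 5, 6)]
by_order = sort_rows(rows, sort_order=["chr2", "chr10"])
print(by_order)
```

A plain lexicographic sort would place `chr10` before `chr2`, which is why an explicit order (or a key function) is useful for chromosome names.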

--------------------------------

### bnp_open

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/io/files.html

Opens a file for reading or writing, automatically detecting the appropriate buffer type based on the file extension. Supports lazy reading and chunked processing.

```APIDOC
## bnp_open

### Description
Open a `NpDataclassReader` file object, that can be used to read the file, either in chunks or completely. Files read in chunks can be used together with the `@bnp.streamable` decorator to call a function on all chunks in the file and optionally reduce the results.
If `mode="w"` it opens a writer object.

### Parameters

* **filename** (str) - Name of the file to open
* **mode** (str) - Either "w" or "r"
* **buffer_type** (FileBuffer) - A `FileBuffer` class to specify how the data in the file should be interpreted
* **lazy** (bool) - If True, the data will be read lazily, i.e. only when it is accessed. This is useful to speed up reading of large files, but it is more memory demanding

### Returns

* **NpDataclassReader** - A file reader object

### Examples

import bionumpy as bnp

# Read all data from a gzipped FASTQ file
all_data = bnp.open("example_data/big.fq.gz").read()
print(all_data)

# Read the first chunk of a gzipped FASTQ file
first_chunk = bnp.open("example_data/big.fq.gz").read_chunk(300000)
print(first_chunk)
```

--------------------------------

### Define RawInterval Dataclass

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/arithmetics/intervals.html

Defines a simple dataclass for representing raw intervals with start and stop attributes.

```python
@bnpdataclass
class RawInterval:
    start: int
    stop: int
```

--------------------------------

### Get EncodedRaggedArray encoding

Source: https://bionumpy.github.io/bionumpy/_sources/source/sequences.rst.txt

Access the encoding scheme used for the sequences in an EncodedRaggedArray via the `.encoding` property.

```python
print(my_seqs.encoding)
```

--------------------------------

### Shift and Filter Intervals with NumPy

Source: https://bionumpy.github.io/bionumpy/source/intervals.html

Demonstrates basic interval manipulation using NumPy-like operations. Use for simple geometric transformations and filtering based on interval properties.

```python
import bionumpy as bnp
intervals = bnp.open("example_data/small_interval.bed").read()
extended_right = bnp.replace(intervals, stop=intervals.stop+10)
shifted = bnp.replace(intervals, start=intervals.start+5, stop=intervals.stop+5)
small = intervals[(intervals.stop-intervals.start)<50]
```

--------------------------------

### Read a Chunk from a Gzipped FASTQ File

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/io/files.html

Opens a gzipped FASTQ file and reads only the first chunk. The argument to `read_chunk` sets the chunk size (in bytes, not in number of entries). This is useful for processing large files in manageable parts.

```python
import bionumpy as bnp

first_chunk = bnp.open("example_data/big.fq.gz").read_chunk(300000)
print(first_chunk)
```

--------------------------------

### Get EncodedRaggedArray lengths

Source: https://bionumpy.github.io/bionumpy/_sources/source/sequences.rst.txt

Retrieve the lengths of individual sequences within an EncodedRaggedArray using the `.lengths` property.

```python
print(my_seqs.lengths)
```

--------------------------------

### Get Genome Size

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/genomic_data/genome.html

Returns the total size of the genome in base pairs. This is a property of the `Genome` object.

```python
genome.size

```

--------------------------------

### bnp_open

Source: https://bionumpy.github.io/bionumpy/_sources/modules/io.rst.txt

Opens a file for reading. It supports automatic format detection based on filename suffix and allows overriding with a specified buffer type.

```APIDOC
## bnp_open

### Description
Opens a file for reading. It supports automatic format detection based on filename suffix and allows overriding with a specified buffer type.

### Parameters
- **filename** (str) - Name of the file to open.
- **buffer_type** (optional) - Specifies the type of buffer to use for reading, overriding automatic detection.
```

--------------------------------

### Get context from BNPDataClass object

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/bnpdataclass/bnpdataclass.html

Retrieves a context value from the BNPDataClass object. This method is marked as deprecated.

```python
def get_context(self, name: str) -> Any:
    logger.warning(f'Deprecated method set_context in BNPDataClass')
    if not hasattr(self, '_context'):
        self._context = dict()
    return self._context[name]
```

--------------------------------

### Reverse Complement Fasta/Fastq Files

Source: https://bionumpy.github.io/bionumpy/tutorials/benchmarking_examples.html

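Generates the reverse complement of sequences in a FASTA or FASTQ file and writes the result to a new file. Filenames ending in `fa` or `fa.gz` are read and written with `bnp.TwoLineFastaBuffer`; for all other files the buffer type is detected automatically from the file extension.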

```python
import bionumpy as bnp


def reverse_complement(input_filename: str, output_filename: str):
    """Reverse complements a fasta or fastq file and writes the result to a new file."""
    bt = lambda filename: (bnp.TwoLineFastaBuffer if filename.endswith(('fa', 'fa.gz')) else None)
    with bnp.open(output_filename, "w", buffer_type=bt(output_filename)) as outfile:
        for chunk in bnp.open(input_filename, buffer_type=bt(input_filename)).read_chunks():
            rc = bnp.sequence.get_reverse_complement(chunk.sequence)
            outfile.write(bnp.replace(chunk, sequence=rc))


def test():
    reverse_complement('example_data/big.fq.gz', 'example_data/big_rc.fq.gz')
    assert bnp.count_entries('example_data/big_rc.fq.gz') == bnp.count_entries('example_data/big.fq.gz')


if __name__ == '__main__':
    import sys
    reverse_complement(sys.argv[1], sys.argv[2])
```
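For readers unfamiliar with the operation itself, a reverse complement can be sketched on a plain Python string. This is a conceptual illustration, not how BioNumPy computes it on encoded arrays; `reverse_complement_str` is a hypothetical helper and only the bases A, C, G and T are handled.

```python
# Translation table mapping each base to its complement (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement_str(sequence: str) -> str:
    """Complement every base, then reverse the string."""
    return sequence.translate(COMPLEMENT)[::-1]


print(reverse_complement_str("AACGT"))
```

Applying the function twice returns the original sequence, which makes for a convenient sanity check.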

--------------------------------

### Get Pileup of Intervals

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/arithmetics/intervals.html

Calculates the number of intervals that overlap each position on a chromosome or contig. This function is streamable.

```python
def get_pileup(intervals: Interval, chromosome_size: int) -> GenomicRunLengthArray:
    """Get the number of intervals that overlap each position of the chromosome/contig"""
    ...  # body omitted in this excerpt
```
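One common way to compute such a pileup is a difference array followed by a cumulative sum. The sketch below shows that idea in pure Python with half-open `(start, stop)` tuples. It is an assumption-laden illustration rather than the library's code: the name `pileup` is hypothetical, and it returns a dense list whereas the function above returns a run-length encoded array.

```python
from itertools import accumulate
from typing import List, Tuple


def pileup(intervals: List[Tuple[int, int]], chromosome_size: int) -> List[int]:
    """Count, for every position, how many half-open (start, stop) intervals cover it."""
    diff = [0] * (chromosome_size + 1)
    for start, stop in intervals:
        diff[start] += 1  # coverage rises where an interval begins
        diff[stop] -= 1   # and falls just past its last covered position
    return list(accumulate(diff))[:chromosome_size]


depth = pileup([(1, 4), (2, 6), (5, 8)], chromosome_size=10)
print(depth)
```

The work is proportional to the number of intervals plus the chromosome size, independent of how long the individual intervals are.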

--------------------------------

### global_intersect

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/arithmetics/intervals.html

Computes the intersection of two sets of intervals across all chromosomes. The function itself sorts the combined intervals by chromosome and then by start position.

```APIDOC
## global_intersect

### Description
Computes the intersection of two sets of intervals across all chromosomes. The function itself sorts the combined intervals by chromosome and then by start position.

### Parameters
* **intervals_b** (Interval) - The second set of intervals.
* **intervals_a** (Interval) - The first set of intervals.

### Returns
* **Interval** - The intervals representing the global intersection.
```

--------------------------------

### create_index Function

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/io/indexed_fasta.html

Creates a FASTA index for a given FASTA file.

```APIDOC
## def create_index(filename: str) -> FastaIdx

Create a fasta index for a fasta file

Parameters
----------
filename : str
    Filename of the fasta file

Returns
-------
FastaIdx
    Fasta index as bnpdataclass
```

--------------------------------

### OneLineBuffer.get_field_range_as_text

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/io/one_line_buffer.html

Returns a range of fields as text. The range must cover exactly one field, i.e. `end` must equal `start + 1`.

```APIDOC
## get_field_range_as_text(start: int, end: int)

### Description
Get a range of fields as text. Asserts that the range is exactly one field.

### Parameters
* **start** (int) - The starting index of the field range.
* **end** (int) - The ending index of the field range. Must be start + 1.

### Returns
* EncodedRaggedArray - The specified field range as text.
```

--------------------------------

### Get Pileup from Streamed Intervals

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/genomic_data/genomic_intervals.html

Calculates the pileup for streamed genomic intervals. This is a method within the GenomicIntervalsStreamed class.

```python
def get_pileup(self) -> GenomicArray:
    ...  # body omitted in this excerpt
```

--------------------------------

### Create Info Dataclass from Header

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/io/vcf_buffers.html

Dynamically creates a BNPDataClass for INFO fields based on VCF header data. Handles different data types and list formats specified in the header.

```python
def translate_field_type(info_dict):
    t = info_dict['Type']
    number = info_dict['Number']
    is_list = (number is None) or (number > 1)
    if t == Optional[int] and is_list:
        return List[int]
    elif t == Optional[float] and is_list:
        return List[float]
    elif is_list:
        return str
    return t

def create_info_dataclass(header_data):
    if not header_data:
        return str
    header = parse_header(header_data)
    is_list = lambda val: (val['Number'] is None) or (val['Number'] > 1)
    is_int_list = lambda val: (val['Type'] == Optional[int]) and is_list(val)
    info_fields = [(key, translate_field_type(val)) for key, val in header.INFO.items()]
    dc = make_dataclass(info_fields, "InfoDataclass")
    return dc

```
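To see how the `Type`/`Number` rules play out, the sketch below copies `translate_field_type` from the excerpt above and applies it to a few hand-written entries. The sample dictionaries are hypothetical and only mimic the two keys the function reads; real entries come from `parse_header`, whose exact output format is not shown here.

```python
from typing import List, Optional


def translate_field_type(info_dict):
    # Same logic as the excerpt: list-valued int/float fields become List[...],
    # any other list-valued field falls back to str.
    t = info_dict['Type']
    number = info_dict['Number']
    is_list = (number is None) or (number > 1)
    if t == Optional[int] and is_list:
        return List[int]
    elif t == Optional[float] and is_list:
        return List[float]
    elif is_list:
        return str
    return t


# Hypothetical parsed INFO entries, for illustration only.
samples = {
    "DP": {"Type": Optional[int], "Number": 1},
    "AC": {"Type": Optional[int], "Number": None},
    "AF": {"Type": Optional[float], "Number": 2},
    "ANN": {"Type": str, "Number": None},
}
field_types = {key: translate_field_type(val) for key, val in samples.items()}
for key, field_type in field_types.items():
    print(key, field_type)
```

A single-valued field keeps its declared type, while a field with an unknown or larger-than-one count is treated as a list (or as a raw string when the element type is neither int nor float).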

--------------------------------

### Get Sorted Interval Stream

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/genomic_data/genomic_intervals.html

Returns a stream of sorted genomic intervals. The method sorts the intervals itself and wraps the result in a single-chunk stream.

```python
def get_sorted_stream(self):
    sorted_intervals = self.sorted()
    return self.from_interval_stream(iter([sorted_intervals]))
```

--------------------------------

### bnp_open()

Source: https://bionumpy.github.io/bionumpy/modules/io.html

Opens a file based on its suffix, returning a reader or writer object. Supports lazy reading for large files.

```APIDOC
## bnp_open()

### Description
Open a file according to its suffix. Opens a NpDataclassReader file object, which can be used to read the file either in chunks or completely. Files read in chunks can be used together with the @bnp.streamable decorator to call a function on all chunks in the file and optionally reduce the results. If mode="w", it opens a writer object.

### Method
`bnp_open(filename: str, mode: str = None, buffer_type=None, lazy=None)`

### Parameters
- **filename** (str) - Name of the file to open
- **mode** (str) - Optional. Either "w" or "r".
- **buffer_type** (FileBuffer) - Optional. A FileBuffer class to specify how the data in the file should be interpreted.
- **lazy** (bool) - Optional. If True, the data will be read lazily, i.e. only when it is accessed. This is useful to speed up reading of large files, but it is more memory demanding.

### Returns
- NpDataclassReader - A file reader object
```

--------------------------------

### Get Raw Kmer Values

Source: https://bionumpy.github.io/bionumpy/_sources/tutorials/extracting_kmers_around_snps.rst.txt

Retrieves the raw integer (int64) encoded values of the alternative allele k-mers.

```python
raw_kmers = alt_kmers.raw()
print(raw_kmers[0:5])
```
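As background on what an integer-encoded k-mer can look like, the following sketch packs a k-mer into a single integer using two bits per base. This shows one common scheme only: the base codes, the bit order (first base in the lowest bits) and the name `encode_kmer` are assumptions for illustration, and BioNumPy's actual layout may differ.

```python
# Hypothetical 2-bit codes for the four DNA bases.
BASE_CODES = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode_kmer(kmer: str) -> int:
    """Pack a k-mer into one integer, two bits per base, first base in the lowest bits."""
    value = 0
    for position, base in enumerate(kmer):
        value |= BASE_CODES[base] << (2 * position)
    return value


for kmer in ["AAA", "ACG", "TTT"]:
    print(kmer, encode_kmer(kmer))
```

With two bits per base, a 64-bit integer has room for k-mers of up to 31 bases while staying non-negative, which is why integer arrays are a compact way to store them.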

--------------------------------

### Convert FASTQ to FASTA using BioNumPy

Source: https://bionumpy.github.io/bionumpy/_sources/manuscript/index.rst.txt

This Python snippet shows how to perform the FASTQ to FASTA conversion using BioNumPy, offering a more integrated and potentially more robust approach than bash scripting.

```python
import bionumpy as bnp

with bnp.open("output.fasta", "w") as out_file:
    out_file.write(bnp.open("input.fastq").read_chunks())
```

--------------------------------

### get_context

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/bnpdataclass/bnpdataclass.html

Gets a context value for the object, typically used for storing auxiliary information like header information.

```APIDOC
## get_context

### Description
Gets a context value for the object, typically used for storing auxiliary information like header information.

### Method
`get_context(self, name: str) -> Any`

### Parameters
- **name** (str) - The name of the context variable to retrieve.

### Returns
The value of the context variable.

### Warning
This method is deprecated. Calling it logs the warning `Deprecated method set_context in BNPDataClass`.

```

--------------------------------

### Initialize MultiStream with an Interval Dictionary

Source: https://bionumpy.github.io/bionumpy/source/multiple_data_sources.html

Construct a MultiStream object by passing an in-memory interval dictionary along with other data sources. This allows MultiStream to efficiently access interval data irrespective of its original file's sort order.

```python
>>> multistream = bnp.MultiStream(reference.get_contig_lengths(),
...                                sequence=reference,
...                                variants=variants,
...                                intervals=interval_dict)
```

--------------------------------

### Slice EncodedArray

Source: https://bionumpy.github.io/bionumpy/source/sequences.html

Use NumPy-like slicing to extract subsequences from an EncodedArray. This example trims the first and last two characters.

```python
>>> encoded_array[2:-2]
encoded_array('tggt')
```

--------------------------------

### Global Interval Intersection

Source: https://bionumpy.github.io/bionumpy/_modules/bionumpy/arithmetics/intervals.html

Computes the intersection of intervals across different chromosomes. Sorts intervals by chromosome and then by start position.

```python
@streamable()
def global_intersect(intervals_b, intervals_a):
    all_intervals = np.concatenate([intervals_a, intervals_b])
    all_intervals = all_intervals[np.lexsort((all_intervals.start, all_intervals.chromosome))]
    stops = all_intervals.stop[np.lexsort((all_intervals.stop, all_intervals.chromosome))]
    mask = stops[:-1] > all_intervals.start[1:]
    result = all_intervals[1:][mask]
    result.stop = stops[:-1][mask]
    return result
```