### Install pyBigWig using pip

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

Install the pyBigWig library using pip. Ensure you have the necessary C libraries (libcurl, zlib) installed.

```bash
pip install pyBigWig
```

--------------------------------

### Install pyBigWig using Conda

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

Install pyBigWig using Conda, specifying the conda-forge and bioconda channels for optimal compatibility.

```bash
conda install pybigwig -c conda-forge -c bioconda
```

--------------------------------

### Write fixed-step intervals

Source: https://context7.com/deeptools/pybigwig/llms.txt

Use addEntries() with both span and step parameters to write intervals where both the distance between starts and the span are constant.

```python
import pyBigWig
import tempfile
import os

# Reserve a temporary file name for the output
ofile = tempfile.NamedTemporaryFile(delete=False)
oname = ofile.name
ofile.close()

bw = pyBigWig.open(oname, "w")
bw.addHeader([("chr1", 1000000)])

# Fixed-step: single start, constant span and step (regions 900-920, 930-950, 960-980)
bw.addEntries("chr1", 900, values=[-5.0, -20.0, 25.0], span=20, step=30)

bw.close()
os.remove(oname)
```

--------------------------------

### Getting File Header

Source: https://context7.com/deeptools/pybigwig/llms.txt

Retrieves metadata from the bigWig file header.

```APIDOC
## Getting File Header

The `header()` method returns metadata about the bigWig file including version, zoom levels, value statistics, and coverage information.

### Method
`header()`

### Request Example
```python
import pyBigWig

bw = pyBigWig.open("test.bw")
header = bw.header()
print(header)
bw.close()
```

### Response
A dictionary containing header information:
- **version** (int) - The bigWig file format version.
- **nLevels** (int) - The number of zoom levels.
- **nBasesCovered** (int) - The total number of bases covered by data.
- **minVal** (float) - The minimum value in the file.
- **maxVal** (float) - The maximum value in the file.
- **sumData** (float) - The sum of all values in the file.
- **sumSquared** (float) - The sum of the squares of all values in the file.
```

--------------------------------

### Add Entries with Numpy Arrays

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

When numpy support is enabled, `addEntries` can accept numpy arrays for chromosomes, starts, ends, and values. Ensure numpy is installed before pyBigWig.

```python
>>> import pyBigWig
>>> import numpy as np
>>> bw = pyBigWig.open("/tmp/delete.bw", "w")
>>> bw.addHeader([("1", 1000)], maxZooms=0)
>>> chroms = np.array(["1"] * 10)
>>> starts = np.array([0, 10, 20, 30, 40, 50, 60, 70, 80, 90], dtype=np.int64)
>>> ends = np.array([5, 15, 25, 35, 45, 55, 65, 75, 85, 95], dtype=np.int64)
>>> values0 = np.array(np.random.random_sample(10), dtype=np.float64)
>>> bw.addEntries(chroms, starts, ends=ends, values=values0)
>>> bw.close()
```

--------------------------------

### Get SQL Schema for bigBed

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

Retrieves the SQL schema definition for a bigBed file, which describes the structure of the associated strings.

```APIDOC
## `SQL()`

### Description
Retrieves the SQL schema definition for a bigBed file. This schema details the structure and meaning of the data fields within the associated strings of the bigBed entries.

### Method
`SQL()` (a Python method of an opened bigBed file, not an HTTP endpoint)

### Response
#### Return value
- **schema** (string) - The SQL schema definition for the bigBed file.

#### Response Example
```
table RnaElements
"BED6 + 3 scores for RNA Elements data"
(
    string chrom;
    uint chromStart;
    uint chromEnd;
    string name;
    uint score;
    char[1] strand;
    float level;
    float signif;
    uint score2;
)
```
```

--------------------------------

### Write bedGraph-like intervals

Source: https://context7.com/deeptools/pybigwig/llms.txt

Use addEntries() with the ends parameter to write intervals where each entry has an independent start and end position.
```python
import pyBigWig
import tempfile
import os

ofile = tempfile.NamedTemporaryFile(delete=False)
oname = ofile.name
ofile.close()

bw = pyBigWig.open(oname, "w")
bw.addHeader([("chr1", 1000000), ("chr2", 1500000)])

# bedGraph-like entries: chromosome list, start list, end list, value list
bw.addEntries(
    ["chr1", "chr1", "chr1"],      # chromosomes
    [0, 100, 125],                 # starts
    ends=[5, 120, 126],            # ends
    values=[0.0, 1.0, 200.0]       # values
)

# Verify the data
bw.close()
bw = pyBigWig.open(oname)
print(bw.intervals("chr1"))
# ((0, 5, 0.0), (100, 120, 1.0), (125, 126, 200.0))
bw.close()
os.remove(oname)
```

--------------------------------

### Get SQL Schema for bigBed Entries

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

The `SQL()` function retrieves the SQL schema definition for the bigBed file, which helps in parsing the associated string data.

```python
>>> bb.SQL()
table RnaElements
"BED6 + 3 scores for RNA Elements data"
(
    string chrom;
    uint chromStart;
    uint chromEnd;
    string name;
    uint score;
    char[1] strand;
    float level;
    float signif;
    uint score2;
)
```

--------------------------------

### Getting Chromosome Information

Source: https://context7.com/deeptools/pybigwig/llms.txt

Retrieves chromosome names and their lengths from the bigWig file.

```APIDOC
## Getting Chromosome Information

The `chroms()` method returns a dictionary of chromosome names and their lengths. You can optionally query a specific chromosome to get just its length.

### Method
`chroms(chrom=None)`

### Parameters
- **chrom** (str) - Optional - The name of the chromosome to query. If `None`, all chromosomes are returned.

### Request Example
```python
import pyBigWig

bw = pyBigWig.open("test.bw")

# Get all chromosomes and lengths
all_chroms = bw.chroms()
print(all_chroms)
# {'1': 195471971, '10': 130694993}

# Get a specific chromosome length
chr1_len = bw.chroms("1")
print(chr1_len)
# 195471971

# Non-existent chromosome returns None
missing = bw.chroms("chrX")
print(missing)
# None

bw.close()
```

### Response
- If `chrom` is `None`: A dictionary where keys are chromosome names (str) and values are their lengths (int).
- If `chrom` is specified: The length (int) of the specified chromosome, or `None` if the chromosome does not exist.
```

--------------------------------

### Retrieve Intervals in a Range with pyBigWig

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

Use the `intervals()` function to get all entries that overlap a given range. The function returns a tuple of tuples, each containing the start position, end position, and associated value. If the start and end are omitted, all intervals on the chromosome are returned.

```python
>>> bw.intervals("1", 0, 3)
((0, 1, 0.10000000149011612), (1, 2, 0.20000000298023224), (2, 3, 0.30000001192092896))
```

```python
>>> bw.intervals("1")
((0, 1, 0.10000000149011612), (1, 2, 0.20000000298023224), (2, 3, 0.30000001192092896), (100, 150, 1.399999976158142), (150, 151, 1.5))
```

--------------------------------

### Retrieve intervals from bigWig files

Source: https://context7.com/deeptools/pybigwig/llms.txt

Use the intervals() method to fetch raw interval data as tuples of (start, end, value) for specific ranges or entire chromosomes.
```python
import pyBigWig

bw = pyBigWig.open("test.bw")

# Get intervals in a range
intervals = bw.intervals("1", 0, 3)
print(intervals)
# ((0, 1, 0.10000000149011612), (1, 2, 0.20000000298023224), (2, 3, 0.30000001192092896))

# Get all intervals on a chromosome
all_intervals = bw.intervals("1")
print(all_intervals)
# ((0, 1, 0.10000000149011612), (1, 2, 0.20000000298023224), (2, 3, 0.30000001192092896),
#  (100, 150, 1.399999976158142), (150, 151, 1.5))

bw.close()
```

--------------------------------

### Retrieve Entry Positions Only from bigBed File

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

To save memory when only interval positions are needed, use `entries()` with `withString=False`. This returns a list of tuples containing only the start and end positions.

```python
>>> bb.entries('chr1', 10000000, 10020000, withString=False)
[(10009333, 10009640), (10014007, 10014289), (10014373, 10024307)]
```

--------------------------------

### Get Values as Numpy Array

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

The `values()` accessor can return a numpy array when `numpy=True` is specified. This is useful for further numerical processing.

```python
>>> bw = pyBigWig.open("/tmp/delete.bw")
>>> bw.values('1', 0, 10, numpy=True)
[ 0.74336642 0.74336642 0.74336642 0.74336642 0.74336642 nan nan nan nan nan]
>>> type(bw.values('1', 0, 10, numpy=True))
<class 'numpy.ndarray'>
```

--------------------------------

### Add Fixed-Step Entries to BigWig File

Source: https://context7.com/deeptools/pybigwig/llms.txt

Adds entries with fixed span and step to a bigWig file. Requires specifying chromosome, start position, values, span, and step.
```python
import pyBigWig
import os

oname = "test.bw"
bw = pyBigWig.open(oname, "w")
bw.addHeader([("chr1", 1000), ("chr2", 1500)])

# Describes regions 900-920, 930-950, 960-980
bw.addEntries(
    "chr1",
    900,                            # single start position
    values=[-5.0, -20.0, 25.0],     # values
    span=20,                        # each entry covers 20 bases
    step=30                         # each entry starts 30 bases after previous
)

bw.close()
bw = pyBigWig.open(oname)
print(bw.intervals("chr1"))
# ((900, 920, -5.0), (930, 950, -20.0), (960, 980, 25.0))
bw.close()
os.remove(oname)
```

--------------------------------

### Get all chromosome lengths

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

Retrieve a dictionary-like object containing all chromosome names and their corresponding lengths from a bigWig file. Lengths are stored as long integers.

```python
bw = pyBigWig.open("test/test.bw")
bw.chroms()
```

--------------------------------

### Retrieve Entries from bigBed

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

Retrieve entries from a bigBed file within a specified range. Entries include start, end, and an associated string. The string can be optionally excluded to save memory.

```APIDOC
## `entries()`

### Description
Retrieves entries from a bigBed file within a specified genomic range. Each entry consists of start, end, and an associated string. The `withString` parameter can be set to `False` to exclude the associated string.

### Method
`entries()` (a Python method of an opened bigBed file, not an HTTP endpoint)

### Parameters
- **chromosome** (string) - Required - The chromosome identifier.
- **start** (integer) - Required - The start position of the range (0-based).
- **end** (integer) - Required - The end position of the range (0-based).
- **withString** (boolean) - Optional - Defaults to `True`. If `False`, the associated string for each entry is omitted.

### Request Example
```python
bb.entries('chr1', 10000000, 10020000)
bb.entries('chr1', 10000000, 10020000, withString=False)
```

### Response
#### Return value
- **entries** (list of tuples) - A list where each tuple contains (start, end, string) for the entries within the range. If `withString=False`, tuples contain only (start, end).

#### Response Example
```
[(10009333, 10009640, '61035\t130\t-\t0.026\t0.42\t404'), (10014007, 10014289, '61047\t136\t-\t0.029\t0.42\t404'), (10014373, 10024307, '61048\t630\t-\t5.420\t0.00\t2672399')]
[(10009333, 10009640), (10014007, 10014289), (10014373, 10024307)]
```
```

--------------------------------

### Get length of a specific chromosome

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

Query the length of a specific chromosome from a bigWig file. If the chromosome does not exist, `None` is returned, so an interactive session prints nothing.

```python
bw = pyBigWig.open("test/test.bw")
bw.chroms("1")
```

--------------------------------

### Retrieve Entries from bigBed File

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

Use the `entries()` function to access string data associated with intervals in a bigBed file. The function returns a list of tuples, where each tuple contains start, end, and the associated string.

```python
>>> bb = pyBigWig.open("https://www.encodeproject.org/files/ENCFF001JBR/@@download/ENCFF001JBR.bigBed")
>>> bb.entries('chr1', 10000000, 10020000)
[(10009333, 10009640, '61035\t130\t-\t0.026\t0.42\t404'), (10014007, 10014289, '61047\t136\t-\t0.029\t0.42\t404'), (10014373, 10024307, '61048\t630\t-\t5.420\t0.00\t2672399')]
```

--------------------------------

### Add BedGraph-like Entries to BigWig

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

Use this method to add entries in the bedGraph format. Ensure entries are ordered by chromosome and start position.
The `validate=False` option disables order validation for performance; you are then responsible for supplying correctly ordered entries. Out-of-order input, as in the second example, is written as-is and can yield an unusable file.

```python
>>> bw.addEntries(["chr1", "chr1", "chr1"], [0, 100, 125], ends=[5, 120, 126], values=[0.0, 1.0, 200.0])
```

```python
>>> bw.addEntries(["chr1", "chr1", "chr1"], [100, 0, 125], ends=[120, 5, 126], values=[0.0, 1.0, 200.0], validate=False)
```

--------------------------------

### Retrieve Intervals from bigWig

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

Retrieve all entries overlapping a specified range in a bigWig file. If start and end positions are omitted, all intervals on the chromosome are returned.

```APIDOC
## `intervals()`

### Description
Retrieves all entries in a bigWig file that overlap with a specified genomic range.

### Method
`intervals()` (a Python method of an opened bigWig file, not an HTTP endpoint)

### Parameters
- **chromosome** (string) - Required - The chromosome identifier.
- **start** (integer) - Optional - The start position of the range (0-based).
- **end** (integer) - Optional - The end position of the range (0-based).

### Request Example
```python
bw.intervals("1", 0, 3)
bw.intervals("1")
```

### Response
#### Return value
- **intervals** (tuple of tuples) - Each inner tuple contains (start, end, value) for one overlapping interval.

#### Response Example
```
((0, 1, 0.10000000149011612), (1, 2, 0.20000000298023224), (2, 3, 0.30000001192092896))
((0, 1, 0.10000000149011612), (1, 2, 0.20000000298023224), (2, 3, 0.30000001192092896), (100, 150, 1.399999976158142), (150, 151, 1.5))
```
```

--------------------------------

### Compute Statistics on a Genomic Range

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

Calculate summary statistics (mean, max, min, coverage, std) for a specified genomic range. The `type` parameter defaults to `mean`. Use `nBins` to divide the range into bins, with one statistic returned per bin. If start and end are omitted, the entire chromosome is used.
```python
>>> bw.stats("1", 0, 3)
[0.2000000054637591]
```

```python
>>> bw.stats("1", 0, 3, type="max")
[0.30000001192092896]
```

```python
>>> bw.stats("1", 99, 200, type="max", nBins=2)
[1.399999976158142, 1.5]
```

```python
>>> bw.stats("1")
[1.3351851569281683]
```

--------------------------------

### Opening Files

Source: https://context7.com/deeptools/pybigwig/llms.txt

Demonstrates how to open local and remote bigWig/bigBed files for reading or writing, including the use of context managers.

```APIDOC
## Opening Files

The `pyBigWig.open()` function opens local or remote bigWig/bigBed files for reading or writing. Remote files are supported via HTTP, HTTPS, and FTP URLs. Files opened for writing require adding a header before adding entries.

### Method
`pyBigWig.open(filename, mode='r', **kwargs)`

### Parameters
- **filename** (str) - Required - Path or URL to the bigWig/bigBed file.
- **mode** (str) - Optional - 'r' for read (default), 'w' for write.

### Request Example
```python
import pyBigWig

# Open a local bigWig file for reading
bw = pyBigWig.open("test.bw")

# Open a remote bigWig file
bw_remote = pyBigWig.open("http://example.com/data.bw")

# Open a file for writing
bw_write = pyBigWig.open("output.bw", "w")

# Using a context manager (the file is closed on exit from the block)
with pyBigWig.open("test.bw") as bw_ctx:
    chroms = bw_ctx.chroms()
    print(chroms)

# Handles opened without a context manager must be closed explicitly
bw.close()
```

### Response
Returns a `pyBigWig` object for file operations.
```

--------------------------------

### Read bigWig file data

Source: https://github.com/deeptools/pybigwig/blob/master/libBigWig/README.md

Demonstrates how to initialize the library, open a bigWig file, and retrieve intervals or statistics from a specific genomic range.
```C
#include "bigWig.h"
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    bigWigFile_t *fp = NULL;
    bwOverlappingIntervals_t *intervals = NULL;
    double *stats = NULL;

    if(argc != 2) {
        fprintf(stderr, "Usage: %s {file.bw|URL://path/file.bw}\n", argv[0]);
        return 1;
    }

    //Initialize enough space to hold 128KiB (1<<17) of data at a time
    if(bwInit(1<<17) != 0) {
        fprintf(stderr, "Received an error in bwInit\n");
        return 1;
    }

    //Open the local/remote file
    fp = bwOpen(argv[1], NULL, "r");
    if(!fp) {
        fprintf(stderr, "An error occurred while opening %s\n", argv[1]);
        return 1;
    }

    //Get values in a range (0-based, half open) without NAs
    intervals = bwGetValues(fp, "chr1", 10000000, 10000100, 0);
    bwDestroyOverlappingIntervals(intervals); //Free allocated memory

    //Get values in a range (0-based, half open) with NAs
    intervals = bwGetValues(fp, "chr1", 10000000, 10000100, 1);
    bwDestroyOverlappingIntervals(intervals); //Free allocated memory

    //Get the full intervals that overlap
    intervals = bwGetOverlappingIntervals(fp, "chr1", 10000000, 10000100);
    bwDestroyOverlappingIntervals(intervals);

    //Get an example statistic - standard deviation
    //We want ~4 bins in the range
    stats = bwStats(fp, "chr1", 10000000, 10000100, 4, dev);
    if(stats) {
        printf("chr1:10000000-10000100 std. dev.: %f %f %f %f\n", stats[0], stats[1], stats[2], stats[3]);
        free(stats);
    }

    bwClose(fp);
    bwCleanup();
    return 0;
}
```

--------------------------------

### Iterate over bigWig intervals

Source: https://github.com/deeptools/pybigwig/blob/master/libBigWig/README.md

Demonstrates the standard pattern for initializing an iterator, traversing blocks, and cleaning up resources.
```c
iter = bwOverlappingIntervalsIterator(fp, "chr1", 0, 10000000, 5);
while(iter->data) {
    //Do stuff with iter->intervals
    iter = bwIteratorNext(iter);
}
bwIteratorDestroy(iter);
```

--------------------------------

### Read bigBed entries and schema

Source: https://context7.com/deeptools/pybigwig/llms.txt

Retrieve bigBed entries using entries() and inspect the file schema with SQL(). Setting withString=False improves memory efficiency when the entry string is not required.

```python
import pyBigWig

bb = pyBigWig.open("test.bigBed")

# Get entries with associated strings
entries = bb.entries("chr1", 10000000, 10020000)
print(entries)
# [(10009333, 10009640, '61035\t130\t-\t0.026\t0.42\t404'),
#  (10014007, 10014289, '61047\t136\t-\t0.029\t0.42\t404'),
#  (10014373, 10024307, '61048\t630\t-\t5.420\t0.00\t2672399')]

# Get entries without strings (more memory efficient)
entries_no_str = bb.entries("chr1", 10000000, 10020000, withString=False)
print(entries_no_str)
# [(10009333, 10009640), (10014007, 10014289), (10014373, 10024307)]

# Get the SQL schema to understand entry fields
sql = bb.SQL()
print(sql)
# table RnaElements
# "BED6 + 3 scores for RNA Elements data"
# (
#     string chrom;      "Reference sequence chromosome or scaffold"
#     uint chromStart;   "Start position in chromosome"
#     uint chromEnd;     "End position in chromosome"
#     string name;       "Name of item"
#     uint score;        "Normalized score from 0-1000"
#     char[1] strand;    "+ or - or . for unknown"
#     float level;       "Expression level such as RPKM or FPKM..."
#     float signif;      "Statistical significance such as IDR..."
#     uint score2;       "Additional measurement/count..."
# )

bb.close()
```

--------------------------------

### Import pyBigWig

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

Import the pyBigWig library to begin using its functionalities for bigWig and bigBed file manipulation.
```python
import pyBigWig
```

--------------------------------

### Open a local bigWig file for writing

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

Open a local bigWig file for writing. Files opened in write mode cannot be queried for intervals or statistics. A header must be added before any entries are written.

```python
bw = pyBigWig.open("test/output.bw", "w")
```

--------------------------------

### Initialize bigWig file header

Source: https://context7.com/deeptools/pybigwig/llms.txt

The addHeader() method is required to define chromosome names and lengths before writing any data to a new bigWig file.

```python
import pyBigWig
import tempfile
import os

# Create a new bigWig file
ofile = tempfile.NamedTemporaryFile(delete=False)
oname = ofile.name
ofile.close()

bw = pyBigWig.open(oname, "w")

# Add header with chromosome list (name, length)
bw.addHeader([("chr1", 1000000), ("chr2", 1500000)])

# Or with custom zoom levels (0 = no zoom levels)
# bw.addHeader([("chr1", 1000000), ("chr2", 1500000)], maxZooms=0)

bw.close()
os.remove(oname)
```

--------------------------------

### Opening bigWig and bigBed files

Source: https://context7.com/deeptools/pybigwig/llms.txt

Use pyBigWig.open() to access local or remote files. Context managers are recommended for automatic file closure.

```python
import pyBigWig

# Open a local bigWig file for reading
bw = pyBigWig.open("test.bw")

# Open a remote bigWig file
bw_remote = pyBigWig.open("http://example.com/data.bw")

# Open a file for writing
bw_write = pyBigWig.open("output.bw", "w")

# Using a context manager (the file is closed on exit from the block)
with pyBigWig.open("test.bw") as bw_ctx:
    chroms = bw_ctx.chroms()
    print(chroms)
    # {'1': 195471971, '10': 130694993}

# Handles opened without a context manager must be closed explicitly
bw.close()
```

--------------------------------

### Open a local bigWig file for reading

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

Open a local bigWig file for reading. If the file does not exist, an error will be shown and None will be returned.
By default, files are opened in read mode.

```python
bw = pyBigWig.open("test/test.bw")
```

--------------------------------

### Write bigWig file data

Source: https://github.com/deeptools/pybigwig/blob/master/libBigWig/README.md

Shows how to create a new bigWig file, define chromosome lists, and write intervals using bedGraph-like, span, and fixed-step formats.

```C
#include "bigWig.h"
#include <stdio.h>

int main(int argc, char *argv[]) {
    bigWigFile_t *fp = NULL;
    char *chroms[] = {"1", "2"};
    char *chromsUse[] = {"1", "1", "1"};
    uint32_t chrLens[] = {1000000, 1500000};
    uint32_t starts[] = {0, 100, 125,
                         200, 220, 230,
                         500, 600, 625,
                         700, 800, 850};
    uint32_t ends[] = {5, 120, 126,
                       205, 226, 231};
    float values[] = {0.0f, 1.0f, 200.0f,
                      -2.0f, 150.0f, 25.0f,
                      0.0f, 1.0f, 200.0f,
                      -2.0f, 150.0f, 25.0f,
                      -5.0f, -20.0f, 25.0f,
                      -5.0f, -20.0f, 25.0f};

    if(bwInit(1<<17) != 0) {
        fprintf(stderr, "Received an error in bwInit\n");
        return 1;
    }

    fp = bwOpen("example_output.bw", NULL, "w");
    if(!fp) {
        fprintf(stderr, "An error occurred while opening example_output.bw for writing\n");
        return 1;
    }

    //Allow up to 10 zoom levels, though fewer will be used in practice
    if(bwCreateHdr(fp, 10)) goto error;

    //Create the chromosome lists
    fp->cl = bwCreateChromList(chroms, chrLens, 2);
    if(!fp->cl) goto error;

    //Write the header
    if(bwWriteHdr(fp)) goto error;

    //Some example bedGraph-like entries
    if(bwAddIntervals(fp, chromsUse, starts, ends, values, 3)) goto error;
    //We can continue appending similarly formatted entries
    //N.B. you can't append a different chromosome (those always go into different
    if(bwAppendIntervals(fp, starts+3, ends+3, values+3, 3)) goto error;

    //Add a new block of entries with a span.
    //Since bwAdd/AppendIntervals was just used we MUST create a new block
    if(bwAddIntervalSpans(fp, "1", starts+6, 20, values+6, 3)) goto error;
    //We can continue appending similarly formatted entries
    if(bwAppendIntervalSpans(fp, starts+9, values+9, 3)) goto error;

    //Add a new block of fixed-step entries
    if(bwAddIntervalSpanSteps(fp, "1", 900, 20, 30, values+12, 3)) goto error;
    //The start is then 990 (900 + 3 steps of 30), since that's where the previous step ended
    if(bwAppendIntervalSpanSteps(fp, values+15, 3)) goto error;

    //Add a new chromosome
    chromsUse[0] = "2";
```

--------------------------------

### Write and Close bigWig File

Source: https://github.com/deeptools/pybigwig/blob/master/libBigWig/README.md

This C code snippet demonstrates writing intervals to a bigWig file and then closing it, which triggers the creation of zoom levels. Error handling is included.

```c
    chromsUse[1] = "2";
    chromsUse[2] = "2";
    if(bwAddIntervals(fp, chromsUse, starts, ends, values, 3)) goto error;

    //Closing the file causes the zoom levels to be created
    bwClose(fp);
    bwCleanup();
    return 0;

error:
    fprintf(stderr, "Received an error somewhere!\n");
    bwClose(fp);
    bwCleanup();
    return 1;
}
```

--------------------------------

### Detecting file types

Source: https://context7.com/deeptools/pybigwig/llms.txt

Use isBigWig() and isBigBed() to programmatically verify the format of an opened file.

```python
import pyBigWig

bw = pyBigWig.open("test.bw")
print(bw.isBigWig())  # True
print(bw.isBigBed())  # False

bb = pyBigWig.open("test.bigBed")
print(bb.isBigWig())  # False
print(bb.isBigBed())  # True

bw.close()
bb.close()
```

--------------------------------

### Copy BigWig File Contents

Source: https://context7.com/deeptools/pybigwig/llms.txt

Demonstrates reading a bigWig file, copying its header and all intervals to a new file, and verifying the integrity of the copy. This involves iterating through chromosomes and intervals, and using addHeader and addEntries.
```python
import pyBigWig
import tempfile
import os

# Open source file
bw_src = pyBigWig.open("test.bw")

# Create output file
ofile = tempfile.NamedTemporaryFile(delete=False)
oname = ofile.name
ofile.close()
bw_dst = pyBigWig.open(oname, "w")

# Copy chromosome list preserving order
chroms = [(chrom, bw_src.chroms(chrom)) for chrom in bw_src.chroms()]
bw_dst.addHeader(chroms, maxZooms=1)

# Copy all intervals
for chrom_name, chrom_len in chroms:
    intervals = bw_src.intervals(chrom_name)
    if intervals:
        chrom_list = [chrom_name] * len(intervals)
        starts = [i[0] for i in intervals]
        ends = [i[1] for i in intervals]
        values = [i[2] for i in intervals]
        bw_dst.addEntries(chrom_list, starts, ends=ends, values=values)

bw_dst.close()

# Verify the copy
bw_dst = pyBigWig.open(oname)
assert bw_src.header() == bw_dst.header()
assert bw_src.chroms() == bw_dst.chroms()
for chrom_name, _ in chroms:
    assert bw_src.intervals(chrom_name) == bw_dst.intervals(chrom_name)
print("File copied successfully!")

bw_src.close()
bw_dst.close()
os.remove(oname)
```

--------------------------------

### Add Variable Step Entries to BigWig

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

Add entries using a variable step size between positions. This format is similar to the wiggle file format with `variableStep`.

```python
>>> bw.addEntries("chr1", [500, 600, 635], values=[-2.0, 150.0, 25.0], span=20)
```

--------------------------------

### Accessing file header metadata

Source: https://context7.com/deeptools/pybigwig/llms.txt

The header() method provides access to file-level metadata such as version, zoom levels, and global statistics.
```python
import pyBigWig

bw = pyBigWig.open("test.bw")
header = bw.header()
print(header)
# {
#     'version': 4,
#     'nLevels': 1,
#     'nBasesCovered': 154,
#     'minVal': 0,
#     'maxVal': 2,
#     'sumData': 272,
#     'sumSquared': 500
# }
bw.close()
```

--------------------------------

### Write variable-step intervals

Source: https://context7.com/deeptools/pybigwig/llms.txt

Use addEntries() with the span parameter to write intervals where positions vary but the span (length) of each interval is constant.

```python
import pyBigWig
import tempfile
import os

ofile = tempfile.NamedTemporaryFile(delete=False)
oname = ofile.name
ofile.close()

bw = pyBigWig.open(oname, "w")
bw.addHeader([("chr1", 1000000)])

# Variable-step: single chromosome, start list, span, value list
# Describes regions 500-520, 600-620, 635-655
bw.addEntries(
    "chr1",
    [500, 600, 635],                # starts
    values=[-2.0, 150.0, 25.0],     # values
    span=20                         # each entry covers 20 bases
)

bw.close()
bw = pyBigWig.open(oname)
print(bw.intervals("chr1"))
# ((500, 520, -2.0), (600, 620, 150.0), (635, 655, 25.0))
bw.close()
os.remove(oname)
```

--------------------------------

### Add Fixed Step Entries to BigWig

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

Add entries with a fixed step and span, corresponding to the `fixedStep` wiggle format. This is efficient for uniformly spaced data.

```python
>>> bw.addEntries("chr1", 900, values=[-5.0, -20.0, 25.0], span=20, step=30)
```

--------------------------------

### Print BigWig Header Information

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

Retrieve and display the header information of a bigWig file as a Python dictionary. This includes version, zoom levels, base coverage, and value statistics. This method also works for bigBed files.
```python
>>> bw.header()
{'maxVal': 2, 'sumData': 272, 'minVal': 0, 'version': 4, 'sumSquared': 500, 'nLevels': 1, 'nBasesCovered': 154}
```

--------------------------------

### Open a remote bigBed file

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

Open a remote bigBed file using its URL for read access. The mode parameter is ignored for bigBed files.

```python
bb = pyBigWig.open("https://www.encodeproject.org/files/ENCFF001JBR/@@download/ENCFF001JBR.bigBed")
```

--------------------------------

### Check if a file is bigWig or bigBed

Source: https://github.com/deeptools/pybigwig/blob/master/libBigWig/README.md

Use bwIsBigWig and bbIsBigBed to determine the type of an input file. These functions rely on the file's magic number.

```c
if(bwIsBigWig(input_file_name, NULL)) {
    //do something
} else if(bbIsBigBed(input_file_name, NULL)) {
    //do something else
} else {
    //handle unknown input
}
```

--------------------------------

### Write BigWig with NumPy Arrays

Source: https://context7.com/deeptools/pybigwig/llms.txt

Efficiently writes entries to a bigWig file using NumPy arrays. Requires pyBigWig to be compiled with NumPy support. Ensure NumPy arrays are of appropriate dtypes (e.g., int64 for starts/ends, float64 for values).
```python
import pyBigWig
import tempfile
import os
import numpy as np

# Check if NumPy support is available
if pyBigWig.numpy != 1:
    print("NumPy support not available")
else:
    ofile = tempfile.NamedTemporaryFile(delete=False)
    oname = ofile.name
    ofile.close()

    bw = pyBigWig.open(oname, "w")
    bw.addHeader([("chr1", 1000), ("chr2", 1500)])

    # Create NumPy arrays for entries
    chroms = np.array(["chr1"] * 10)
    starts = np.array([0, 10, 20, 30, 40, 50, 60, 70, 80, 90], dtype=np.int64)
    ends = np.array([5, 15, 25, 35, 45, 55, 65, 75, 85, 95], dtype=np.int64)
    values = np.random.random_sample(10).astype(np.float64)

    bw.addEntries(chroms, starts, ends=ends, values=values)
    bw.close()

    # Read back and verify
    bw = pyBigWig.open(oname)
    for interval in bw.intervals("chr1"):
        print(interval)
    bw.close()
    os.remove(oname)
```

--------------------------------

### File Type Detection

Source: https://context7.com/deeptools/pybigwig/llms.txt

Provides methods to determine if a file is a bigWig or bigBed format.

```APIDOC
## File Type Detection

The `isBigWig()` and `isBigBed()` methods determine the file type when working with files that could be either format.

### Methods
- `isBigWig()`: Returns `True` if the file is a bigWig, `False` otherwise.
- `isBigBed()`: Returns `True` if the file is a bigBed, `False` otherwise.

### Request Example
```python
import pyBigWig

bw = pyBigWig.open("test.bw")
print(bw.isBigWig())  # True
print(bw.isBigBed())  # False

bb = pyBigWig.open("test.bigBed")
print(bb.isBigWig())  # False
print(bb.isBigBed())  # True

bw.close()
bb.close()
```
```

--------------------------------

### Check Remote File Access Support

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

Verify if pyBigWig can access remote files by checking the `pyBigWig.remote` attribute. A value of 1 indicates that remote access is supported.
```python
>>> import pyBigWig
>>> pyBigWig.remote
```

--------------------------------

### Add Header to bigWig File without Zoom Levels

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

To create a bigWig file without zoom levels, use `addHeader` with `maxZooms=0`. Note that this may cause compatibility issues with tools like IGV that expect at least one zoom level.

```python
>>> bw.addHeader([("chr1", 1000000), ("chr2", 1500000)], maxZooms=0)
```

--------------------------------

### Computing genomic summary statistics

Source: https://context7.com/deeptools/pybigwig/llms.txt

The stats() method calculates metrics like mean, min, max, and standard deviation over specified intervals or bins.

```python
import pyBigWig

bw = pyBigWig.open("test.bw")

# Mean value over a range (default)
mean_val = bw.stats("1", 0, 3)
print(mean_val)
# [0.2000000054637591]

# Maximum value
max_val = bw.stats("1", 0, 3, type="max")
print(max_val)
# [0.30000001192092896]

# Minimum value
min_val = bw.stats("1", 0, 3, type="min")
print(min_val)
# [0.10000000149011612]

# Coverage (fraction of bases with values)
cov = bw.stats("1", 0, 10, type="coverage")
print(cov)
# [0.30000000000000004]

# Standard deviation
std_val = bw.stats("1", 0, 3, type="std")
print(std_val)
# [0.10000000521540645]

# Multiple bins
binned = bw.stats("1", 99, 200, type="max", nBins=2)
print(binned)
# [1.399999976158142, 1.5]

# Entire chromosome
whole_chr = bw.stats("1")
print(whole_chr)
# [1.3351851569281683]

# Exact statistics (ignores zoom levels for precise values)
exact = bw.stats("1", 89294, 91629, exact=True)
print(exact)
# [0.22213841940688142]

bw.close()
```

--------------------------------

### Retrieving per-base values

Source: https://context7.com/deeptools/pybigwig/llms.txt

The values() method returns data for specific positions, with support for NumPy arrays and NaN handling for missing data.
```python
import pyBigWig

bw = pyBigWig.open("test.bw")

# Get values for a range
vals = bw.values("1", 0, 3)
print(vals)  # [0.10000000149011612, 0.20000000298023224, 0.30000001192092896]

# Positions without values return nan
vals_with_nan = bw.values("1", 0, 4)
print(vals_with_nan)  # [0.10000000149011612, 0.20000000298023224, 0.30000001192092896, nan]

# Return as NumPy array (if compiled with NumPy support)
if pyBigWig.numpy == 1:
    import numpy as np
    vals_np = bw.values("1", 0, 100, numpy=True)
    print(type(vals_np))  # <class 'numpy.ndarray'>

bw.close()
```

--------------------------------

### Add Header to bigWig File

Source: https://github.com/deeptools/pybigwig/blob/master/README.md

Before adding entries to a bigWig file opened for writing, a header must be added. The header lists the chromosomes and their sizes, in order. Chromosome names are case-sensitive and must follow a single naming convention (e.g., UCSC vs. Ensembl).

```python
>>> bw.addHeader([("chr1", 1000000), ("chr2", 1500000)])
```

--------------------------------

### Computing Summary Statistics

Source: https://context7.com/deeptools/pybigwig/llms.txt

Computes summary statistics over specified genomic intervals.

```APIDOC
## Computing Summary Statistics

The `stats()` method computes summary statistics (mean, min, max, coverage, std, sum) over genomic intervals. You can divide regions into multiple bins and optionally use exact computation instead of zoom levels.

### Method

`stats(chrom, start, end, type='mean', nBins=1, exact=False)`

### Parameters

- **chrom** (str) - Required - The chromosome name.
- **start** (int) - Optional - The start position of the interval (0-based). If `start` and `end` are both omitted, the entire chromosome is used.
- **end** (int) - Optional - The end position of the interval (exclusive; intervals are 0-based, half-open).
- **type** (str) - Optional - The type of statistic to compute. Options: 'mean', 'min', 'max', 'coverage', 'std', 'sum'. Defaults to 'mean'.
- **nBins** (int) - Optional - The number of bins to divide the interval into. Defaults to 1.
- **exact** (bool) - Optional - If `True`, computes exact statistics ignoring zoom levels. Defaults to `False`.

### Request Example

```python
import pyBigWig

bw = pyBigWig.open("test.bw")

# Mean value over a range (default)
mean_val = bw.stats("1", 0, 3)
print(mean_val)  # [0.2000000054637591]

# Maximum value
max_val = bw.stats("1", 0, 3, type="max")
print(max_val)  # [0.30000001192092896]

# Multiple bins
binned = bw.stats("1", 99, 200, type="max", nBins=2)
print(binned)  # [1.399999976158142, 1.5]

# Exact statistics
exact = bw.stats("1", 89294, 91629, exact=True)
print(exact)  # [0.22213841940688142]

bw.close()
```

### Response

A list of floats representing the computed statistics for each bin.
```

--------------------------------

### Check pyBigWig Feature Support

Source: https://context7.com/deeptools/pybigwig/llms.txt

Checks whether pyBigWig was compiled with support for remote file access (curl) and NumPy arrays. Run these checks before relying on either feature.

```python
import pyBigWig

# Check for remote file support
if pyBigWig.remote == 1:
    print("Remote file access supported")
    # Placeholder URL; substitute the address of a real bigWig file
    bw = pyBigWig.open("http://example.com/data.bw")
    bw.close()
else:
    print("Remote file access NOT supported")

# Check for NumPy support
if pyBigWig.numpy == 1:
    print("NumPy support enabled")
else:
    print("NumPy support NOT enabled")
```
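
The two flag checks above can also be wrapped in a small helper so that calling code branches on a single result. The following is a minimal sketch, not part of pyBigWig: the function name `pybigwig_features` is hypothetical, and treating a missing attribute as "not supported" is an assumption made here for robustness. It relies only on the `pyBigWig.remote` and `pyBigWig.numpy` flags described above, and it returns `None` when the package cannot be imported.

```python
import importlib


def pybigwig_features():
    """Report optional pyBigWig features (hypothetical helper, not a pyBigWig API).

    Returns None if pyBigWig cannot be imported, otherwise a dict of booleans.
    """
    try:
        pbw = importlib.import_module("pyBigWig")
    except ImportError:
        return None
    # `remote` and `numpy` are integer flags on the module (1 = enabled).
    # Assumption: a missing attribute is treated as "not supported".
    return {
        "remote": getattr(pbw, "remote", 0) == 1,
        "numpy": getattr(pbw, "numpy", 0) == 1,
    }


features = pybigwig_features()
if features is None:
    print("pyBigWig is not installed")
else:
    for name, enabled in features.items():
        print(f"{name}: {'enabled' if enabled else 'not enabled'}")
```

Returning booleans rather than the raw integers keeps later checks short, for example `if features and features["numpy"]:` before calling `values(..., numpy=True)`.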