### Verify Biopython Installation

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Getting_Started.md

This snippet demonstrates how to check if Biopython has been successfully installed by attempting to import the Bio module at the Python interactive prompt. A successful import without errors indicates a correct installation.

```pycon
>>> import Bio
```

--------------------------------

### Basic Biopython Sequence Manipulation

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Getting_Started.md

This example showcases fundamental operations with Biopython's Seq object. It demonstrates creating a sequence, retrieving its length, generating its reverse complement, and performing a protein translation. This highlights the core capabilities of the Seq object.

```python
from Bio.Seq import Seq

# create a sequence object
my_seq = Seq("CATGTAGACTAG")

# print out some details about it
print("seq %s is %i bases long" % (my_seq, len(my_seq)))
print("reverse complement is %s" % my_seq.reverse_complement())
print("protein translation is %s" % my_seq.translate())
```

--------------------------------

### Install and test Biopython from source distribution

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Building_a_release.md

Installs the Biopython package from the extracted source distribution into a temporary prefix and then runs its test suite to ensure the distributed package is functional. Common failures relate to missing example files in `MANIFEST.in`.

```bash
drevil:~tmp1/biopython-1.78/> python -m pip install . --prefix /tmp/test-install
drevil:~tmp1/biopython-1.78/> cd Tests && python run_tests.py
```

--------------------------------

### Install Biopython from local source using pip

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Download.md

Installs Biopython from the current directory (source code) using pip.
This is a modern and often preferred alternative to using setup.py install directly.

```bash
pip install .
```

--------------------------------

### Fasta File Format Example

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/SeqIO.md

Illustrates the typical structure of a Fasta file, showing the identifier line (starting with '>') followed by the sequence data. This format is simple and widely used for sequence representation.

```APIDOC
>X55053.1 A.thaliana cor6.6 mRNA.
AACAAAACACACATCAAAAACGATTTTACAAGAAAAAAATA...
...
```

--------------------------------

### Install Biopython from source using setup.py

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Download.md

Instructions for building, testing, and installing Biopython directly from its source code using the traditional setup.py script. This method requires an appropriate C compiler.

```bash
python setup.py build
python setup.py test
python setup.py install
```

--------------------------------

### Mocapy Python Package Installation Commands

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/GSOC2011_Mocapy.md

These commands outline the standard procedure for building and installing the Mocapy Python package using its setup.py script. The 'build' command compiles extension modules, while 'install' places the package in the Python environment, optionally performing the build step if not already done.

```shell
python setup.py build
python setup.py install
```

--------------------------------

### Install Biopython on FreeBSD via Ports Collection

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Packages.md

Install Biopython on FreeBSD using the Ports Collection. This method automatically fetches and installs Biopython and its necessary dependencies by navigating to the `biology/py-biopython` port and executing `make install clean`.
```bash
cd /usr/ports/biology/py-biopython
make install clean
```

--------------------------------

### Install Biopython and Dependencies on Ubuntu or Debian

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Packages.md

Install Biopython and related packages on Ubuntu or Debian using `apt-get`. Commands are provided for the core library, documentation, BioSQL components, and build dependencies. Note that these packages might not be the latest Biopython release.

```bash
sudo apt-get install python-biopython
```

```bash
sudo apt-get install python-biopython-doc
```

```bash
sudo apt-get install python-biopython-sql
```

```bash
sudo apt-get build-dep python-biopython
```

--------------------------------

### Install Perl for BioSQL Scripts on Debian/Ubuntu

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/BioSQL.md

This command installs Perl on Debian or Ubuntu Linux. Perl is a prerequisite for running various BioSQL setup scripts, such as the "load_ncbi_taxonomy.pl" script, which is used for loading NCBI Taxonomy data into the database.

```bash
sudo apt-get install perl
```

--------------------------------

### Verify Git Installation

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/GitUsage.md

Checks if Git is correctly installed and accessible from the command line by displaying its help message.

```bash
git --help
```

--------------------------------

### Install Git Core on Ubuntu/Debian

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/GitUsage.md

This snippet demonstrates how to install the core Git package on Ubuntu or Debian Linux distributions using the `apt-get` command-line tool. It also suggests installing additional related packages like `gitk`, `git-gui`, and `git-doc` for enhanced functionality.
```bash
sudo apt-get install git-core
```

--------------------------------

### Upload Biopython Release to PyPI using Twine

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Building_a_release.md

This snippet outlines the process of uploading the Biopython release artifacts to the Python Package Index (PyPI). It guides the user to the main repository, ensures Twine is installed, and then uses Twine to upload both the source tarball and the built wheel files to PyPI.

```bash
$ cd ~/repositories/biopython/
$ pip install twine
$ twine upload dist/biopython-1.78.tar.gz
$ twine upload dist/biopython-1.78-*.whl
```

--------------------------------

### Example FASTA Output from SeqRecord Formatting

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/SeqRecord.md

An example of the FASTA formatted string output generated by the `.format("fasta")` method of a `SeqRecord` object, showing a typical sequence record in FASTA format.

```fasta
>Z78439.1 P.barbatum 5.8S rRNA gene and ITS1 and ITS2 DNA.
CATTGTTGAGATCACATAATAATTGATCGAGTTAATCTGGAGGATCTGTTTACTTTGGTC
ACCCATGGGCATTTGCTGTTGAAGTGACCTAGATTTGCCATCGAGCCTCCTTGGGAGCTT
TCTTGTTGGCGAGATCTAAACCCCTGCCCGGCGGAGTTGGGCGCCAAGTCATATGACACA
TAATTGGTGAAGGGGGTGGTAATCCTGCCCTGACCCTCCCCAAATTATTTTTTTAACAAC
TCTCAGCAACGGATATCTCGGCTCTTGCATCGATGAAGAACGCAGCGAAATGCGATAATG
GTGTGAATTGCAGAATCCCGTGAACATCGAGTCTTTGAACGCAAGTTGCGCCCGAGGCCA
TCAGGCCAAGGGCACGCCTGCCTGGGCATTGCGAGTCATATCTCTCCCTTAATGAGGCTG
TCCATACATACTGTTCAGCCGGTGCGGATGTGAGTTTGGCCCCTTGTTCTTTGGTACGGG
GGGTCTAAGAGCTGCATGGGCTTTGGATGGTCCTAAATACGGAAAGAGGTGGACGAACTA
TGCTACAACAAAATTGTTGTGCAAATGCCCCGGTTGGCCGTTTAGTTGGGCC
```

--------------------------------

### Install Biopython using pip

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Download.md

Installs the latest stable version of Biopython using the Python package manager, pip. This is the recommended and easiest installation method.
```bash
pip install biopython
```

--------------------------------

### Install Subversion on Ubuntu Linux

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Subversion_migration.md

This command demonstrates how to install Subversion packages on an Ubuntu Linux system using `apt-get`. It first updates the package lists and then installs the `subversion` package, requiring superuser privileges.

```bash
user@compy$ sudo apt-get update && sudo apt-get install subversion
```

--------------------------------

### Build Biopython documentation PDF

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Building_a_release.md

Installs Sphinx requirements, builds the Biopython documentation as a PDF using LaTeX, copies the generated PDF, and then cleans up temporary documentation files.

```bash
drevil:~tmp1/biopython/> pip install -r .circleci/requirements-sphinx.txt
drevil:~tmp1/biopython/> make -C Doc latexpdf
drevil:~tmp1/biopython/> cp Doc/_build/latex/Biopython_doc.pdf Doc/
drevil:~tmp1/biopython/> make clean -C Doc
```

--------------------------------

### Example FASTA Output for Generated Sequence Fragments

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/SeqIO.md

This snippet shows an example of the FASTA formatted output file ('mitofrags.fasta') that would be generated by the preceding Python script. It illustrates the structure of the sequence fragments with their identifiers.

```FASTA
>fragment_1
TGGGCCTCATATTTATCCTATATACCATGTTCGTATGGTGGCGCGATGTTCTACGTGAAT
CCACGTTCGAAGGACATCATACCAAAGTCGTACAATTAGGACCTCGATATGGTTTTATTC
TGTTTATCGTATCGGAGGTTATGTTCTTTTTTGCTCTTTTTCGGGCTTCTTCTCATTCTT
CTTTGGCACCTACGGTAGAG
...
>fragment_500
ACCCAGTGCCGCTACCCACTTCTACTAAGGCTGAGCTTAATAGGAGCAAGAGACTTGGAG
GCAACAACCAGAATGAAATATTATTTAATCGTGGAAATGCCATGTCAGGCGCACCTATCA
GAATCGGAACAGACCAATTACCAGATCCACCTATCATCGCCGGCATAACCATAAAAAAGA
TCATTAAAAAAGCGTGAGCC
```

--------------------------------

### Install Biopython on Windows with full path

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Download.md

Shows how to install Biopython on Windows when 'python' and 'pip' are not in the system's PATH, by specifying the full path to the pip executable.

```bash
C:\Python39\Scripts\pip install biopython
```

--------------------------------

### Install Gitk on Redhat/Fedora/Mandriva

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/GitUsage.md

This snippet shows how to install the `gitk` package, a graphical Git repository browser, on RPM-based Linux distributions such as Redhat, Fedora, or Mandriva using the `yum` package manager. This command is presented as a way to get Git functionality on these systems.

```bash
yum install gitk
```

--------------------------------

### GenBank File Format Example

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/SeqIO.md

Illustrates the typical header structure and content of a GenBank file, including LOCUS, DEFINITION, ACCESSION, and VERSION fields, demonstrating the format's key elements.

```APIDOC
LOCUS       ATCOR66M      513 bp    mRNA            PLN       02-MAR-1992
DEFINITION  A.thaliana cor6.6 mRNA.
ACCESSION   X55053
VERSION     X55053.1  GI:16229
...
```

--------------------------------

### Install Core MySQL and Python MySQLdb on Debian/Ubuntu

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/BioSQL.md

This command installs the essential MySQL server components, common MySQL files, and the "python3-mysqldb" package, which provides a Python 3 interface to MySQL. This is a foundational step for setting up a BioSQL database on Debian or Ubuntu Linux.
```bash
sudo apt install mysql-common mysql-server python3-mysqldb
```

--------------------------------

### Creating a Densities Plugin Instance with Arguments

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/GSOC2011_MocapyExt.md

This example illustrates how to create an instance of `densities_plugin` and then obtain a `densities_adapter` by passing a variable list of arguments. These arguments are forwarded to the Python API and converted using Boost.Python's type conversion facilities, supporting up to 6 arguments.

```C++
densities_plugin p("test_module", "DensitiesClassName");
densities_adapter n = p.densities(arg1, arg2, ..., argN);
```

--------------------------------

### Example of Multiple Alignment Format (MAF) File Structure

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Multiple_Alignment_Format.md

Illustrates the basic structure of a MAF file, showing alignment blocks (`a` lines) and sequence lines (`s` lines). Each `s` line specifies sequence details like species, chromosome, start, size, strand, source size, and the aligned sequence.
```text
track name=euArc visibility=pack
##maf version=1 scoring=tba.v8
# tba.v8 (((human chimp) baboon) (mouse rat))

a score=23262.0
s hg18.chr7 27578828 38 + 158545518 AAA-GGGAATGTTAACCAAATGA---ATTGTCTCTTACGGTG
s panTro1.chr6 28741140 38 + 161576975 AAA-GGGAATGTTAACCAAATGA---ATTGTCTCTTACGGTG
s baboon 116834 38 + 4622798 AAA-GGGAATGTTAACCAAATGA---GTTGTCTCTTATGGTG
s mm4.chr6 53215344 38 + 151104725 -AATGGGAATGTTAAGCAAACGA---ATTGTCTCTCAGTGTG
s rn3.chr4 81344243 40 + 187371129 -AA-GGGGATGCTAAGCCAATGAGTTGTTGTCTCTCAATGTG

a score=5062.0
s hg18.chr7 27699739 6 + 158545518 TAAAGA
s panTro1.chr6 28862317 6 + 161576975 TAAAGA
s baboon 241163 6 + 4622798 TAAAGA
s mm4.chr6 53303881 6 + 151104725 TAAAGA
s rn3.chr4 81444246 6 + 187371129 taagga

a score=6636.0
s hg18.chr7 27707221 13 + 158545518 gcagctgaaaaca
s panTro1.chr6 28869787 13 + 161576975 gcagctgaaaaca
s baboon 249182 13 + 4622798 gcagctgaaaaca
s mm4.chr6 53310102 13 + 151104725 ACAGCTGAAAATA
```

--------------------------------

### Install Biopython using pip

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Packages.md

This is the generally recommended method for installing Biopython. It uses Python's package manager `pip` to fetch and install the library and its dependencies.

```bash
pip install biopython
```

--------------------------------

### Ensure pip is installed

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Download.md

Runs the ensurepip module to install pip if it's not already present in your Python environment. This is a troubleshooting step if pip commands fail.

```bash
python -m ensurepip
```

--------------------------------

### Install Pre-Commit Hooks

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/GitUsage.md

Installs pre-commit hooks in the local Biopython repository. These hooks automate code quality checks and formatting before commits, ensuring adherence to project standards.
```bash
pre-commit install
```

--------------------------------

### Example Output of Directly Created SeqRecord

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/SeqRecord.md

The console output when printing a `SeqRecord` object that was created directly, showing its ID, Name, Description, number of features, and the sequence content.

```output
ID: YP_025292.1
Name: HokC
Description: toxic membrane protein, small
Number of features: 0
Seq('MKQHKAMIVALIVICITAVVAALVTRKDLCEVHIRTGQTEVAVF')
```

--------------------------------

### Update Biopython local repository

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Building_a_release.md

Ensures the local Biopython repository is up-to-date with the latest changes from the master branch before starting the release process.

```bash
$ cd ~/repositories/biopython
$ git checkout master
$ git pull origin master
```

--------------------------------

### Install macOS command line tools for source compilation

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Download.md

Installs Apple's command line developer tools, which are required for compiling Biopython from source on macOS. This command offers to install Xcode, but only the command line tools are necessary.

```bash
xcode-select --install
```

--------------------------------

### Example Usage of Biopython Tree Comparison

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Phylo.md

This example demonstrates how to use the `compare` function with `Bio.Phylo` to read Newick tree files and compare their topologies, showing `True` for identical topologies and `False` otherwise.
```pycon
>>> from Bio import Phylo
>>> # compare() is assumed to be defined beforehand; it is not part of Bio.Phylo
>>> tree1 = Phylo.read('Tests/TreeConstruction/upgma.tre', 'newick')
>>> tree2 = Phylo.read('Tests/TreeConstruction/nj.tre', 'newick')
>>> tree3 = Phylo.read('Tests/TreeConstruction/pars1.tre', 'newick')
>>> compare(tree1, tree2)
False
>>> compare(tree1, tree3)
True
>>> compare(tree2, tree3)
False
```

--------------------------------

### Configure BioSQL Unit Tests

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/BioSQL.md

This snippet shows the Python configuration variables in 'Tests/setup_BioSQL.py' that need to be set to run the BioSQL unit tests. It includes driver, type, host, user, password, and test database name, with specific examples for MySQL and PostgreSQL.

```python
DBDRIVER = "mysql.connector"
DBTYPE = "mysql"
DBHOST = "localhost"
DBUSER = "root"
DBPASSWD = "your-password"
TESTDB = "biosql_test"
```

```python
DBDRIVER = "psycopg2"
DBTYPE = "pg"
```

--------------------------------

### Check Subversion installation path on Linux

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Subversion_migration.md

This snippet demonstrates how to use the `which` command in a Linux terminal to verify if Subversion (svn) is installed and to find its executable path. A successful output shows the path, indicating Subversion is available on the system.

```bash
user@compy$ which svn
/usr/bin/svn
user@compy$
```

--------------------------------

### Install Python MySQL Connector Library

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/BioSQL.md

This command installs "mysql-connector-python", the official MySQL driver for Python. It is crucial for Biopython to establish a connection and interact with the MySQL database. It should be executed within your Python virtual environment.
```bash
pip install mysql-connector-python
```

--------------------------------

### Protonating PDB Files using WHATIF Web Service in Biopython

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/GSOC2010_Joao.md

Demonstrates how to use the experimental `Bio.Struct.WWW.WHATIF` module to interact with the WHATIF web server for protein protonation. The example shows uploading a protein structure and retrieving the protonated version. Note that this service is highly experimental, may have bugs, and currently only supports proteins without water molecules.

```Python
from Bio.Struct.WWW import WHATIF
from Bio import Struct

# Performs a sort of PING to the server. Gracefully exits if the servers are down.
server = WHATIF.WHATIF()

# Get the protein structure
structure = Struct.read("4PTI.pdb")
protein = structure.as_protein()  # This excludes water molecules

# Upload the structure to the WHATIF server
# This should convert the structure from a Structure object to a string via tempfile and PDBIO
# I was having some issues uploading structures...
id = server.UploadPDB(protein)

# Protonate
# Returns a Structure Object / WARNING! Bug prone for now.
protein_h = server.PDBasXMLwithSymwithPolarH(id)
```

--------------------------------

### Cloning Biopython from GitHub

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Multiple_Alignment_Format.md

Instructions to clone the Biopython repository using Git from the command line to get the latest version before official release.

```bash
git clone git@github.com:biopython/biopython.git
```

--------------------------------

### Convert GenBank to Fasta using Bio.SeqIO.parse and write

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/SeqIO.md

Shows a manual method for converting a GenBank file to Fasta format. It involves parsing records from the input GenBank file and then writing them to an output Fasta file.
The example also prints the count of converted records.

```Python
from Bio import SeqIO

with open("cor6_6.gb") as input_handle, open("cor6_6.fasta", "w") as output_handle:
    sequences = SeqIO.parse(input_handle, "genbank")
    count = SeqIO.write(sequences, output_handle, "fasta")

print("Converted %i records" % count)
```

--------------------------------

### Renumbering Residues in a Biopython PDB Structure

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/GSOC2010_Joao.md

This snippet illustrates how to renumber residues within a Biopython PDB `Structure` object using the `renumber_residues()` method. It demonstrates how the method adjusts residue numbering to start from 1 by default, while preserving information about gaps. An optional `start` argument allows users to specify a custom starting number for renumbering.

```python
from Bio.PDB import PDBParser

p = PDBParser()
s = p.get_structure("example", "1IHM.pdb")
print(list(s.get_residues())[0])
#
s.renumber_residues()
print(list(s.get_residues())[0])
#
```

--------------------------------

### Constructing Bootstrap Replicate Trees in Biopython

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Phylo.md

This example demonstrates how to directly construct a list of bootstrap replicate trees using the bootstrap_trees method. It requires an alignment, the number of replicates, and a tree constructor object (e.g., DistanceTreeConstructor) to build the trees.

```python
from Bio.Phylo.Consensus import bootstrap_trees
from Bio.Phylo.TreeConstruction import DistanceCalculator, DistanceTreeConstructor

# msa is assumed to be an alignment loaded beforehand (e.g. with Bio.AlignIO.read)
calculator = DistanceCalculator("blosum62")
constructor = DistanceTreeConstructor(calculator)
trees = bootstrap_trees(msa, 100, constructor)
```

--------------------------------

### Python: Get GBIF Search Hits Without Wildcard

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/BioGeography.md

This example demonstrates querying GBIF for a scientific name without a wildcard, showing how the search results differ when an exact match is required versus a broader search.
```python
params = {"format": "darwin", "scientificname": "Genlisea"}
numhits = recs.get_numhits(params)
```

--------------------------------

### Build Biopython Wheels using biopython-wheels Repository

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Building_a_release.md

This sequence of commands guides the user through cloning the biopython-wheels repository, initializing its submodules, updating the `cibuildwheel.yml` workflow file to reference the new release's commit hash, committing the change, and pushing to trigger the automated wheel build process on GitHub Actions.

```bash
$ cd ~/repositories
$ git clone git@github.com:biopython/biopython-wheels.git
$ cd biopython-wheels/
$ git submodule update --init
$ emacs .github/workflows/cibuildwheel.yml  # update git checkout line
$ git commit .github/workflows/cibuildwheel.yml -m "Build Biopython 1.xx"
$ git push origin master
```

--------------------------------

### Create Biopython source distribution

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Building_a_release.md

Generates the Biopython source distribution in both gztar and zip formats, which are necessary for PyPI uploads.

```bash
drevil:~tmp1/biopython> python setup.py sdist --formats=gztar,zip
```

--------------------------------

### Test Biopython Genepop EasyController Installation

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/PopGen_Genepop.md

This snippet demonstrates how to test if Biopython can locate and utilize the Genepop executable. It initializes an `EasyController` with a Genepop formatted file and attempts to retrieve basic information. An `IOError: Genepop not found` indicates that the Genepop executable is not accessible.
```Python
from Bio.PopGen.GenePop.EasyController import EasyController

ctrl = EasyController(your_file_here)
print(ctrl.get_basic_info())
```

--------------------------------

### Constructing Parsimony Trees with ParsimonyTreeConstructor in Biopython

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Phylo.md

This example demonstrates how to use ParsimonyTreeConstructor to build a phylogenetic tree based on parsimony. It involves reading an alignment and a starting tree, initializing a ParsimonyScorer and a NNITreeSearcher, and then using the constructor to build the tree from the alignment. The resulting tree structure is printed.

```pycon
>>> from Bio import AlignIO, Phylo
>>> from Bio.Phylo.TreeConstruction import *
>>> aln = AlignIO.read(open('Tests/TreeConstruction/msa.phy'), 'phylip')
>>> starting_tree = Phylo.read('Tests/TreeConstruction/nj.tre', 'newick')
>>> scorer = ParsimonyScorer()
>>> searcher = NNITreeSearcher(scorer)
>>> constructor = ParsimonyTreeConstructor(searcher, starting_tree)
>>> pars_tree = constructor.build_tree(aln)
>>> print(pars_tree)
Tree(weight=1.0, rooted=True)
    Clade(branch_length=0.0)
        Clade(branch_length=0.197335, name='Inner1')
            Clade(branch_length=0.13691, name='Delta')
            Clade(branch_length=0.08531, name='Epsilon')
        Clade(branch_length=0.041935, name='Inner2')
            Clade(branch_length=0.01421, name='Inner3')
                Clade(branch_length=0.17523, name='Gamma')
                Clade(branch_length=0.07477, name='Beta')
            Clade(branch_length=0.29231, name='Alpha')
```

--------------------------------

### Supported File Formats for Bio.SeqIO

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/SeqIO.md

This table lists the file formats that `Bio.SeqIO` can read, write, and index, along with notes on their specific characteristics and the Biopython version in which support was first introduced. It serves as a comprehensive reference for the module's file handling capabilities.
```APIDOC
| Format name | Read | Write | Index | Notes |
|-------------|------|-------|-------|-------|
| abi         | 1.58 | No    | N/A   | Reads the ABI "Sanger" capillary sequence traces files, including the PHRED quality scores for the base calls. This allows ABI to FASTQ conversion. Note each ABI file contains one and only one sequence (so there is no point in indexing the file). |
| abi-trim    | 1.71 | No    | N/A   | Same as "abi" but with quality trimming with Mott's algorithm. |
| ace         | 1.47 | No    | 1.52  | Reads the contig sequences from an ACE assembly file. Uses Bio.Sequencing.Ace internally |
```

--------------------------------

### Extract and navigate to Biopython source distribution

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Building_a_release.md

Extracts the generated Biopython tarball into a new directory and changes into it for further testing and verification, including checking for the documentation PDF.

```bash
drevil:~tmp1/biopython/> cd ..
drevil:~tmp1/> tar -xzvf biopython/dist/biopython-1.78.tar.gz
drevil:~tmp1/> cd biopython-1.78
```

--------------------------------

### Install or Update Biopython using Conda

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Packages.md

For Python installations managed by Conda (e.g., Miniconda, Anaconda), Biopython can be installed or updated via the `conda` command. It's recommended to use the `conda-forge` channel for the most up-to-date versions across Windows, Mac OS X, and Linux.
```bash
conda install -c conda-forge biopython
```

```bash
conda update -c conda-forge biopython
```

--------------------------------

### Clone a clean Biopython repository for tar-ball testing

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Building_a_release.md

Clones the official Biopython repository into a new directory to ensure a clean environment for building and testing the source distribution.

```bash
drevil:~tmp1/> git clone https://github.com/biopython/biopython.git
drevil:~tmp1/> cd biopython
```

--------------------------------

### Biopython Format: fasta

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/SeqIO.md

Documentation for the standard FASTA file format, where each record begins with a '>' line, used for sequence input.

```APIDOC
Format: fasta
Description: This refers to the input FASTA file format (http://bioperl.org/formats/sequence_formats/FASTA_sequence_format) introduced for Bill Pearson's FASTA tool, where each record starts with a ">" line.
Biopython Version (Read): 1.43
Biopython Version (Write): 1.43
Biopython Version (AlignIO): 1.52
```

--------------------------------

### Install Biopython on Gentoo Linux

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Packages.md

Install Biopython on Gentoo Linux using the `emerge` command. The `sci-biology/biopython` ebuild builds the package from source, ensuring it integrates with the system's Portage tree.

```bash
emerge -va biopython
```

--------------------------------

### Install Biopython on Archlinux

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Packages.md

Install Biopython on Archlinux using the `pacman` package manager. Separate commands are available for Python 2 (`python2-biopython`) and Python 3 (`python-biopython`) versions from the official repository.
```bash
pacman -S python2-biopython
```

```bash
pacman -S python-biopython
```

--------------------------------

### Clone Biopython Website Repository

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Building_a_release.md

This command is used to clone the biopython.github.io repository, which hosts the official Biopython website. This step is a prerequisite for making any updates or modifications to the website content after a new release.

```bash
$ cd ~/repositories
$ git clone git@github.com:biopython/biopython.github.io.git
```

--------------------------------

### Install Biopython for specific Python versions

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Download.md

Demonstrates how to install Biopython using pip for a specific Python interpreter version (e.g., Python 3.13) or for PyPy, useful in environments with multiple Python installations.

```bash
python3.13 -m pip install biopython
```

```bash
pypy -m pip install biopython
```

--------------------------------

### Verify Subversion not installed on Linux

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Subversion_migration.md

This snippet shows the output of the `which svn` command when Subversion is not installed on a Linux system. An empty output indicates that the `svn` executable is not found in the system's PATH, suggesting Subversion is not installed.

```bash
user@compy$ which svn
user@compy$
```

--------------------------------

### Install Biopython on Fedora

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Packages.md

Install Biopython on Fedora using the `yum` package manager. Official packages are available for Python 2 (`python-biopython`) and Python 3 (`python3-biopython`). GUI package management systems can also be used.
```bash
yum install python-biopython
```

```bash
yum install python3-biopython
```

--------------------------------

### Setting and Retrieving Codeml Options

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/PAML.md

This snippet demonstrates how to set individual runtime options using `set_options()` and retrieve their values using `get_option()`. The `NSsites` option is specifically highlighted as it accepts a Python list which is converted to a space-delimited string in the control file.

```pycon
>>> cml.set_options(clock=1)
>>> cml.set_options(NSsites=[0, 1, 2])
>>> cml.set_options(aaRatefile="wag.dat")
>>> cml.get_option("NSsites")
[0, 1, 2]
```

--------------------------------

### Biopython Format: fastq-illumina

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/SeqIO.md

Documentation for early Solexa/Illumina style FASTQ files (pipeline 1.3 to 1.7), encoding PHRED qualities with an ASCII offset of 64, handled by Biopython.

```APIDOC
Format: fastq-illumina
Description: In Biopython, "fastq-illumina" refers to early Solexa/Illumina style FASTQ files (from pipeline version 1.3 to 1.7) which encode PHRED qualities using an ASCII offset of 64. For *good* quality reads, PHRED and Solexa scores are approximately equal, so the "fastq-solexa" and "fastq-illumina" variants are almost equivalent.
Biopython Version (Read): 1.51
Biopython Version (Write): 1.51
Biopython Version (AlignIO): 1.52
```

--------------------------------

### Initializing the Biopython PAML Codeml Object

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/PAML.md

This section demonstrates two methods for initializing the `Codeml` object. The first method uses the constructor to set file locations and working directory directly. The second method initializes an empty object and then sets the attributes individually. File locations are converted to relative paths to avoid PAML's string length limitations.
```python
from Bio.Phylo.PAML import codeml

# Set file locations and the working directory through the constructor
cml = codeml.Codeml(
    alignment="align.phylip",
    tree="species.tree",
    out_file="results.out",
    working_dir="./scratch"
)
```

```python
# Create an empty object, then set each attribute individually
cml = codeml.Codeml()
cml.alignment = "align.phylip"
cml.tree = "species.tree"
cml.out_file = "results.out"
cml.working_dir = "./scratch"
```

--------------------------------

### Upgrade Biopython using pip

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/Download.md

Upgrades an existing Biopython installation to the latest version. This command also removes older versions of Biopython and NumPy before installing the recent ones.

```bash
pip install biopython --upgrade
```

--------------------------------

### Bio.SeqIO Supported Sequence File Formats Overview

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/SeqIO.md

This section provides an overview of various sequence file formats supported by Bio.SeqIO, detailing their names, descriptions, and Biopython versions for read/write support. It includes formats like Swiss-Prot, Tab-separated, Qual, UniProt XML, and XDNA, highlighting their characteristics and internal Biopython modules used.

```APIDOC
Format: swiss
  Description: Swiss-Prot (aka UniProt) format. Uses Bio.SwissProt internally. See also the UniProt XML format.
  Read Version: 1.43
  Write Version: 1.52

Format: tab
  Description: Simple two column tab separated sequence files, where each line holds a record's identifier and sequence. Used by Agilent's eArray software.
  Read Version: 1.48
  Write Version: 1.48
  Latest Version: 1.52

Format: qual
  Description: Qual files are like FASTA but record space separated integer sequencing values as PHRED quality scores. Often used as an alternative to FASTQ.
  Read Version: 1.50
  Write Version: 1.50
  Latest Version: 1.52

Format: uniprot-xml
  Description: UniProt XML format, successor to the plain text Swiss-Prot format.
  Read Version: 1.56
  Write Version: 1.56
  Latest Version: 1.56

Format: xdna
  Description: The native format used by Christian Marck's DNA Strider and Serial Cloner.
  Read Version: 1.75
  Write Version: 1.75
```

--------------------------------

### Biopython Format: fastq-solexa

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/SeqIO.md

Documentation for the original Solexa/Illumina style FASTQ files, encoding Solexa qualities with an ASCII offset of 64, as handled by Biopython.

```APIDOC
Format: fastq-solexa
Description: In Biopython, "fastq-solexa" refers to the original Solexa/Illumina style FASTQ files which encode Solexa qualities using an ASCII offset of 64. See also what we call the "fastq-illumina" format.
Biopython Version (Read): 1.50
Biopython Version (Write): 1.50
Biopython Version (AlignIO): 1.52
```

--------------------------------

### Download Entire PDB Database (Biopython PDBList)

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/The_Biopython_Structural_Bioinformatics_FAQ.md

Describes how to download the complete Protein Data Bank using the PDBList command-line tool or the 'download_entire_pdb' API method. Users can choose to store all files in a single directory or have them sorted into PDB-style subdirectories based on their IDs.

```bash
python PDBList.py all /data/pdb
```

```bash
python PDBList.py all /data/pdb -d
```

```APIDOC
PDBList Class:
  download_entire_pdb(pdir: str, divided: bool = True)
    pdir: The directory to store all PDB files.
    divided: If True, files are sorted into subdirectories; otherwise, all files are in one directory.
```

--------------------------------

### Codeml Object run() Method API Documentation

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/PAML.md

Details the `run()` method's functionality, parameters, return values, and error handling for executing the *codeml* program.
```APIDOC
Method: run()
  Description: Executes *codeml* with the current options and returns the parsed results in a dictionary object.
  Optional Arguments:
    verbose (boolean):
      Description: *codeml*'s screen output is suppressed by default. Set to `True` to see all output as it is generated, useful for debugging error messages.
      Default: False
    parse (boolean):
      Description: Set to `False` to skip parsing the results. If `False`, `run()` will return `None`.
      Default: True
    ctl_file (string):
      Description: Provide a path to an existing control file to execute. If set to `None` (default), the options dictionary is written to a control file, which is then used by *codeml*.
      Default: None
    command (string):
      Description: Provide a path to the *codeml* executable. Defaults to "codeml". If the program is not in your system path or if you use multiple versions of PAML, provide the full path.
      Default: "codeml"
  Returns:
    dictionary object: Parsed results (if `parse` is `True`).
    None: If `parse` is `False`.
  Raises:
    PamlError: If the *codeml* process exits with an error.
```

--------------------------------

### SnapGene Native Format

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/SeqIO.md

The native format used by SnapGene.

```APIDOC
Format: snapgene
Description: The native format used by SnapGene.
Biopython Support:
  Read Version: 1.75
  Write Version: No
  Append Version: No
```

--------------------------------

### Example phyloXML Document Structure

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/PhyloXML.md

An example of a phyloXML document, showing the root `phyloxml` element, `phylogeny`, and nested `clade` elements with attributes like `branch_length` and `name`. This XML structure is directly mirrored in the Biopython `Phyloxml` objects.
```xml
<?xml version="1.0" encoding="UTF-8"?>
<phyloxml xmlns="http://www.phyloxml.org">
  <phylogeny rooted="true">
    <name>An example</name>
    <clade>
      <clade branch_length="0.06">
        <clade branch_length="0.102">
          <name>A</name>
        </clade>
        <clade branch_length="0.23">
          <name>B</name>
        </clade>
      </clade>
      <clade branch_length="0.4">
        <name>C</name>
      </clade>
    </clade>
  </phylogeny>
</phyloxml>
```

--------------------------------

### Install Pre-Commit Hook Tool for Style Checks

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/GitUsage.md

This command installs the 'pre-commit' tool via pip. This tool manages Git pre-commit hooks, which automate code style checks before each commit, ensuring compliance with Biopython's coding conventions like PEP8 and PEP257.

```bash
pip install pre-commit
```

--------------------------------

### Create MySQL Database for BioSQL

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/BioSQL.md

This command uses `mysqladmin`, connecting as the `root` user, to create a new database named `bioseqdb`. It's the initial step for setting up the BioSQL environment.

```bash
mysqladmin -u root -p create bioseqdb
```

--------------------------------

### Remove Disordered Atoms from Biopython Structures

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/GSOC2010_Joao.md

The `remove_disordered_atoms` method, implemented in `Structure.py`, allows for the removal of `DisorderedAtom` objects from a residue. It replaces them with a single `Atom` object at a user-specified location (`keep_loc`), which defaults to 'A'. This example demonstrates its usage and the verbose output indicating modified residues.
```Python
>>> s = s.remove_disordered_atoms(verbose=True)
0 residues were modified
>>> # Now if we load a structure with disordered atoms
>>> ds = Struct.read('1MC2.pdb')
>>> ds.remove_disordered_atoms(verbose=True)
Residue TRP:1010 has 8 disordered atoms: CD1/CD2/NE1/CE2/CE3/CZ2/CZ3/CH2
Residue VAL:1018 has 3 disordered atoms: CB/CG1/CG2
Residue LEU:1024 has 4 disordered atoms: CB/CG/CD1/CD2
Residue ARG:1043 has 7 disordered atoms: CB/CG/CD/NE/CZ/NH1/NH2
Residue MET:1092 has 4 disordered atoms: CB/CG/SD/CE
Residue ARG:1107 has 7 disordered atoms: CB/CG/CD/NE/CZ/NH1/NH2
Residue GLU:1108 has 4 disordered atoms: CG/CD/OE1/OE2
Residue ASP:1111 has 4 disordered atoms: CB/CG/OD1/OD2
Residue SER:1116 has 1 disordered atoms: OG
Residue SER:1131 has 1 disordered atoms: O
10 residues were modified
```

--------------------------------

### Parse MODELLER PIR Format with Biopython SeqIO

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/GSOC2010_Joao.md

Biopython's `SeqIO` module now supports reading the MODELLER PIR format, identified as 'pir-modeller'. This feature currently allows reading but not writing. The example demonstrates the structure of a MODELLER PIR file and how to parse it using `SeqIO.parse`.

```PIR Format
>P1;5fd1
structureX:5fd1:1 :A:106 :A:ferredoxin:Azotobacter vinelandii: 1.90: 0.19
AFVVTDNCIKCKYTDCVEVCPVDCFYEGPNFLVIHPDECIDCALCEPECPAQAIFSEDEVPEDMQEFIQLNAELA
EVWPNITEKKDPLPDAEDWDGVKGKLQHLER*
```

```Python
>>> from Bio import SeqIO
>>> for i in SeqIO.parse("test_pir.txt", "pir-modeller"):
...     print(i)
...
ID: 5fd1
Name: 5fd1
Description: ferredoxin
Number of features: 0
/r_factor= 0.19
/end_residue=106
/initial_chain=a
/end_chain=a
/record_type=X-Ray Structure
/initial_residue=1
/resolution= 1.90
/source_organism=Azotobacter vinelandii
Seq('AFVVTDNCIKCKYTDCVEVCPVDCFYEGPNFLVIHPDECIDCALCEPECPAQAI...LER')
```

--------------------------------

### Creating a Biopython SeqRecord Object Directly

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/SeqRecord.md

Demonstrates how to instantiate a `SeqRecord` object directly from `Bio.Seq` and `Bio.SeqRecord` classes, providing a sequence, ID, name, and description. This is useful for programmatic generation of records without parsing from a file.

```python
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

record = SeqRecord(
    Seq("MKQHKAMIVALIVICITAVVAALVTRKDLCEVHIRTGQTEVAVF"),
    id="YP_025292.1",
    name="HokC",
    description="toxic membrane protein, small"
)
print(record)
```

--------------------------------

### Convert File Formats Concisely with Bio.SeqIO.convert

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/SeqIO.md

Introduces the Bio.SeqIO.convert() function (available in Biopython 1.52 or later) as a more concise and direct way to convert sequence files between supported formats. This function streamlines the process of reading from one format and writing to another.

```Python
from Bio import SeqIO

count = SeqIO.convert("cor6_6.gb", "genbank", "cor6_6.fasta", "fasta")
print("Converted %i records" % count)
```

--------------------------------

### Example Output: GFF Feature Attribute Summary (Bash)

Source: https://github.com/biopython/biopython.github.io/blob/master/wiki/GFF_Parsing.md

This snippet displays an example of the dictionary output from `GFFExaminer.available_limits`.
It provides a detailed breakdown of counts for different `gff_id`, `gff_source`, `gff_source_type`, and `gff_type` values present in a GFF file, offering insights into the file's content and common patterns.

```bash
{'gff_id': {('I',): 159,
            ('II',): 3,
            ('III',): 2,
            ('IV',): 5,
            ('V',): 2,
            ('X',): 6},
 'gff_source': {('Allele',): 1,
                ('Coding_transcript',): 102,
                ('Expr_profile',): 1,
                ('GenePair_STS',): 8,
                ('Oligo_set',): 1,
                ('Orfeome',): 8,
                ('Promoterome',): 5,
                ('SAGE_tag',): 1,
                ('SAGE_tag_most_three_prime',): 1,
                ('SAGE_tag_unambiguously_mapped',): 12,
                ('history',): 30,
                ('mass_spec_genome',): 7},
 'gff_source_type': {('Allele', 'SNP'): 1,
                     ('Coding_transcript', 'CDS'): 27,
                     ('Coding_transcript', 'exon'): 33,
                     ('Coding_transcript', 'five_prime_UTR'): 4,
                     ('Coding_transcript', 'gene'): 2,
                     ('Coding_transcript', 'intron'): 29,
                     ('Coding_transcript', 'mRNA'): 4,
                     ('Coding_transcript', 'three_prime_UTR'): 3,
                     ('Expr_profile', 'experimental_result_region'): 1,
                     ('GenePair_STS', 'PCR_product'): 8,
                     ('Oligo_set', 'reagent'): 1,
                     ('Orfeome', 'PCR_product'): 8,
                     ('Promoterome', 'PCR_product'): 5,
                     ('SAGE_tag', 'SAGE_tag'): 1,
                     ('SAGE_tag_most_three_prime', 'SAGE_tag'): 1,
                     ('SAGE_tag_unambiguously_mapped', 'SAGE_tag'): 12,
                     ('history', 'CDS'): 30,
                     ('mass_spec_genome', 'translated_nucleotide_match'): 7},
 'gff_type': {('CDS',): 57,
              ('PCR_product',): 21,
              ('SAGE_tag',): 14,
              ('SNP',): 1,
              ('exon',): 33,
              ('experimental_result_region',): 1,
              ('five_prime_UTR',): 4,
              ('gene',): 2,
              ('intron',): 29,
              ('mRNA',): 4,
              ('reagent',): 1,
              ('three_prime_UTR',): 3,
              ('translated_nucleotide_match',): 7}}
```
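
--------------------------------

### Sketch: Ranking Entries in a GFF Attribute Summary

The summary shown above is an ordinary nested dictionary whose inner keys are tuples, so it can be queried with plain Python. The following is a minimal sketch and not part of the original page: `limits` is a hand-copied subset of the output above, and `top_types` is a hypothetical helper name introduced only for illustration.

```python
# Hand-copied subset of the summary dictionary shown above. This is an
# assumption for illustration; in practice the full dictionary would be used.
limits = {
    "gff_id": {("I",): 159, ("II",): 3, ("X",): 6},
    "gff_type": {("CDS",): 57, ("PCR_product",): 21, ("exon",): 33, ("gene",): 2},
}


def top_types(summary, key="gff_type", n=3):
    """Return the n most frequent entries under `key` as (label, count) pairs."""
    counts = summary[key]
    # Sort by count, largest first.
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    # Inner keys are tuples such as ("CDS",); join them into a readable label.
    return [("/".join(names), count) for names, count in ranked[:n]]


print(top_types(limits))
print(top_types(limits, key="gff_id", n=1))
```

Ranking the counts this way is a quick way to see which feature types or sequence identifiers dominate a file before deciding what to extract from it.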