### Install PPanGGOLiN from Source

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/install.md

Installs the package from the current directory using pip.

```bash
pip install .
```

--------------------------------

### Install Documentation Dependencies

Source: https://github.com/labgem/ppanggolin/blob/master/docs/dev/buildDoc.md

Installs the necessary packages for building documentation from the source directory.

```shell
# Replace '/path/to/ppanggolin/' with your actual path
pip install /path/to/ppanggolin/[doc]
```

--------------------------------

### Install PPanGGOLiN via PyPI

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/install.md

Installs PPanGGOLiN using pip. Note that non-Python dependencies must be installed separately for full functionality.

```bash
pip install ppanggolin

# Verify the installation
ppanggolin --version
```

--------------------------------

### Initialize Sphinx Documentation

Source: https://github.com/labgem/ppanggolin/blob/master/docs/dev/buildDoc.md

Sets up a new documentation directory structure using the Sphinx quickstart utility.

```shell
DOCS=path/to/PPanGGOLiN/docs
sphinx-quickstart $DOCS
```

--------------------------------

### Activate Environment and Check Version

Source: https://github.com/labgem/ppanggolin/blob/master/README.md

After installation, activate the created conda environment and run this command to verify that PPanGGOLiN is installed correctly and to check its version.

```bash
conda activate ppanggolin
ppanggolin --version
```

--------------------------------

### Run Sphinx Autobuild

Source: https://github.com/labgem/ppanggolin/blob/master/docs/dev/buildDoc.md

Starts a local server to visualize documentation changes in real-time.

```shell
cd $PPanGGOLiN/docs
sphinx-autobuild . build/
# Copy the server address, for example: http://127.0.0.1:8000
# Paste the address in your browser
```

--------------------------------

### PPanGGOLiN Configuration File Example

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/practicalInformation.md

This YAML structure represents a PPanGGOLiN configuration file. It includes sections for input parameters, general parameters, and specific parameters for the 'annotate' subcommand. Parameters can be uncommented and modified to customize behavior.

```yaml
input_parameters:
    # A tab-separated file listing the genome names, and the fasta filepath of its
    # genomic sequence(s) (the fastas can be compressed with gzip). One line per genome.
    # fasta:

    # A tab-separated file listing the genome names, and the gff/gbff filepath of
    # its annotations (the files can be compressed with gzip). One line
    # per genome. If this is provided, those annotations will be used.
    # anno:

general_parameters:
    # Output directory
    output: ppanggolin_output_DATE2023-04-14_HOUR10.09.27_PID14968
    # basename for the output file
    basename: pangenome
    # directory for storing temporary files
    tmpdir: /tmp
    # Indicate verbose level (0 for warning and errors only, 1 for info, 2 for debug)
    # Choices: 0, 1, 2
    verbose: 1
    # log output file
    log: stdout
    # disables the progress bars
    disable_prog_bar: False
    # Force writing in output directory and in pangenome output file.
    force: False

annotate:
    # Use to not remove genes overlapping with RNA features.
    allow_overlap: False
    # Use to avoid annotating RNA features.
    norna: False
    # Kingdom to which the prokaryota belongs to, to know which models to use for rRNA annotation.
    # Choices: bacteria, archaea
    kingdom: bacteria
    # Translation table (genetic code) to use.
    translation_table: 11
    # In the context of provided annotation, use this option to read pseudogenes. (Default behavior is to ignore them)
    use_pseudo: False
    # Allow to force the prodigal procedure. If nothing given, PPanGGOLiN will decide in function of contig length
    # Choices: single, meta
    prodigal_procedure: False
    # Number of available cpus
    cpu: 1
```

--------------------------------

### Install PPanGGOLiN in editable mode

Source: https://github.com/labgem/ppanggolin/blob/master/docs/dev/contribute.md

Use this command to install the package in editable mode, allowing local code modifications to take effect immediately.

```bash
pip install -e .
```

--------------------------------

### Install Dependencies via Conda Environment File

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/install.md

Uses the provided YAML file to set up the environment for source installation.

```bash
conda env create -n ppanggolin_source -f ppanggolin_env.yaml
```

--------------------------------

### Install PPanGGOLiN with Conda

Source: https://github.com/labgem/ppanggolin/blob/master/README.md

Use this command to install PPanGGOLiN into a new conda environment. It's recommended to create a dedicated environment to avoid dependency conflicts.

```bash
conda create -n ppanggolin -c conda-forge -c bioconda ppanggolin
```

--------------------------------

### Install System Dependencies on Debian-based Systems

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/install.md

Installs required non-Python software packages using apt.

```bash
sudo apt install mmseqs2 infernal aragorn mafft
```

--------------------------------

### Display Pangenome File Information

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/PangenomeAnalyses/pangenomeInfo.md

Use the 'info' command to get comprehensive insights into a pangenome file. When no specific flag is provided, it displays all available information.

```bash
ppanggolin info -p pangenome.h5
```

--------------------------------

### Example GFF file content

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/writeGenomes.md

Sample output showing the structure of a GFF file with pangenome-specific attributes.

```gff
##gff-version 3
##sequence-region NC_010401.1 1 5644
##sequence-region NC_010402.1 1 9661
##sequence-region NC_010403.1 1 2726
##sequence-region NC_010404.1 1 94413
##sequence-region NC_010410.1 1 3936291
NC_010401.1	.	region	1	5644	.	+	.	ID=NC_010401.1;Is_circular=true
NC_010401.1	ppanggolin	region	629	5591	.	.	.	Name=NC_010401.1_RGP_0;spot=No_spot;Note=Region of Genomic Plasticity (RGP)
NC_010401.1	external	gene	629	1579	.	+	.	ID=gene-ABAYE_RS00005
NC_010401.1	external	CDS	629	1579	.	+	0	ID=ABAYE_RS00005;Parent=gene-ABAYE_RS00005;product=replication initiation protein;family=ABAYE_RS00005;partition=cloud;rgp=NC_010401.1_RGP_0
NC_010401.1	external	gene	1576	1863	.	+	.	ID=gene-ABAYE_RS00010
NC_010401.1	external	CDS	1576	1863	.	+	0	ID=ABAYE_RS00010;Parent=gene-ABAYE_RS00010;product=hypothetical protein;family=ABAYE_RS00010;partition=cloud;rgp=NC_010401.1_RGP_0
NC_010401.1	external	gene	2054	2572	.	-	.	ID=gene-ABAYE_RS00015
NC_010401.1	external	CDS	2054	2572	.	-	0	ID=ABAYE_RS00015;Parent=gene-ABAYE_RS00015;product=tetratricopeptide repeat protein;family=HTZ92_RS18670;partition=shell;rgp=NC_010401.1_RGP_0
```

--------------------------------

### Module information output format

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/Modules/moduleOutputs.md

Example structure of the statistical output generated by the info command.

```yaml
[...]
Modules:
    Number_of_modules: 380
    Families_in_Modules: 2242
    Partition_composition:
        Persistent: 0.27
        Shell: 37.69
        Cloud: 62.04
    Number_of_Families_per_Modules:
        min: 3
        max: 65
        sd: 5.84
        mean: 5.9
```

--------------------------------

### GFF Output with Metadata

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/writeGenomes.md

Example of how metadata appears within the attributes column of a GFF file.

```gff
NC_010404.1	external	CDS	77317	77958	.	-	0	ID=ABAYE_RS00475;Parent=gene-ABAYE_RS00475;product=putative metallopeptidase;family=DYB08_RS16060;partition=persistent;rgp=NC_010404.1_RGP_0;family_pfam_accession=PF18894;family_pfam_description=This entry represents a probable metallopeptidase domain found in a variety of phage and bacterial proteomes.;family_pfam_type=domain
```

--------------------------------

### Download Genomes using genome_updater.sh

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/QuickUsage/quickWorkflow.md

Example command to download genomes from NCBI RefSeq for a specific species using the genome_updater.sh script. This is a preparatory step for pangenome analysis.

```bash
genome_updater.sh -d "refseq" -o "B_japonicum_genomes" -M "gtdb" -T "s__Bradyrhizobium japonicum"
```

--------------------------------

### Run PPanGGOLiN annotation with annotation files

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/PangenomeAnalyses/pangenomeAnnotation.md

Initiate the annotation process using pre-existing annotation files (gff3, .gbk, .gbff) by providing a list file. The format is similar to the FASTA list file.

```bash
ppanggolin annotate --anno genomes.gbff.list
```

--------------------------------

### Run PPanGGOLiN Workflow with Fasta Files

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/PangenomeAnalyses/pangenomeWorkflow.md

Execute the pangenome workflow using fasta files. The input list should contain paths to fasta files.

```bash
ppanggolin workflow --fasta genomes.fasta.list
```

--------------------------------

### Run PPanGGOLiN annotation with FASTA files

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/PangenomeAnalyses/pangenomeAnnotation.md

Use this command to initiate the annotation process when providing a list of FASTA files. Ensure the 'genomes.fasta.list' file is correctly formatted.

```bash
ppanggolin annotate --fasta genomes.fasta.list
```

--------------------------------

### Build HTML Documentation

Source: https://github.com/labgem/ppanggolin/blob/master/docs/dev/buildDoc.md

Commands to generate the HTML documentation using Sphinx or the provided Makefile.

```bash
# Replace '/path/to/ppanggolin/' with your actual path
cd /path/to/ppanggolin/docs/
sphinx-build -b html . build/
```

```bash
# Replace '/path/to/ppanggolin/' with your actual path
cd /path/to/ppanggolin/docs/
make html
```

--------------------------------

### Convert RST to Markdown

Source: https://github.com/labgem/ppanggolin/blob/master/docs/dev/buildDoc.md

Installs and uses rst-to-myst to convert reStructuredText files to Markdown format.

```shell
pip install rst-to-myst
rst2myst convert index.rst
# remove rst file(s)
rm index.rst
```

--------------------------------

### Define Metadata for Gene Families

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/writeGenomes.md

Example TSV format for associating metadata with gene families.

```tsv
families	accession	type	description
DYB08_RS16060	PF18894	domain	This entry represents a probable metallopeptidase domain found in a variety of phage and bacterial proteomes.
```

--------------------------------

### Combine FASTA and annotation file inputs

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/PangenomeAnalyses/pangenomeAnnotation.md

Use both '--anno' and '--fasta' options simultaneously if your annotation files lack sequence data. This allows PPanGGOLiN to obtain both gene annotations and sequences.

```bash
ppanggolin annotate --anno genomes.gbff.list --fasta genomes.fasta.list
```

--------------------------------

### Display PPanGGOLiN Analysis Parameters

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/PangenomeAnalyses/pangenomeInfo.md

Use the '--parameters' option to view the PPanGGOLiN parameters used during the pangenome analysis. This output can serve as a configuration file for replicating analyses.

```bash
ppanggolin info --parameters -p pangenome.h5
```

--------------------------------

### Clone PPanGGOLiN Repository

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/install.md

Standard commands to clone the repository from GitHub.

```bash
git clone https://github.com/labgem/PPanGGOLiN.git
cd PPanGGOLiN
```

--------------------------------

### Run Basic PPanGGOLiN Workflow

Source: https://context7.com/labgem/ppanggolin/llms.txt

Performs core pangenome analysis steps: annotation, clustering, graph construction, and partitioning. This command excludes RGP and module detection. Supports annotation or FASTA files and allows customization of output directory, CPU usage, and base filename.

```bash
# Basic workflow with annotation files
ppanggolin workflow --anno genomes.gbff.list
```

```bash
# Basic workflow with FASTA files
ppanggolin workflow --fasta genomes.fasta.list
```

```bash
# With custom output directory and parameters
ppanggolin workflow --anno genomes.gbff.list \
    --output pangenome_results \
    --cpu 4 \
    --basename my_pangenome
```

--------------------------------

### Minimal clustering TSV file format

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/PangenomeAnalyses/pangenomeCluster.md

Example of the required TSV format where the first column is the family name and the second is the gene identifier.

```text
Family_A	Gene_1
Family_A	Gene_2
Family_A	Gene_3
Family_B	Gene_4
Family_B	Gene_5
Family_C	Gene_6
```

--------------------------------

### Run PPanGGOLiN Workflow with FASTA Files

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/QuickUsage/quickWorkflow.md

Execute the complete PPanGGOLiN workflow using a list of FASTA files. The input file specifies genome names, FASTA file paths, and circular contig identifiers.

```bash
ppanggolin all --fasta genomes.fasta.list
```

--------------------------------

### Mark fragmented genes in clustering file

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/PangenomeAnalyses/pangenomeCluster.md

Examples showing the placement of the 'F' flag for fragmented genes depending on the presence of a representative gene column.

```text
Family_A	Gene_1
Family_A	Gene_2
Family_A	Gene_3	F
Family_B	Gene_4
Family_B	Gene_5
Family_C	Gene_6	F
```

```text
Family_A	Gene_1	Gene_2
Family_A	Gene_2	Gene_2
Family_A	Gene_3	Gene_2	F
Family_B	Gene_4	Gene_4
Family_B	Gene_5	Gene_4
Family_C	Gene_6	Gene_6	F
```

--------------------------------

### ppanggolin.mod Package Overview

Source: https://github.com/labgem/ppanggolin/blob/master/docs/api/ppanggolin.mod.md

Documentation for the ppanggolin.mod package and the ppanggolin.mod.module submodule.

```APIDOC
## ppanggolin.mod Package

### Description
The ppanggolin.mod package provides core functionality for managing modules within the ppanggolin framework. It includes the ppanggolin.mod.module submodule.

### Submodules
- **ppanggolin.mod.module**: Contains logic related to module definitions and inheritance.

### Module Contents
- **ppanggolin.mod**: Main package entry point providing access to module management utilities.
```

--------------------------------

### Create Spots TSV with ppanggolin

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/RGP/rgpOutputs.md

Use this command to generate the `spots.tsv` file, which links genomic spots to RGPs. Ensure you have a pangenome file and specify an output directory.

```bash
ppanggolin write_pangenome -p pangenome.h5 --spots -o rgp_outputs
```

--------------------------------

### Run complete pangenome analysis with annotations

Source: https://github.com/labgem/ppanggolin/blob/master/README.md

Executes the full PPanGGOLiN workflow using pre-computed GFF or GBFF/GBK annotation files.

```bash
ppanggolin all --anno GENOMES_ANNOTATION_LIST
```

--------------------------------

### Run the full test suite

Source: https://github.com/labgem/ppanggolin/blob/master/docs/dev/contribute.md

Executes all unit and functional tests using 12 CPUs and displays verbose output.

```bash
pytest --full -v --cpu 12
```

--------------------------------

### ppanggolin.meta.meta module

Source: https://github.com/labgem/ppanggolin/blob/master/docs/api/ppanggolin.meta.md

Documentation for the ppanggolin.meta.meta submodule.

```APIDOC
## ppanggolin.meta.meta module

### Description
Details the members, undocumented members, and inheritance of the ppanggolin.meta.meta module.

### Method
N/A

### Endpoint
N/A

### Parameters
N/A

### Request Example
N/A

### Response
#### Success Response (200)
N/A

#### Response Example
N/A
```

--------------------------------

### Run complete pangenome analysis

Source: https://github.com/labgem/ppanggolin/blob/master/README.md

Executes the full PPanGGOLiN workflow using a list of FASTA files. Requires a TSV-formatted input file containing genome names and file paths.

```bash
ppanggolin all --fasta GENOMES_FASTA_LIST
```

--------------------------------

### Run PPanGGOLiN Workflow with Annotation Files

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/QuickUsage/quickWorkflow.md

Execute the complete PPanGGOLiN workflow using a list of annotation files (GFF or GBFF/GBK). Ensure annotation files include genomic DNA sequences.

```bash
ppanggolin all --anno genomes.gbff.list
```

--------------------------------

### Run Complete PPanGGOLiN Workflow

Source: https://context7.com/labgem/ppanggolin/llms.txt

Executes the full pangenome analysis pipeline, including annotation, clustering, graph construction, partitioning, RGP detection, and module detection. Use with annotation files (GFF/GBFF) or FASTA files. Custom parameters for CPU, output directory, identity, and coverage can be specified.

```bash
# Run complete workflow with annotation files (GFF/GBFF)
ppanggolin all --anno genomes.gbff.list
```

```bash
# Run complete workflow with FASTA files
ppanggolin all --fasta genomes.fasta.list
```

```bash
# With custom parameters
ppanggolin all --anno genomes.gbff.list \
    --cpu 8 \
    --output my_pangenome_output \
    --identity 0.8 \
    --coverage 0.8
```

```bash
# Example genomes.gbff.list format (tab-separated):
# genome_name1	/path/to/genome1.gbff
# genome_name2	/path/to/genome2.gbff
# genome_name3	/path/to/genome3.gbk
```

```bash
# Example genomes.fasta.list format (tab-separated, with optional circular contigs):
# genome_name1	/path/to/genome1.fasta	circular_contig_id
# genome_name2	/path/to/genome2.fasta
```

--------------------------------

### Run PanModule Workflow with Annotation Files

Source: https://context7.com/labgem/ppanggolin/llms.txt

Run the PanModule workflow using annotation files (GBFF) as input. Allows specifying an output directory and CPU usage.

```bash
ppanggolin panmodule --anno genomes.gbff.list \
    --output panmodule_results \
    --cpu 4
```

--------------------------------

### Generate Visualizations

Source: https://context7.com/labgem/ppanggolin/llms.txt

Creates plots for pangenome exploration. Options include U-curves, tile plots, and spot visualizations.

```bash
ppanggolin draw -p pangenome.h5 \
    --output figures \
    --ucurve
```

```bash
ppanggolin draw -p pangenome.h5 \
    --output figures \
    --tile_plot
```

```bash
ppanggolin draw -p pangenome.h5 \
    --output figures \
    --tile_plot \
    --nocloud
```

```bash
ppanggolin draw -p pangenome.h5 \
    --output figures \
    --tile_plot \
    --add_dendrogram
```

```bash
ppanggolin draw -p pangenome.h5 \
    --output figures \
    --draw_spots \
    --spots all
```

```bash
ppanggolin draw -p pangenome.h5 \
    --output figures \
    --ucurve \
    --tile_plot
```

--------------------------------

### Generate Default Configuration File Template

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/practicalInformation.md

Use this command to generate a template configuration file with default values for a specified command. This is useful for understanding available parameters and setting up custom configurations.

```bash
ppanggolin utils --default_config CMD
```

```bash
ppanggolin utils --default_config panrgp
```

--------------------------------

### Generate API Doc Files with sphinx-apidoc

Source: https://github.com/labgem/ppanggolin/blob/master/docs/dev/buildDoc.md

Use this command to generate the initial API documentation files in RST format. It creates an 'api' folder with the necessary structure.

```bash
# Generate API doc files
sphinx-apidoc -o api $PPanGGOLiN/ppanggolin
```

--------------------------------

### Display Pangenome Information

Source: https://context7.com/labgem/ppanggolin/llms.txt

Use the `info` command to display detailed statistics and computed analyses of a pangenome file. Specific details like status, parameters, or content can be retrieved using flags.

```bash
# Display pangenome information
ppanggolin info -p pangenome.h5
```

```bash
# Get specific information
ppanggolin info -p pangenome.h5 --status
```

```bash
ppanggolin info -p pangenome.h5 --parameters
```

```bash
ppanggolin info -p pangenome.h5 --content
```

--------------------------------

### ppanggolin.meta module contents

Source: https://github.com/labgem/ppanggolin/blob/master/docs/api/ppanggolin.meta.md

Documentation for the overall ppanggolin.meta module.

```APIDOC
## ppanggolin.meta module contents

### Description
Details the members, undocumented members, and inheritance of the ppanggolin.meta module.

### Method
N/A

### Endpoint
N/A

### Parameters
N/A

### Request Example
N/A

### Response
#### Success Response (200)
N/A

#### Response Example
N/A
```

--------------------------------

### Run panRGP with annotation or fasta files

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/RGP/rgpPrediction.md

Executes the full panRGP workflow using either GFF3/GBFF annotation files or FASTA files as input.

```bash
ppanggolin panrgp --anno genomes.gbff.list
```

```bash
ppanggolin panrgp --fasta genomes.fasta.list
```

--------------------------------

### Run default MSA

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/MSA.md

Executes the msa command on a pangenome file, defaulting to the 'core' partition.

```bash
ppanggolin msa -p pangenome.h5
```

--------------------------------

### Generate Partition Files with ppanggolin

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/PangenomeAnalyses/pangenomeStat.md

Use the 'write_pangenome' subcommand with the '--partitions' flag to generate partition files. These files list gene family identifiers for each partition.

```bash
ppanggolin write_pangenome -p pangenome.h5 --partitions
```

--------------------------------

### ppanggolin.graph Package Contents

Source: https://github.com/labgem/ppanggolin/blob/master/docs/api/ppanggolin.graph.md

General documentation for the ppanggolin.graph package contents.

```APIDOC
## ppanggolin.graph

### Description
Core package for graph-related operations in ppanggolin. This module serves as the base for graph management and analysis tools.
```

--------------------------------

### ppanggolin.projection.projection module

Source: https://github.com/labgem/ppanggolin/blob/master/docs/api/ppanggolin.projection.md

Documentation for the ppanggolin.projection.projection module, listing its members, undocumented members, and inheritance.

```APIDOC
## ppanggolin.projection.projection module

### Description
Documentation for the ppanggolin.projection.projection module.

### Members
(Members are listed by the automodule directive)

### Undocumented Members
(Undocumented members are listed by the automodule directive)

### Inheritance
(Inheritance information is listed by the automodule directive)
```

--------------------------------

### ppanggolin.graph.makeGraph Module

Source: https://github.com/labgem/ppanggolin/blob/master/docs/api/ppanggolin.graph.md

Documentation for the makeGraph module which handles the construction of graphs within the ppanggolin framework.

```APIDOC
## ppanggolin.graph.makeGraph

### Description
This module provides functionality to construct graphs for the ppanggolin project. It includes members and inheritance structures for graph generation.

### Module Contents
- **makeGraph**: Contains functions and classes for graph creation and manipulation.
```

--------------------------------

### Draw All Spots in Pangenome

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/RGP/rgpOutputs.md

The --draw_spots option with --spots all generates an interactive HTML figure and a GEXF graph file for all spots in the pangenome. The GEXF file represents a subgraph of the pangenome for each spot.

```bash
ppanggolin draw -p pangenome.h5 --draw_spots --spots all
```

--------------------------------

### Annotate Genomes with PPanGGOLiN

Source: https://context7.com/labgem/ppanggolin/llms.txt

Prepares the initial pangenome structure by predicting genes. Supports FASTA files (using Prodigal) or pre-annotated GFF/GBFF files. Options include specifying output directory, CPU cores, genome kingdom, translation table, pseudogene inclusion, and meta mode for fragmented genomes.

```bash
# Annotate from FASTA files (uses Prodigal for gene prediction)
ppanggolin annotate --fasta genomes.fasta.list \
    --output annotated_output \
    --cpu 4 \
    --kingdom bacteria \
    --translation_table 11
```

```bash
# Annotate from pre-existing annotation files
ppanggolin annotate --anno genomes.gbff.list \
    --output annotated_output \
    --cpu 4
```

```bash
# Include pseudogenes from annotation files
ppanggolin annotate --anno genomes.gbff.list \
    --use_pseudo \
    --output annotated_output
```

```bash
# Use meta mode for fragmented genomes (MAGs)
ppanggolin annotate --fasta genomes.fasta.list \
    --prodigal_procedure meta \
    --output mag_annotated
```

--------------------------------

### ppanggolin.nem.partition Module

Source: https://github.com/labgem/ppanggolin/blob/master/docs/api/ppanggolin.nem.md

Details the members, undocumented members, and inheritance of the ppanggolin.nem.partition module.

```APIDOC
## ppanggolin.nem.partition Module

### Description
This section details the members, undocumented members, and inheritance of the `ppanggolin.nem.partition` module.

### Method
N/A

### Endpoint
N/A

### Parameters
N/A

### Request Example
N/A

### Response
N/A
```

--------------------------------

### Run PanRGP Workflow with FASTA Input

Source: https://context7.com/labgem/ppanggolin/llms.txt

Run the PanRGP workflow using FASTA files as input. Allows specifying CPU usage and an output directory for results.

```bash
ppanggolin panrgp --fasta genomes.fasta.list \
    --cpu 8 \
    --output panrgp_results
```

--------------------------------

### Partitioning a Pangenome

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/PangenomeAnalyses/pangenomePartition.md

Executes the partitioning process on a specified pangenome HDF5 file.

```bash
ppanggolin partition -p pangenome.h5
```

--------------------------------

### ppanggolin.cluster module

Source: https://github.com/labgem/ppanggolin/blob/master/docs/api/ppanggolin.cluster.md

Documentation for the ppanggolin.cluster module and its submodules.

```APIDOC
## ppanggolin.cluster module

### Description
The ppanggolin.cluster module provides the core functionality for clustering in the ppanggolin project. It includes the ppanggolin.cluster.cluster submodule.

### Module Contents
- **ppanggolin.cluster.cluster**: Contains clustering logic and implementations.
```

--------------------------------

### ppanggolin.metrics.metrics Module

Source: https://github.com/labgem/ppanggolin/blob/master/docs/api/ppanggolin.metrics.md

API documentation for the metrics submodule, detailing its members, undocumented members, and inheritance.

```APIDOC
## ppanggolin.metrics.metrics Module

### Description
API documentation for the metrics submodule.

### Members
(Details of members, undocumented members, and inheritance would be listed here if available in the source.)
```

--------------------------------

### ppanggolin.metrics.fluidity Module

Source: https://github.com/labgem/ppanggolin/blob/master/docs/api/ppanggolin.metrics.md

API documentation for the fluidity submodule, detailing its members, undocumented members, and inheritance.

```APIDOC
## ppanggolin.metrics.fluidity Module

### Description
API documentation for the fluidity submodule.

### Members
(Details of members, undocumented members, and inheritance would be listed here if available in the source.)
```

--------------------------------

### Retrieve module statistics

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/Modules/moduleOutputs.md

Execute the info command to display summary statistics for predicted modules from a pangenome file.

```bash
ppanggolin info -p pangenome.h5 --content
```

--------------------------------

### Basic Alignment Command

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/align.md

Use this command to align sequences to a pangenome and generate default output files. Ensure you have a pangenome file (.h5) and a FASTA file of sequences to align.

```bash
ppanggolin align -p pangenome.h5 -o MYOUTPUTDIR --sequences MY_SEQUENCES_OF_INTEREST.fasta
```

--------------------------------

### Generate Proksee JSON map files

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/writeGenomes.md

Creates JSON files compatible with the Proksee visualization tool.

```bash
ppanggolin write_genomes -p pangenome.h5 --proksee -o output
```

--------------------------------

### Run PPanGGOLiN Workflow with Custom Clusters

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/PangenomeAnalyses/pangenomeWorkflow.md

This command allows you to integrate your own gene clustering results into the workflow. Provide the annotation files and the path to your custom clusters TSV file.

```bash
ppanggolin workflow --anno genomes.gbff.list --clusters clusters.tsv
```

--------------------------------

### Execute the panModule workflow

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/Modules/modulePrediction.md

Runs the complete pangenome generation and module prediction pipeline from a list of genome files.

```bash
ppanggolin panmodule --fasta GENOME_LIST_FILE
```

--------------------------------

### ppanggolin.projection module contents

Source: https://github.com/labgem/ppanggolin/blob/master/docs/api/ppanggolin.projection.md

Documentation for the contents of the ppanggolin.projection module, listing its members, undocumented members, and inheritance.

```APIDOC
## ppanggolin.projection module contents

### Description
Documentation for the contents of the ppanggolin.projection module.

### Members
(Members are listed by the automodule directive)

### Undocumented Members
(Undocumented members are listed by the automodule directive)

### Inheritance
(Inheritance information is listed by the automodule directive)
```

--------------------------------

### Alignment with Detailed RGP and Spot Information

Source: https://context7.com/labgem/ppanggolin/llms.txt

Perform sequence alignment and retrieve detailed RGP and spot information for the aligned sequences. Use the --getinfo flag for this extended output.

```bash
ppanggolin align -p pangenome.h5 \
    --sequences proteins.fasta \
    --output alignment_results \
    --getinfo
```

--------------------------------

### ppanggolin.formats.writeBinaries Module

Source: https://github.com/labgem/ppanggolin/blob/master/docs/api/ppanggolin.formats.md

This module contains functions for writing binary data.

```APIDOC
## ppanggolin.formats.writeBinaries Module

### Description
Provides functionality for writing binary data.

### Members
- writeBinaries(binaries, path)

### Parameters
#### writeBinaries
- **binaries** (list) - A list of binary data structures to write.
- **path** (str) - The path to save the binary data.
```

--------------------------------

### ppanggolin.annotate.synta module

Source: https://github.com/labgem/ppanggolin/blob/master/docs/api/ppanggolin.annotate.md

Provides functionality for syntactic annotation (gene and RNA feature prediction) within the ppanggolin framework.

```APIDOC
## ppanggolin.annotate.synta

### Description
This module provides tools and methods for the syntactic annotation of genomes, that is, predicting genes and RNA features from genomic sequences.
```

--------------------------------

### Custom Number of Partitions (K value)

Source: https://context7.com/labgem/ppanggolin/llms.txt

Specify a custom number of partitions (K value) for the pangenome partitioning process. Requires the pangenome file and the desired number of partitions.

```bash
ppanggolin partition -p pangenome.h5 \
    -K 3 \
    --cpu 4
```

--------------------------------

### ppanggolin.formats.write_proksee Module

Source: https://github.com/labgem/ppanggolin/blob/master/docs/api/ppanggolin.formats.md

This module is used for writing data in the Proksee format.

```APIDOC
## ppanggolin.formats.write_proksee Module

### Description
Provides functionality for writing data in the Proksee format.

### Members
- write_proksee(pangenome, path)

### Parameters
#### write_proksee
- **pangenome** (Pangenome object) - The pangenome object to write.
- **path** (str) - The path to save the Proksee formatted data.
```

--------------------------------

### Show Pangenome Metadata Information

Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/PangenomeAnalyses/pangenomeInfo.md

When metadata is available, the '--metadata' option displays which pangenome elements have associated metadata and their sources.

```bash
ppanggolin info --metadata -p pangenome.h5
```

--------------------------------

### Force Specific Number of Partitions

Source: https://context7.com/labgem/ppanggolin/llms.txt

Set a fixed number of partitions with `-K` and pass `--force` to overwrite results already written to the pangenome file. Useful for re-running partitioning with a different, fixed number of partitions.

```bash ppanggolin partition -p pangenome.h5 \ -K 4 \ --force ``` -------------------------------- ### ppanggolin.formats.writeSequences Module Source: https://github.com/labgem/ppanggolin/blob/master/docs/api/ppanggolin.formats.md This module provides functions for writing sequence data. ```APIDOC ## ppanggolin.formats.writeSequences Module ### Description Provides functionality for writing sequence data. ### Members - writeSequences(sequences, path) ### Parameters #### writeSequences - **sequences** (list) - A list of sequence objects to write. - **path** (str) - The path to save the sequence data. ``` -------------------------------- ### ppanggolin.metrics Module Contents Source: https://github.com/labgem/ppanggolin/blob/master/docs/api/ppanggolin.metrics.md API documentation for the top-level metrics module, detailing its members, undocumented members, and inheritance. ```APIDOC ## ppanggolin.metrics Module Contents ### Description API documentation for the top-level metrics module. ### Members (Details of members, undocumented members, and inheritance would be listed here if available in the source.) ``` -------------------------------- ### ppanggolin.info.info Module Source: https://github.com/labgem/ppanggolin/blob/master/docs/api/ppanggolin.info.md Details of the ppanggolin.info.info module, including its members, undocumented members, and inheritance. ```APIDOC ## ppanggolin.info.info Module ### Description This section details the `ppanggolin.info.info` module, listing its members, undocumented members, and inheritance information. 
### Members (Members are listed here based on the `automodule` directive) ### Undocumented Members (Undocumented members are listed here based on the `automodule` directive) ### Inheritance (Inheritance information is displayed here based on the `automodule` directive) ``` -------------------------------- ### ppanggolin.info Module Contents Source: https://github.com/labgem/ppanggolin/blob/master/docs/api/ppanggolin.info.md Overview of the ppanggolin.info module's contents, including its members, undocumented members, and inheritance. ```APIDOC ## ppanggolin.info Module Contents ### Description This section provides an overview of the `ppanggolin.info` module's contents, including its members, undocumented members, and inheritance information. ### Members (Members are listed here based on the `automodule` directive) ### Undocumented Members (Undocumented members are listed here based on the `automodule` directive) ### Inheritance (Inheritance information is displayed here based on the `automodule` directive) ``` -------------------------------- ### Generate Module Description Tables Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/Modules/moduleOutputs.md Use this command to generate TSV files describing functional modules, their presence in genomes, and a summary of module characteristics. Ensure the output directory exists. ```bash ppanggolin write_pangenome -p pangenome.h5 --modules -o my_output_dir ``` -------------------------------- ### Custom Module Detection Parameters Source: https://context7.com/labgem/ppanggolin/llms.txt Customize module detection parameters, including module size, minimum family count per partition, transitive closure, and Jaccard index threshold. These parameters refine module identification. 
```bash ppanggolin module -p pangenome.h5 \ --size 3 \ --min_fam_in_partition 5 \ --transitive 4 \ --jaccard 0.85 ``` -------------------------------- ### Add Metadata to Pangenome Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/writeGenomes.md Associates a metadata file with pangenome elements (here, gene families) under a named source (here, pfam) in the pangenome file. ```bash ppanggolin metadata -p pangenome.h5 --source pfam --metadata family_pfam_annotation.tsv --assign families ``` -------------------------------- ### Build Pangenome Graph Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/PangenomeAnalyses/pangenomeGraph.md Use this subcommand to construct a pangenome graph from a .h5 file. It computes edges to link gene families based on genomic neighborhood. ```bash ppanggolin graph -p pangenome.h5 ``` -------------------------------- ### Run default gene clustering Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/PangenomeAnalyses/pangenomeCluster.md Executes the default clustering process using the generated pangenome HDF5 file. ```bash ppanggolin cluster -p pangenome.h5 ``` -------------------------------- ### ppanggolin.figures.draw_spot Source: https://github.com/labgem/ppanggolin/blob/master/docs/api/ppanggolin.figures.md Module for drawing spot-related visualizations in pangenome analysis. ```APIDOC ## ppanggolin.figures.draw_spot ### Description This module provides functionality to draw spots of insertion, i.e. groups of Regions of Genomic Plasticity (RGPs), for pangenome visualizations. ``` -------------------------------- ### ppanggolin.formats.writeMetadata Module Source: https://github.com/labgem/ppanggolin/blob/master/docs/api/ppanggolin.formats.md This module contains functions for writing metadata. ```APIDOC ## ppanggolin.formats.writeMetadata Module ### Description Provides functionality for writing metadata.
### Members - writeMetadata(metadata, path) ### Parameters #### writeMetadata - **metadata** (dict) - A dictionary containing metadata. - **path** (str) - The path to save the metadata. ``` -------------------------------- ### Map Modules with Spots and RGPs Source: https://github.com/labgem/ppanggolin/blob/master/docs/user/Modules/moduleOutputs.md Execute this command to generate files that associate predicted modules with Spots of insertion and Regions of Genomic Plasticity (RGPs). This requires modules, spots, and RGPs to have been previously computed. ```bash ppanggolin write_pangenome -p pangenome.h5 --spot_modules -o my_output_dir ``` -------------------------------- ### Partition Subcommand Options Source: https://context7.com/labgem/ppanggolin/llms.txt The partition subcommand is a command-line tool, not an HTTP endpoint. It assigns the gene families of a pangenome to partitions (persistent, shell, and cloud). It accepts a custom number of partitions (K value); --force allows writing over results already stored in the pangenome file. ```APIDOC ## ppanggolin partition ### Description Command-line subcommand that assigns the gene families of a pangenome to partitions (persistent, shell, and cloud). ### Parameters #### Command-line Options - **-p** (string) - Required - Path to the pangenome file (e.g., pangenome.h5). - **-K** (integer) - Optional - Custom number of partitions (K value). - **--cpu** (integer) - Optional - Number of CPUs to use. - **--force** (flag) - Optional - Force writing in the pangenome output file, overwriting existing results. ``` -------------------------------- ### Custom Spot Detection Parameters Source: https://context7.com/labgem/ppanggolin/llms.txt Customize spot detection parameters, such as set size, overlapping match threshold, and exact match size. These parameters control how RGPs are grouped into spots.
```bash ppanggolin spot -p pangenome.h5 \ --set_size 3 \ --overlapping_match 2 \ --exact_match_size 1 ``` -------------------------------- ### ppanggolin.nem Module Contents Source: https://github.com/labgem/ppanggolin/blob/master/docs/api/ppanggolin.nem.md Details the members, undocumented members, and inheritance of the main ppanggolin.nem module. ```APIDOC ## ppanggolin.nem Module Contents ### Description This section details the members, undocumented members, and inheritance of the main `ppanggolin.nem` module. ```
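--------------------------------

### Sketch: Wrapping the Custom Spot Detection Command

A minimal sketch of how the `ppanggolin spot` call from the Custom Spot Detection Parameters entry might be wrapped in a script. The function name `run_spot` and the `DRY_RUN` variable are hypothetical and not part of PPanGGOLiN; the flags and the fallback values (3, 2, 1) are simply those used in that example, not confirmed tool defaults. With `DRY_RUN=1` the command is only printed, so the sketch can be tried without PPanGGOLiN installed.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper (an illustration, not part of PPanGGOLiN).
# Builds the `ppanggolin spot` command and either prints or runs it.

run_spot() {
    local pangenome="$1"
    # Fallback values are taken from the example command, as an assumption.
    local set_size="${2:-3}"
    local overlapping_match="${3:-2}"
    local exact_match_size="${4:-1}"

    # Keep the command in an array so arguments with spaces stay intact.
    local cmd=(ppanggolin spot -p "$pangenome"
               --set_size "$set_size"
               --overlapping_match "$overlapping_match"
               --exact_match_size "$exact_match_size")

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Dry run: print the command instead of executing it.
        echo "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
}

# Print the command for the example pangenome file without running it.
DRY_RUN=1 run_spot pangenome.h5
```

Building the command as an array keeps each flag and its value as separate arguments, which avoids quoting problems if the pangenome path contains spaces.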