### Install CellMixS using BiocManager

Source: https://github.com/almutlue/cellmixs/blob/devel/README.md

This snippet installs the CellMixS package with BiocManager in R. It first installs BiocManager if it is missing and then installs CellMixS. Because the package is given as `almutlue/CellMixS`, BiocManager fetches it from the GitHub repository rather than from the Bioconductor release.

```r
if (!requireNamespace("BiocManager"))
    install.packages("BiocManager")
BiocManager::install("almutlue/CellMixS")
```

--------------------------------

### Create Comprehensive Overview with visOverview

Source: https://context7.com/almutlue/cellmixs/llms.txt

Generates a multi-panel visualization combining batch labels, metric scores, and metadata to provide a holistic view of integration quality.

```R
library(CellMixS)
library(SingleCellExperiment)

# Load example data and calculate cms
sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[[1]][, c(1:30, 300:330)]
sce_cms <- cms(sce, group = "batch", k = 20, n_dim = 2)

# Basic overview with batch and cms
visOverview(sce_cms, group = "batch")

# Add additional metadata
sce_cms$celltype <- sample(c("CD4+", "CD8+", "CD3"), ncol(sce_cms), replace = TRUE)
visOverview(sce_cms, group = "batch", other_var = "celltype")

# Show multiple metrics
sce_cms <- evalIntegration(c("isi", "entropy"), sce_cms, "batch", k = 20, n_dim = 2)
visOverview(sce_cms, group = "batch", metric = c("cms_smooth", "isi", "entropy"), prefix = FALSE)

# Use -log10 scale for p-values
visOverview(sce_cms, group = "batch", metric = "cms", log10_val = TRUE)
```

--------------------------------

### visOverview

Source: https://context7.com/almutlue/cellmixs/llms.txt

Creates comprehensive overview plots showing batch labels, metric scores, and additional metadata side by side.

```APIDOC
## visOverview

### Description
Create comprehensive overview plots showing batch labels, metric scores, and additional metadata side by side. Combines multiple visualizations for comparing batch distribution and metric scores in a single figure.

### Method
(Implicitly a function call within R, not a standard HTTP method)

### Endpoint
(Not applicable, this is an R function)

### Parameters
#### Path Parameters
(None)

#### Query Parameters
(None)

#### Request Body
(None)

### Request Example
```r
library(CellMixS)
library(SingleCellExperiment)

# Load example data and calculate cms
sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[[1]][, c(1:30, 300:330)]
sce_cms <- cms(sce, group = "batch", k = 20, n_dim = 2)

# Basic overview with batch and cms
visOverview(sce_cms, group = "batch")

# Add additional metadata
sce_cms$celltype <- sample(c("CD4+", "CD8+", "CD3"), ncol(sce_cms), replace = TRUE)
visOverview(sce_cms, group = "batch", other_var = "celltype")

# Show multiple metrics
sce_cms <- evalIntegration(c("isi", "entropy"), sce_cms, "batch", k = 20, n_dim = 2)
visOverview(sce_cms, group = "batch", metric = c("cms_smooth", "isi", "entropy"), prefix = FALSE)

# Use -log10 scale for p-values
visOverview(sce_cms, group = "batch", metric = "cms", log10_val = TRUE)
```

### Response
#### Success Response (200)
(Generates a multi-panel plot visualization)

#### Response Example
(Visual plot output, no specific data structure to show)
```

--------------------------------

### visIntegration

Source: https://context7.com/almutlue/cellmixs/llms.txt

Create summary plots comparing metric scores across different integration methods. Visualize and compare metric distributions using violin or ridge plots to evaluate integration quality.

```APIDOC
## visIntegration

### Description
Create summary plots comparing metric scores across different integration methods. Visualize and compare metric distributions using violin or ridge plots to evaluate integration quality.

### Method
`visIntegration(sce, metric, prefix = FALSE, violin = FALSE, metric_name = NULL)`

### Parameters
#### Path Parameters
None

#### Query Parameters
- **sce** (SingleCellExperiment or data.frame) - The input object containing the data.
- **metric** (character) - The metric to visualize. If prefix is TRUE, this is the prefix of the metric column(s) in `colData(sce)`.
- **prefix** (logical, optional) - If TRUE, `metric` is treated as a prefix for column names in `colData(sce)`.
- **violin** (logical, optional) - If TRUE, use violin plots instead of ridge plots.
- **metric_name** (character, optional) - A custom name for the metric being plotted.

### Request Example
```r
library(CellMixS)
library(SingleCellExperiment)

sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[["batch20"]][, c(1:30, 300:320)]
sce <- cms(sce, group = "batch", k = 20, dim_red = "MNN", res_name = "MNN", n_dim = 2)
sce <- cms(sce, group = "batch", k = 20, dim_red = "PCA", res_name = "unaligned", n_dim = 2)

# Compare integration methods with ridge plots (default)
visIntegration(sce, metric = "cms.", prefix = TRUE)

# Use violin plots
visIntegration(sce, metric = "cms.", prefix = TRUE, violin = TRUE)

# Compare ldfDiff scores
# (assumes diff_ldf columns are present in colData, e.g. after running ldfDiff())
visIntegration(sce, metric = "diff_ldf", metric_name = "ldfDiff")

# Compare from data frame
method_comparison <- data.frame(
  MNN = sce$cms.MNN,
  unaligned = sce$cms.unaligned
)
visIntegration(method_comparison, metric_name = "CMS")
```

### Response
Returns a ggplot2 object representing the visualization.

#### Success Response (200)
- **ggplot2 object**: A plot comparing metric scores across integration methods.

#### Response Example
(A ggplot2 object is returned, not a JSON example)
```

--------------------------------

### Visualize Metric Distributions with visHist

Source: https://context7.com/almutlue/cellmixs/llms.txt

Generates p-value histograms for metric scores to assess batch mixing.
A flat distribution suggests good mixing, while peaks at low values indicate potential batch effects.

```R
library(CellMixS)
library(SingleCellExperiment)

# Load example data and calculate cms
sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[[1]][, c(1:50)]
sce_cms <- cms(sce, group = "batch", k = 20, n_dim = 2)

# Plot histogram of cms scores
visHist(sce_cms)

# Plot multiple metrics side by side
visHist(sce_cms, metric = "cms", n_col = 2)

# Plot from a data.frame of scores (the last two columns are simulated)
cms_mat <- data.frame(
  batch0 = sce_cms$cms,
  batch20 = runif(ncol(sce_cms)),
  batch50 = rbeta(ncol(sce_cms), 0.5, 2)
)
visHist(cms_mat, n_col = 3)
```

--------------------------------

### evalIntegration - Evaluate Data Integration Metrics

Source: https://context7.com/almutlue/cellmixs/llms.txt

A wrapper function to evaluate data integration using multiple metrics including cms, ldfDiff, entropy, inverse Simpson index (isi), mixingMetric, and localStructure. It provides a unified interface for computing various batch mixing and structure preservation metrics.

```APIDOC
## evalIntegration

### Description
Wrapper function to evaluate data integration using multiple metrics including cms, ldfDiff, entropy, inverse Simpson index (isi), mixingMetric, and localStructure. This function provides a unified interface for computing multiple batch mixing and structure preservation metrics. Mixing metrics (cms, isi, entropy, mixingMetric) evaluate how well batches are mixed after integration. Structure metrics (ldfDiff, localStructure) evaluate how well the integration preserves within-batch relationships.

### Method
`evalIntegration`

### Parameters
- **metrics** (character vector) - Names of the metrics to compute. Can include "cms", "isi", "entropy", "mixingMetric", "ldfDiff", "localStructure".
- **sce** (SingleCellExperiment) - The SingleCellExperiment object containing the data and potentially integrated embeddings.
- **group** (character) - The column name in `colData(sce)` that specifies the batch or group information.
- **k** (integer) - The number of nearest neighbors to consider for metric calculation.
- **n_dim** (integer, optional) - The number of dimensions to use for dimensionality reduction if needed (e.g., for PCA).
- **cell_min** (integer, optional) - Minimum number of cells required in a batch for certain calculations.
- **sce_pre_list** (list, optional) - A list of SingleCellExperiment objects representing pre-integration data, required for metrics like "ldfDiff". The list should be named according to the group levels.
- **dim_combined** (character, optional) - The name of the integrated dimensionality reduction (e.g., "MNN") to use for structure preservation metrics.
- **assay_name** (character, optional) - The name of the assay to use for calculations, especially for structure preservation metrics.
- **res_name** (character vector, optional) - Custom names for the resulting metric columns in `colData(sce)`.

### Request Example
```r
library(CellMixS)
library(SingleCellExperiment)

# Load example data
sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[[1]][, c(1:15, 300:320, 16:30)]

# Calculate multiple mixing metrics at once
sce <- evalIntegration(
  metrics = c("cms", "isi", "entropy", "mixingMetric"),
  sce = sce,
  group = "batch",
  k = 20,
  n_dim = 2,
  cell_min = 4
)

# Calculate ldfDiff metric (requires pre-integration data)
sce_batch1 <- sce[, colData(sce)$batch == "1"]
sce_batch2 <- sce[, colData(sce)$batch == "2"]
pre <- list("1" = sce_batch1, "2" = sce_batch2)
sce <- evalIntegration(
  metrics = "ldfDiff",
  sce = sce,
  group = "batch",
  k = 20,
  sce_pre_list = pre
)

# Calculate localStructure metric (requires integrated embedding)
sce <- evalIntegration(
  metrics = "localStructure",
  sce = sce,
  group = "batch",
  k = 20,
  dim_combined = "MNN",
  assay_name = "counts"
)

# Combine multiple metrics with custom result names
sce <- evalIntegration(
  metrics = c("isi", "entropy"),
  sce = sce,
  group = "batch",
  k = 30,
  n_dim = 2,
  res_name =
  c("weighted_isi", "batch_entropy")
)
```

### Response
- **sce** (SingleCellExperiment) - The input SingleCellExperiment object with added columns in `colData(sce)` containing the calculated metric scores.
```

--------------------------------

### Visualize Integration Method Performance

Source: https://context7.com/almutlue/cellmixs/llms.txt

Compares metric scores across different integration methods using ridge or violin plots. It accepts either a SingleCellExperiment object or a data frame containing metric values.

```r
library(CellMixS)
library(SingleCellExperiment)

# Load example data
sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[["batch20"]][, c(1:30, 300:320)]

# Run cms with different methods
sce <- cms(sce, group = "batch", k = 20, dim_red = "MNN", res_name = "MNN", n_dim = 2)
sce <- cms(sce, group = "batch", k = 20, dim_red = "PCA", res_name = "unaligned", n_dim = 2)

# Compare integration methods with ridge plots (default)
visIntegration(sce, metric = "cms.", prefix = TRUE)

# Use violin plots
visIntegration(sce, metric = "cms.", prefix = TRUE, violin = TRUE)

# Compare ldfDiff scores
# (assumes diff_ldf columns are present in colData, e.g. after running ldfDiff())
visIntegration(sce, metric = "diff_ldf", metric_name = "ldfDiff")

# Compare from data frame
method_comparison <- data.frame(
  MNN = sce$cms.MNN,
  unaligned = sce$cms.unaligned
)
visIntegration(method_comparison, metric_name = "CMS")
```

--------------------------------

### Visualize Metric Scores by Cluster

Source: https://context7.com/almutlue/cellmixs/llms.txt

Compares metric distributions across cell types, clusters, or batch variables to identify systematic differences in integration quality.
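
A numerical summary can complement the per-cluster plots. The following is a minimal base-R sketch, not part of CellMixS: `cell_data` is a simulated stand-in for `as.data.frame(colData(sce_cms))`, and the 0.05 cut-off for "low" scores is an illustrative assumption.

```r
# Simulated stand-in for as.data.frame(colData(sce_cms)); values are not real results.
set.seed(1)
cell_data <- data.frame(
  celltype = rep(c("CD4+", "CD8+", "CD3"), length.out = 51),
  cms      = runif(51)
)

# Median score per cluster.
median_by_cluster <- tapply(cell_data$cms, cell_data$celltype, median)

# Share of cells per cluster with a low score (0.05 is an assumed cut-off).
frac_low_by_cluster <- tapply(
  cell_data$cms,
  cell_data$celltype,
  function(x) mean(x < 0.05)
)

median_by_cluster
frac_low_by_cluster
```

A cluster whose median is clearly lower, or whose share of low scores is clearly higher, than the others is a candidate for closer inspection in the plots.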

```r
library(CellMixS)
library(SingleCellExperiment)

# Load example data and calculate cms
sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[[1]][, c(1:30, 300:320)]
sce_cms <- cms(sce, group = "batch", k = 20, n_dim = 2)

# Add celltype labels
sce_cms$celltype <- rep(c("CD4+", "CD8+", "CD3"), length.out = ncol(sce_cms))

# Compare cms scores across celltypes (ridge plot)
visCluster(sce_cms, cluster_var = "celltype", metric_var = "cms")

# Use violin plot
visCluster(sce_cms, cluster_var = "celltype", metric_var = "cms", violin = TRUE)

# Compare across batches
visCluster(sce_cms, cluster_var = "batch", metric_var = "cms_smooth")
```

--------------------------------

### Calculate Cell-specific Mixing Score (cms)

Source: https://github.com/almutlue/cellmixs/blob/devel/README.md

Demonstrates how to calculate the Cell-specific Mixing Score (cms) using the `cms` function from CellMixS. This metric helps detect batch effects among each cell's k-nearest neighbors. It requires a SingleCellExperiment object, the number of neighbors (k), and the batch variable.

```r
sce_cms <- cms(sce, k = 70, group = "batch")
```

--------------------------------

### locStructure

Source: https://context7.com/almutlue/cellmixs/llms.txt

Calculates the local structure metric by comparing k-nearest neighbor overlap before and after integration within each batch.

```APIDOC
## locStructure

### Description
Calculate the local structure metric that compares k-nearest neighbor overlap before and after integration within each batch. For each batch, this function calculates k-nearest neighbors within PCA space before integration and compares them to the knn within the integrated representation. The score represents the proportion of overlapping neighbors - higher values indicate better preservation of local structure.

### Method
(Implicitly a function call within R, not a standard HTTP method)

### Endpoint
(Not applicable, this is an R function)

### Parameters
#### Path Parameters
(None)

#### Query Parameters
(None)

#### Request Body
(None)

### Request Example
```r
library(CellMixS)
library(SingleCellExperiment)

# Load example data
sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[["batch20"]][, c(1:50, 300:350)]

# Calculate local structure preservation
# Compares PCA neighbors before integration to MNN neighbors after
sce <- locStructure(
  sce,
  group = "batch",
  dim_combined = "MNN",   # integrated embedding
  k = 20,
  dim_red = "PCA",        # original embedding
  assay_name = "counts",
  n_dim = 10,
  n_combined = 10
)
```

### Response
#### Success Response (200)
(The function returns the SingleCellExperiment object with an 'overlap' column added to colData; assign the result, as in the example, because R does not modify the input in place)

#### Response Example
```r
# Access overlap scores (higher = better structure preservation)
head(colData(sce)$overlap)
#> [1] 0.85 0.90 0.78 0.82 0.88 0.75
```
```

--------------------------------

### Evaluate Data Integration with CellMixS

Source: https://context7.com/almutlue/cellmixs/llms.txt

The evalIntegration function provides a unified interface for computing multiple batch mixing and structure preservation metrics. Mixing metrics evaluate how well batches are mixed after integration, while structure metrics evaluate how well the integration preserves within-batch relationships. It supports various metrics and allows for custom result naming.

```r
library(CellMixS)
library(SingleCellExperiment)

# Load example data
sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[[1]][, c(1:15, 300:320, 16:30)]

# Calculate multiple mixing metrics at once
sce <- evalIntegration(
  metrics = c("cms", "isi", "entropy", "mixingMetric"),
  sce = sce,
  group = "batch",
  k = 20,
  n_dim = 2,
  cell_min = 4
)

# Check available metrics in colData
names(colData(sce))

# Calculate ldfDiff metric (requires pre-integration data)
sce_batch1 <- sce[, colData(sce)$batch == "1"]
sce_batch2 <- sce[, colData(sce)$batch == "2"]
pre <- list("1" = sce_batch1, "2" = sce_batch2)
sce <- evalIntegration(
  metrics = "ldfDiff",
  sce = sce,
  group = "batch",
  k = 20,
  sce_pre_list = pre
)

# Calculate localStructure metric (requires integrated embedding)
sce <- evalIntegration(
  metrics = "localStructure",
  sce = sce,
  group = "batch",
  k = 20,
  dim_combined = "MNN",
  assay_name = "counts"
)

# Combine multiple metrics with custom result names
sce <- evalIntegration(
  metrics = c("isi", "entropy"),
  sce = sce,
  group = "batch",
  k = 30,
  n_dim = 2,
  res_name = c("weighted_isi", "batch_entropy")
)
```

--------------------------------

### Calculate Local Density Factor Differences (ldfDiff)

Source: https://context7.com/almutlue/cellmixs/llms.txt

Compute ldfDiff to measure how data integration methods alter batch-internal structures. Scores close to 0 indicate that the integration method successfully preserved the local density structure.
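
To make "close to 0" easier to judge across many cells, the per-cell scores can be condensed into a few numbers. The helper below is a hypothetical base-R sketch, not a CellMixS function; `toy_scores` is made-up data and the tolerance `tol = 0.1` is an arbitrary illustration, not a recommended threshold.

```r
# Hypothetical helper: summarise how far per-cell ldfDiff scores sit from 0.
# `tol` is an arbitrary illustration threshold, not a CellMixS recommendation.
summarise_ldf <- function(scores, tol = 0.1) {
  scores <- scores[!is.na(scores)]   # ignore cells without a score
  c(
    median_score   = median(scores),
    mean_abs_score = mean(abs(scores)),
    frac_within    = mean(abs(scores) <= tol)
  )
}

# Made-up scores for illustration only.
toy_scores <- c(-0.05, 0.02, 0.00, 0.30, -0.40, 0.08, NA)
ldf_summary <- summarise_ldf(toy_scores)
ldf_summary
```

With a real object, the same helper could be applied to the score column that `ldfDiff` adds to `colData` (its exact name depends on `res_name`).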

```R
library(CellMixS)
library(SingleCellExperiment)

# Load example data
sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[["batch20"]]

# Pre-integration data: one SingleCellExperiment per batch, named by batch level
sce_pre_list <- list(
  "1" = sce[, colData(sce)$batch == "1"],
  "2" = sce[, colData(sce)$batch == "2"],
  "3" = sce[, colData(sce)$batch == "3"]
)

# Compare local densities before (PCA) and after (MNN) integration
sce_ldf <- ldfDiff(
  sce_pre_list,
  sce_combined = sce,
  group = "batch",
  k = 70,
  dim_red = "PCA",
  dim_combined = "MNN",
  assay_pre = "counts",
  n_dim = 3,
  res_name = "MNN"
)
```

--------------------------------

### visHist

Source: https://context7.com/almutlue/cellmixs/llms.txt

Plots p-value histograms of metric score distributions to assess overall batch mixing.

```APIDOC
## visHist

### Description
Plot p-value histograms of metric score distributions to assess overall batch mixing. For cms scores (which are p-values from the Anderson-Darling test), a flat histogram indicates random batch mixing (no batch effect), while a peak at low values indicates batch-specific bias.

### Method
(Implicitly a function call within R, not a standard HTTP method)

### Endpoint
(Not applicable, this is an R function)

### Parameters
#### Path Parameters
(None)

#### Query Parameters
(None)

#### Request Body
(None)

### Request Example
```r
library(CellMixS)
library(SingleCellExperiment)

# Load example data and calculate cms
sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[[1]][, c(1:50)]
sce_cms <- cms(sce, group = "batch", k = 20, n_dim = 2)

# Plot histogram of cms scores
visHist(sce_cms)

# Plot multiple metrics side by side
visHist(sce_cms, metric = "cms", n_col = 2)

# Compare cms from different integration methods
# (after running cms with different res_name values)
visHist(sce_cms, metric = "cms.", prefix = TRUE, n_col = 3)

# Plot from matrix/data.frame
cms_mat <- data.frame(
  batch0 = sce_cms$cms,
  batch20 = runif(ncol(sce_cms)),
  batch50 = rbeta(ncol(sce_cms), 0.5, 2)
)
visHist(cms_mat, n_col = 3)
```

### Response
#### Success Response (200)
(Generates a histogram plot)

#### Response Example
(Visual plot output, no specific data structure to show)
```

--------------------------------

### Calculate Local Density Differences (ldfDiff)

Source: https://github.com/almutlue/cellmixs/blob/devel/README.md

Illustrates the calculation of Local Density Differences (ldfDiff) using the `ldfDiff` function. This score evaluates the change in relative local cell densities after data integration. It requires both unaligned and aligned SingleCellExperiment objects, the batch variable, and the number of neighbors (k).

```r
sce_ldf <- ldfDiff(sce_pre_list, sce_combined, group = "batch", k = 70)
```

--------------------------------

### Calculate Local Structure Preservation with locStructure

Source: https://context7.com/almutlue/cellmixs/llms.txt

Computes the local structure metric by comparing k-nearest neighbor overlap before and after data integration. Higher values indicate better preservation of the original local structure.

```R
library(CellMixS)
library(SingleCellExperiment)

# Load example data
sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[["batch20"]][, c(1:50, 300:350)]

# Compare PCA neighbors before integration to MNN neighbors after
sce <- locStructure(
  sce,
  group = "batch",
  dim_combined = "MNN",
  k = 20,
  dim_red = "PCA",
  assay_name = "counts",
  n_dim = 10,
  n_combined = 10
)

# Access overlap scores (higher = better structure preservation)
head(colData(sce)$overlap)
```

--------------------------------

### visCluster

Source: https://context7.com/almutlue/cellmixs/llms.txt

Create summary plots of metric scores grouped by clusters or other categorical variables. Compare metric distributions across celltypes, clusters, or other grouping variables to identify systematic differences.

```APIDOC
## visCluster

### Description
Create summary plots of metric scores grouped by clusters or other categorical variables. Compare metric distributions across celltypes, clusters, or other grouping variables to identify systematic differences.

### Method
`visCluster(sce, cluster_var, metric_var, violin = FALSE)`

### Parameters
#### Path Parameters
None

#### Query Parameters
- **sce** (SingleCellExperiment) - The input SingleCellExperiment object.
- **cluster_var** (character) - The column name in `colData(sce)` to use for grouping.
- **metric_var** (character) - The column name in `colData(sce)` containing the metric scores to plot.
- **violin** (logical, optional) - If TRUE, use violin plots instead of ridge plots.

### Request Example
```r
library(CellMixS)
library(SingleCellExperiment)

sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[[1]][, c(1:30, 300:320)]
sce_cms <- cms(sce, group = "batch", k = 20, n_dim = 2)

# Add celltype labels
sce_cms$celltype <- rep(c("CD4+", "CD8+", "CD3"), length.out = ncol(sce_cms))

# Compare cms scores across celltypes (ridge plot)
visCluster(sce_cms, cluster_var = "celltype", metric_var = "cms")

# Use violin plot
visCluster(sce_cms, cluster_var = "celltype", metric_var = "cms", violin = TRUE)

# Compare across batches
visCluster(sce_cms, cluster_var = "batch", metric_var = "cms_smooth")
```

### Response
Returns a ggplot2 object representing the visualization.

#### Success Response (200)
- **ggplot2 object**: A plot comparing metric scores across different clusters or groups.

#### Response Example
(A ggplot2 object is returned, not a JSON example)
```

--------------------------------

### mixMetric - Calculate Seurat's Mixing Metric

Source: https://context7.com/almutlue/cellmixs/llms.txt

Calculates Seurat's mixing metric, which uses the median rank of the k-th cell from each batch within the k-nearest neighbors. Lower values indicate better mixing.

```APIDOC
## mixMetric

### Description
Calculate Seurat's mixing metric which uses the median rank of the kth cell from each batch within k-nearest neighbors.
The mixing metric takes the median rank of the k_pos neighbor from each batch as an estimation for the data's entropy according to the batch variable. Lower values indicate better mixing.

### Method
`mixMetric`

### Parameters
- **sce** (SingleCellExperiment) - The SingleCellExperiment object.
- **group** (character) - The column name in `colData(sce)` that specifies the batch or group information.
- **k** (integer) - The size of the neighborhood to search for neighbors.
- **k_pos** (integer, optional) - The rank position of the cell within the neighborhood to use for scoring. Defaults to 5.

### Request Example
```r
library(CellMixS)
library(SingleCellExperiment)

# Load example data
sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[[1]][, c(1:15, 400:420, 16:30)]

# Calculate mixing metric
# k: large neighborhood to search
# k_pos: position of cell to use for scoring (default 5)
sce <- mixMetric(sce, group = "batch", k = 300, k_pos = 5)

# Access mixing metric scores (lower = better mixed)
head(colData(sce)$mm)
```

### Response
- **sce** (SingleCellExperiment) - The input SingleCellExperiment object with an added column (named 'mm' by default) in `colData(sce)` containing the calculated mixing metric scores.
```

--------------------------------

### Visualize Metric Scores with visMetric

Source: https://context7.com/almutlue/cellmixs/llms.txt

Maps cell-specific metric scores onto a reduced dimensional plot to identify spatial regions of batch bias within the dataset.

```R
library(CellMixS)
library(SingleCellExperiment)

# Load example data and calculate cms
sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[[1]][, c(1:30, 300:320)]
sce_cms <- cms(sce, group = "batch", k = 20, n_dim = 2)

# Visualize cms scores
visMetric(sce_cms, metric_var = "cms")

# Visualize smoothed cms scores
visMetric(sce_cms, metric_var = "cms_smooth")

# Use -log10 transformation for better visualization of low scores
visMetric(sce_cms, metric_var = "cms", log10_val = TRUE)

# Use different embedding
visMetric(sce_cms, metric_var = "cms", dim_red = "MNN")
```

--------------------------------

### entropy - Calculate Shannon Entropy for Mixing Randomness

Source: https://context7.com/almutlue/cellmixs/llms.txt

Calculates the Shannon entropy of the batch variable within each cell's k-nearest neighborhood. Higher entropy values generally indicate more random mixing.

```APIDOC
## entropy

### Description
Calculate Shannon entropy of the batch variable within each cell's k-nearest neighborhood to measure mixing randomness. For balanced batches, entropy close to 1 indicates high randomness and good mixing. For unbalanced batches, entropy should be interpreted with caution but can work as a relative metric in comparative settings.

### Method
`entropy`

### Parameters
- **sce** (SingleCellExperiment) - The SingleCellExperiment object.
- **group** (character) - The column name in `colData(sce)` that specifies the batch or group information.
- **k** (integer) - The number of nearest neighbors to consider.
- **dim_red** (character, optional) - The dimensionality reduction method to use (e.g., "PCA", "MNN"). Defaults to using the first two dimensions if not specified.
- **n_dim** (integer, optional) - The number of dimensions to use from the specified `dim_red`.
- **res_name** (character, optional) - Custom name for the resulting entropy score column in `colData(sce)`.

### Request Example
```r
library(CellMixS)
library(SingleCellExperiment)

# Load example data
sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[[1]][, c(1:15, 400:420, 16:30)]

# Calculate entropy within k-nearest neighbors
sce <- entropy(sce, group = "batch", k = 20)

# Access entropy scores
head(colData(sce)$entropy)

# Calculate entropy with custom parameters
sce <- entropy(
  sce,
  group = "batch",
  k = 30,
  dim_red = "PCA",
  n_dim = 10,
  res_name = "entropy_k30"
)
```

### Response
- **sce** (SingleCellExperiment) - The input SingleCellExperiment object with an added column (named 'entropy' by default) in `colData(sce)` containing the calculated entropy scores.
```

--------------------------------

### Visualize Grouping Variables with visGroup

Source: https://context7.com/almutlue/cellmixs/llms.txt

Plots batch or cell-type labels in a reduced dimensional space like TSNE. It automatically computes embeddings if they are not already present in the object.

```R
library(CellMixS)
library(SingleCellExperiment)

# Load example data
sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[[1]][, c(1:50, 300:350)]

# Plot batch labels in the default embedding
visGroup(sce, group = "batch")

# Plot batch labels in the MNN embedding
visGroup(sce, group = "batch", dim_red = "MNN")

# Plot another grouping variable (randomly assigned labels for illustration)
sce$celltype <- sample(c("CD4+", "CD8+", "B"), ncol(sce), replace = TRUE)
visGroup(sce, group = "celltype")
```

--------------------------------

### Calculate Shannon Entropy for Batch Mixing

Source: https://context7.com/almutlue/cellmixs/llms.txt

The entropy function calculates the Shannon entropy of the batch variable within each cell's k-nearest neighborhood. This metric measures mixing randomness, where values close to 1 indicate high randomness and good mixing for balanced batches. It can be calculated with custom parameters for neighborhood size and dimensionality reduction.
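
The "close to 1" interpretation can be made concrete with a few lines of base R. This is a sketch of the idea, not the CellMixS implementation: it assumes the score is the Shannon entropy of the batch labels in one neighborhood, divided by the log of the number of batches so that a perfectly balanced neighborhood scores 1. The helper `neighbourhood_entropy` and the toy label vectors are hypothetical.

```r
# Hypothetical helper: normalised Shannon entropy of batch labels in ONE neighbourhood.
# Assumption: dividing by log(n_batches) so that balanced mixing gives a score of 1.
neighbourhood_entropy <- function(labels, n_batches) {
  p <- table(labels) / length(labels)   # observed batch frequencies
  p <- p[p > 0]                         # absent batches contribute nothing
  -sum(p * log(p)) / log(n_batches)
}

mixed  <- rep(c("1", "2"), each = 10)     # 10 neighbours from each of 2 batches
skewed <- c(rep("1", 18), rep("2", 2))    # neighbourhood dominated by batch 1
single <- rep("1", 20)                    # only one batch present

neighbourhood_entropy(mixed,  n_batches = 2)   # 1: perfectly balanced
neighbourhood_entropy(skewed, n_batches = 2)   # about 0.47: partially mixed
neighbourhood_entropy(single, n_batches = 2)   # 0: no mixing
```

The skewed case also hints at why unbalanced batches need care: if batch 2 were rare in the whole dataset, a 18:2 neighborhood could reflect good mixing even though its entropy is well below 1.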

```r
library(CellMixS)
library(SingleCellExperiment)

# Load example data
sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[[1]][, c(1:15, 400:420, 16:30)]

# Calculate entropy within k-nearest neighbors
sce <- entropy(sce, group = "batch", k = 20)

# Access entropy scores
head(colData(sce)$entropy)

# Calculate entropy with custom parameters
sce <- entropy(
  sce,
  group = "batch",
  k = 30,
  dim_red = "PCA",
  n_dim = 10,
  res_name = "entropy_k30"
)
```

--------------------------------

### Calculate Seurat's Mixing Metric

Source: https://context7.com/almutlue/cellmixs/llms.txt

The mixMetric function calculates Seurat's mixing metric, which uses the median rank of the kth cell from each batch within k-nearest neighbors. This metric estimates the data's entropy according to the batch variable, with lower values indicating better mixing. It requires specifying the neighborhood size (k) and the position of the cell to use for scoring (k_pos).

```r
library(CellMixS)
library(SingleCellExperiment)

# Load example data
sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[[1]][, c(1:15, 400:420, 16:30)]

# Calculate mixing metric
# k: large neighborhood to search
# k_pos: position of cell to use for scoring (default 5)
sce <- mixMetric(sce, group = "batch", k = 300, k_pos = 5)

# Access mixing metric scores (lower = better mixed)
head(colData(sce)$mm)
```

--------------------------------

### isi - Calculate Inverse Simpson Index for Batch Mixing

Source: https://context7.com/almutlue/cellmixs/llms.txt

Calculates the inverse Simpson index of the batch variable within each cell's k-nearest neighborhood. This metric represents the effective number of batches in a neighborhood, with higher values indicating better mixing.

```APIDOC
## isi

### Description
Calculate the inverse Simpson index of the batch variable within each cell's k-nearest neighborhood to measure effective number of batches.
The inverse Simpson index represents the effective number of batches in a neighborhood. This metric was proposed by Korsunsky et al. and can optionally use distance-based weighting where batch probabilities are weighted by the mean distance of their cells towards the cell of interest.

### Method
`isi`

### Parameters
- **sce** (SingleCellExperiment) - The SingleCellExperiment object.
- **group** (character) - The column name in `colData(sce)` that specifies the batch or group information.
- **k** (integer) - The number of nearest neighbors to consider.
- **weight** (logical, optional) - Whether to use distance-based weighting. Defaults to TRUE.
- **res_name** (character, optional) - Custom name for the resulting inverse Simpson index column in `colData(sce)`.

### Request Example
```r
library(CellMixS)
library(SingleCellExperiment)

# Load example data
sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[[1]][, c(1:15, 400:420, 16:30)]

# Calculate inverse Simpson index with distance weighting (default)
sce <- isi(sce, group = "batch", k = 20, weight = TRUE)

# Calculate without distance weighting
sce <- isi(sce, group = "batch", k = 20, weight = FALSE, res_name = "isi_unweighted")

# Access isi scores
head(colData(sce)[, c("isi", "isi_unweighted")])
```

### Response
- **sce** (SingleCellExperiment) - The input SingleCellExperiment object with an added column (named 'isi' by default) in `colData(sce)` containing the calculated inverse Simpson index scores.
```

--------------------------------

### visMetric

Source: https://context7.com/almutlue/cellmixs/llms.txt

Plots metric scores in a reduced dimensional representation.

```APIDOC
## visMetric

### Description
Plot metric scores in a reduced dimensional representation. Visualize cell-specific metric scores (cms, entropy, etc.) using dimensionality reduction to identify regions with batch bias.

### Method
(Implicitly a function call within R, not a standard HTTP method)

### Endpoint
(Not applicable, this is an R function)

### Parameters
#### Path Parameters
(None)

#### Query Parameters
(None)

#### Request Body
(None)

### Request Example
```r
library(CellMixS)
library(SingleCellExperiment)

# Load example data and calculate cms
sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[[1]][, c(1:30, 300:320)]
sce_cms <- cms(sce, group = "batch", k = 20, n_dim = 2)

# Visualize cms scores
visMetric(sce_cms, metric_var = "cms")

# Visualize smoothed cms scores
visMetric(sce_cms, metric_var = "cms_smooth")

# Use -log10 transformation for better visualization of low scores
visMetric(sce_cms, metric_var = "cms", log10_val = TRUE)

# Use different embedding
visMetric(sce_cms, metric_var = "cms", dim_red = "MNN")
```

### Response
#### Success Response (200)
(Generates a scatter plot visualization)

#### Response Example
(Visual plot output, no specific data structure to show)
```

--------------------------------

### Calculate Local Density Factor Differences

Source: https://context7.com/almutlue/cellmixs/llms.txt

Computes Local Density Factor (LDF) differences for a specific group element within a dataset, providing a granular look at integration performance.
```r
library(CellMixS)
library(SingleCellExperiment)

# Load example data
sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[["batch20"]][, c(1:50, 300:350)]

# Prepare pre-integration data
sce_batch1 <- sce[, colData(sce)$batch == "1"]
sce_pre_list <- list("1" = sce_batch1)

# Calculate ldfDiff for batch "1" only
ldf_1 <- ldfSce(
  sce_name = "1",
  sce_pre_list = sce_pre_list,
  sce_combined = sce,
  group = "batch",
  k = 10,
  dim_combined = "MNN",
  n_dim = 5
)

# Result is a data.frame with diff_ldf column
head(ldf_1)
```

--------------------------------

### ldfSce

Source: https://context7.com/almutlue/cellmixs/llms.txt

Calculate Local Density Factor differences for a specific batch/group within the dataset. This is a lower-level function called by ldfDiff.

```APIDOC
## ldfSce

### Description
Calculate Local Density Factor differences for a specific batch/group within the dataset. Lower-level function called by ldfDiff that computes LDF differences for one specific group element.

### Method
`ldfSce(sce_name, sce_pre_list, sce_combined, group, k, dim_combined, n_dim)`

### Parameters
#### Path Parameters
None

#### Query Parameters
- **sce_name** (character) - The name of the specific group element for which to calculate LDF differences.
- **sce_pre_list** (list) - A list where names are group elements and values are SingleCellExperiment objects representing the pre-integration data for that group.
- **sce_combined** (SingleCellExperiment) - The combined SingleCellExperiment object after integration.
- **group** (character) - The column name in `colData(sce)` that defines the groups.
- **k** (integer) - The number of nearest neighbors to consider for LDF calculation.
- **dim_combined** (character) - The dimensionality reduction method used for the combined data (e.g., "MNN", "PCA").
- **n_dim** (integer) - The number of dimensions used in the dimensionality reduction.
### Request Example
```r
library(CellMixS)
library(SingleCellExperiment)

sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[["batch20"]][, c(1:50, 300:350)]

# Prepare pre-integration data
sce_batch1 <- sce[, colData(sce)$batch == "1"]
sce_pre_list <- list("1" = sce_batch1)

# Calculate ldfDiff for batch "1" only
ldf_1 <- ldfSce(
  sce_name = "1",
  sce_pre_list = sce_pre_list,
  sce_combined = sce,
  group = "batch",
  k = 10,
  dim_combined = "MNN",
  n_dim = 5
)

# Result is a data.frame with diff_ldf column
head(ldf_1)
```

### Response
Returns a data frame containing the Local Density Factor differences.

#### Success Response (200)
- **data.frame**: A data frame with a column named `diff_ldf` containing the calculated LDF differences for the specified group.

#### Response Example
```json
{
  "diff_ldf": [ 0.0234, -0.1234, 0.0567 ]
}
```
```

--------------------------------

### Calculate Cell-specific Mixing Score (CMS)

Source: https://context7.com/almutlue/cellmixs/llms.txt

Compute CMS to evaluate batch mixing using the Anderson-Darling test. This function identifies batch-specific bias by comparing the batch-wise distance distributions among each cell's k-nearest neighbors.

```R
library(CellMixS)
library(SingleCellExperiment)

sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[["batch50"]]

# Calculate cell-specific mixing score
sce_cms <- cms(sce, k = 30, group = "batch", n_dim = 3, cell_min = 4)

# Access results
head(colData(sce_cms)[, c("cms", "cms_smooth")])
```

--------------------------------

### visGroup

Source: https://context7.com/almutlue/cellmixs/llms.txt

Plots group/batch labels in a reduced dimensional representation (TSNE by default).

```APIDOC
## visGroup

### Description
Plot group/batch labels in a reduced dimensional representation (TSNE by default). Visualize batch distribution in the dataset using dimensionality reduction. TSNE embeddings are automatically computed if not present.
### Method
(Implicitly a function call within R, not a standard HTTP method)

### Endpoint
(Not applicable, this is an R function)

### Parameters
#### Path Parameters
(None)

#### Query Parameters
(None)

#### Request Body
(None)

### Request Example
```r
library(CellMixS)
library(SingleCellExperiment)

# Load example data
sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[[1]][, c(1:50, 300:350)]

# Visualize batch distribution using TSNE (computed automatically)
visGroup(sce, group = "batch")

# Use existing embedding (e.g., MNN integrated space)
visGroup(sce, group = "batch", dim_red = "MNN")

# Visualize other grouping variables
sce$celltype <- sample(c("CD4+", "CD8+", "B"), ncol(sce), replace = TRUE)
visGroup(sce, group = "celltype")
```

### Response
#### Success Response (200)
(Generates a scatter plot visualization)

#### Response Example
(Visual plot output, no specific data structure to show)
```

--------------------------------

### Calculate Inverse Simpson Index for Batch Mixing

Source: https://context7.com/almutlue/cellmixs/llms.txt

The isi function calculates the inverse Simpson index of the batch variable within each cell's k-nearest neighborhood, representing the effective number of batches. It can optionally use distance-based weighting, where batch probabilities are weighted by the mean distance of their cells towards the cell of interest. This metric helps quantify batch mixing.
```r
library(CellMixS)
library(SingleCellExperiment)

# Load example data
sim_list <- readRDS(system.file("extdata/sim50.rds", package = "CellMixS"))
sce <- sim_list[[1]][, c(1:15, 400:420, 16:30)]

# Calculate inverse Simpson index with distance weighting (default)
sce <- isi(sce, group = "batch", k = 20, weight = TRUE)

# Calculate without distance weighting
sce <- isi(sce, group = "batch", k = 20, weight = FALSE, res_name = "isi_unweighted")

# Access isi scores
head(colData(sce)[, c("isi", "isi_unweighted")])
```
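To make "effective number of batches" concrete, the following is a minimal base-R sketch of the unweighted inverse Simpson index for a single neighborhood, computed as 1 / sum(p_b^2) over the batch proportions p_b. The helper `inverse_simpson` and the toy label vectors are hypothetical illustrations, not part of CellMixS, and the sketch does not reproduce the package's distance-based weighting.

```r
# Hypothetical helper (not a CellMixS function): unweighted inverse Simpson
# index of the batch labels found among one cell's nearest neighbors.
inverse_simpson <- function(batch_labels) {
  # Proportion of neighbors belonging to each batch
  p <- table(batch_labels) / length(batch_labels)
  # Inverse Simpson index: 1 / sum of squared proportions
  1 / sum(p^2)
}

# Toy neighborhoods with made-up labels (not CellMixS output)
mixed  <- c("1", "2", "1", "2", "1", "2")  # two batches, evenly mixed
skewed <- c("1", "1", "1", "1", "1", "2")  # dominated by batch "1"
single <- rep("1", 6)                      # one batch only

inverse_simpson(mixed)   # 2: both batches are fully represented
inverse_simpson(skewed)  # 36/26, roughly 1.38
inverse_simpson(single)  # 1: no mixing at all
```

Under this definition the score ranges from 1 (a single batch in the neighborhood) up to the number of batches present (perfectly even mixing), which is why higher `isi` values are read as better mixing.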