Compute sample-level spillover summary scores
Source: R/questions.R
summarize_spillover_per_sample.Rd
Aggregates expression data using gene-level spillover weights. Accepts either a single matrix/vector or lists of both for timecourse-style analysis (e.g., paired with `spillover_timecourse()`).
Usage
summarize_spillover_per_sample(
expr,
spillover,
summary_fn = NULL,
normalize_scores = FALSE,
genes_in_rows = FALSE,
abs_expression = FALSE,
verbose = TRUE
)
Arguments
- expr
Either an expression matrix/data.frame of genes and samples (orientation is given by `genes_in_rows`) or a list of such matrices.
- spillover
Either a named numeric vector of spillover scores or a list of such vectors.
- summary_fn
Optional custom function: takes two arguments (expr, spillover) and returns a scalar. Default is a weighted sum of expr × spillover.
- normalize_scores
Logical. Whether to normalize spillover scores, i.e., return them as Z-scores across samples (default = FALSE).
- genes_in_rows
Logical. TRUE if rows are genes and columns are samples (default = FALSE).
- abs_expression
Logical. If TRUE, take absolute value of expression before summarizing (default = FALSE).
- verbose
Logical. Whether to print progress and diagnostics (default = TRUE).
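To illustrate the `summary_fn` contract described in the arguments above, here is a minimal sketch of a custom summary function. It assumes the function is called once per sample with a numeric expression vector and a spillover vector aligned to the same genes; that calling convention is an assumption and is not confirmed by this page. The name `weighted_mean_fn` and the toy data are hypothetical.

```r
# Hypothetical custom summary: a spillover-weighted mean instead of a weighted sum.
# Assumption: `expr` is one sample's expression vector and `spillover` is a
# numeric vector aligned to the same genes.
weighted_mean_fn <- function(expr, spillover) {
  w <- abs(spillover)
  if (sum(w) == 0) return(NA_real_)  # no weight to distribute
  sum(expr * w) / sum(w)             # returns a single number (scalar)
}

# Toy data for a single sample (made up for illustration)
expr_one_sample <- c(geneA = 2, geneB = 4, geneC = 6)
spill <- c(geneA = 0.5, geneB = 0, geneC = 1.5)

weighted_mean_fn(expr_one_sample, spill)  # (2 * 0.5 + 6 * 1.5) / 2 = 5
```

Under that assumption, such a function would be supplied as `summary_fn = weighted_mean_fn`.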
Details
**Interpreting Per-Sample Spillover Scores**
This function computes a summary score for each sample, reflecting how strongly the sample expresses genes that were influenced by network-based spillover from upstream signals (e.g., hub genes or initiators).
The default scoring function calculates a weighted sum:
$$ \text{score}_s = \sum_{g \in G} \text{expression}_{gs} \times \text{spillover}_g $$
where:
- \(G\) is the set of genes with non-zero spillover
- \(\text{expression}_{gs}\) is the expression of gene *g* in sample *s*
- \(\text{spillover}_g\) is the spillover score assigned to gene *g* at that timepoint
This score can be interpreted as the degree to which a sample expresses the genes that received signal via network propagation. Higher values suggest a stronger manifestation of upstream network activity.
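The default weighted sum can be reproduced by hand in base R. The sketch below is not the package's implementation; it assumes a toy matrix with genes in rows and samples in columns, and all object names and values are made up for illustration.

```r
# Toy expression matrix: genes in rows, samples in columns (illustrative values)
expr <- matrix(
  c(1, 2,
    3, 0,
    5, 4),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2"))
)
spillover <- c(geneA = 0.5, geneB = 0, geneC = 2)

# G: genes with non-zero spillover that are also present in the matrix
genes <- intersect(names(spillover)[spillover != 0], rownames(expr))

# score_s = sum over g in G of expression[g, s] * spillover[g]
# (the weight vector is recycled down each column, so row g is scaled by spillover[g])
scores <- colSums(expr[genes, , drop = FALSE] * spillover[genes])
scores  # s1 = 10.5, s2 = 9
```

Here geneB drops out of \(G\) because its spillover is zero, so only geneA and geneC contribute to each sample's score.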
**Z-score Option (`normalize_scores = TRUE`):** When this option is enabled, the resulting spillover scores are standardized across samples:
$$ Z_s = \frac{\text{score}_s - \mu}{\sigma} $$
where \(\mu\) and \(\sigma\) are the mean and standard deviation of spillover scores across all samples. Z-scores are useful for comparing samples across different datasets or timepoints, or when integrating with other predictors.
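As a worked illustration of the standardization formula, the following sketch converts a hypothetical vector of per-sample scores to Z-scores. It uses base R's `sd()`, which applies the sample (n - 1) denominator; whether the function itself uses that denominator is an assumption.

```r
# Hypothetical per-sample spillover scores
scores <- c(s1 = 10.5, s2 = 9, s3 = 12)

mu <- mean(scores)       # mean across samples: 10.5
sigma <- sd(scores)      # sample standard deviation: 1.5

z <- (scores - mu) / sigma
z  # s1 = 0, s2 = -1, s3 = 1
```

After standardization the scores have mean 0 and standard deviation 1, which is what makes them comparable across datasets or timepoints with different raw scales.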
**Absolute Expression Option (`abs_expression = TRUE`):** This flag takes the absolute value of expression values before applying the spillover weights. This helps avoid signal cancellation if the expression data has been centered, scaled, or otherwise contains negative values. It is especially useful when the presence or magnitude of expression is of interest, rather than its direction.
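The cancellation effect is easiest to see on centered data. The sketch below is a hand computation in base R, not the package's code path; it assumes genes in rows, equal spillover weights, and made-up values.

```r
# Centered toy expression: each gene has mean 0 across samples
expr_centered <- matrix(
  c( 1, -1,
    -1,  1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"), c("s1", "s2"))
)
spillover <- c(geneA = 1, geneB = 1)
w <- spillover[rownames(expr_centered)]  # align weights to matrix rows

# Signed expression: positive and negative values cancel within each sample
signed_scores <- colSums(expr_centered * w)
signed_scores  # s1 = 0, s2 = 0

# Absolute expression: magnitude is kept, direction is discarded
abs_scores <- colSums(abs(expr_centered) * w)
abs_scores  # s1 = 2, s2 = 2
```

Both samples deviate from the mean by the same amount, but only the absolute-value version reflects that; the trade-off is that up- and down-regulation can no longer be told apart.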