Aggregates expression data using gene-level spillover weights. Accepts either a single matrix/vector or lists of both for timecourse-style analysis (e.g., paired with `spillover_timecourse()`).

Usage

summarize_spillover_per_sample(
  expr,
  spillover,
  summary_fn = NULL,
  normalize_scores = FALSE,
  genes_in_rows = FALSE,
  abs_expression = FALSE,
  verbose = TRUE
)

Arguments

expr

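Either a matrix/data.frame of expression values or a list of such matrices. Set `genes_in_rows = TRUE` for genes × samples input; the default (`genes_in_rows = FALSE`) expects samples in rows and genes in columns.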

spillover

Either a named numeric vector of spillover scores or a list of such vectors.

summary_fn

Optional custom function: takes two arguments (expr, spillover) and returns a scalar. Default is a weighted sum of expr × spillover.

normalize_scores

Logical. Whether to standardize the per-sample spillover scores across samples, i.e., return Z-scores (default = FALSE).

genes_in_rows

Logical. TRUE if rows are genes and columns are samples (default = FALSE).

abs_expression

Logical. If TRUE, take absolute value of expression before summarizing (default = FALSE).

verbose

Logical. Whether to print progress and diagnostics (default = TRUE).

Details

**Interpreting Per-Sample Spillover Scores**

This function computes a summary score for each sample, reflecting how strongly the sample expresses genes that were influenced by network-based spillover from upstream signals (e.g., hub genes or initiators).

The default scoring function calculates a weighted sum:

$$ \text{score}_s = \sum_{g \in G} \text{expression}_{gs} \times \text{spillover}_g $$

where:

- \(G\) is the set of genes with non-zero spillover
- \(\text{expression}_{gs}\) is the expression of gene *g* in sample *s*
- \(\text{spillover}_g\) is the spillover score assigned to gene *g* at that timepoint

This score can be interpreted as the degree to which a sample expresses the genes that received signal via network propagation. Higher values suggest a stronger manifestation of upstream network activity.
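The weighted sum above can be sketched in base R as a matrix-vector product. This is a minimal illustration, not the package's internal code: the toy matrix, the gene and sample names, and the samples-in-rows orientation (matching `genes_in_rows = FALSE`) are all assumptions made for the example.

```r
# Toy data: 3 samples (rows) x 4 genes (columns); values are illustrative only.
expr <- matrix(
  c(1, 2, 0, 3,
    0, 1, 4, 1,
    2, 2, 2, 2),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("s1", "s2", "s3"), c("gA", "gB", "gC", "gD"))
)

# Named spillover weights; gD has no weight, so it falls outside the set G.
spillover <- c(gA = 0.5, gB = 1.0, gC = 0.25)

# Align on shared gene names and keep only genes with non-zero spillover (G).
genes <- intersect(colnames(expr), names(spillover)[spillover != 0])

# score_s = sum over g in G of expression_gs * spillover_g
scores <- drop(expr[, genes, drop = FALSE] %*% spillover[genes])
scores
```

Aligning by gene name before multiplying avoids silently pairing a weight with the wrong gene when the two inputs are ordered differently.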

**Z-score Option (`normalize_scores = TRUE`):** When this is enabled, the resulting spillover scores are standardized across samples:

$$ Z_s = \frac{\text{score}_s - \mu}{\sigma} $$

where \(\mu\) and \(\sigma\) are the mean and standard deviation of spillover scores across all samples. Z-scores are useful for comparing samples across different datasets or timepoints, or when integrating with other predictors.
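As a rough sketch of the standardization above, the same transformation can be written in base R. The score values are hypothetical, and the use of `sd()` (which divides by n - 1) is an assumption; the text does not say which denominator the function uses.

```r
# Hypothetical raw per-sample scores (illustrative values only).
scores <- c(s1 = 2.5, s2 = 2.0, s3 = 3.5)

# Z_s = (score_s - mu) / sigma, with mu and sigma taken across samples.
# sd() uses the sample (n - 1) denominator; this choice is an assumption.
z <- (scores - mean(scores)) / sd(scores)
z
```

After this step the scores have mean 0 and unit standard deviation within the set of samples they were computed on, so the Z-scores of one dataset or timepoint are relative to that set only.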

**Absolute Expression Option (`abs_expression = TRUE`):** This flag takes the absolute value of expression values before applying the spillover weights. This helps avoid signal cancellation if the expression data has been centered, scaled, or otherwise contains negative values. It is especially useful when the presence or magnitude of expression is of interest, rather than its direction.
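The cancellation effect described above is easiest to see with a small, made-up example. The values and gene names below are illustrative assumptions; the two sums mirror the default weighted sum with and without taking absolute values first.

```r
# Centered expression for one sample, so values can be negative (illustrative).
expr_s    <- c(gA = 1.5, gB = -1.5, gC = 0.5)
spillover <- c(gA = 1,   gB = 1,    gC = 1)

# Signed weighted sum: gA and gB cancel each other out.
signed_score <- sum(expr_s * spillover)

# Absolute-value weighted sum: magnitudes accumulate regardless of direction.
abs_score <- sum(abs(expr_s) * spillover)

c(signed = signed_score, absolute = abs_score)
```

Here the signed score is 0.5 while the absolute score is 3.5, even though two of the three genes deviate strongly from their mean.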

Value

A named numeric vector of per-sample scores (for a single matrix/vector input), or a named list of such vectors, one per timepoint (for list input).

Examples

if (FALSE) { # \dontrun{
# Single timepoint
scores <- summarize_spillover_per_sample(expr, spillover_vec)

# Timecourse
scores_list <- summarize_spillover_per_sample(expr_list, spillover_list)
} # }
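A custom `summary_fn` takes `(expr, spillover)` and returns a scalar. The sketch below shows one possible choice, a spillover-weighted mean. It assumes the function receives one sample's expression values as a numeric vector aligned to the spillover vector; that calling convention is an assumption and is not confirmed by the argument description. The name `weighted_mean_fn` is hypothetical.

```r
# Hypothetical custom summary: spillover-weighted mean instead of weighted sum.
# Assumes `expr` is one sample's expression vector aligned to `spillover`.
weighted_mean_fn <- function(expr, spillover) {
  total_weight <- sum(spillover)
  if (total_weight == 0) return(NA_real_)  # no weight to average over
  sum(expr * spillover) / total_weight
}

# Standalone check on toy values (illustrative only).
result <- weighted_mean_fn(c(1, 2, 0), c(0.5, 1.0, 0.25))
result

# Not run: passing it to the function, using the documented argument name.
# scores <- summarize_spillover_per_sample(expr, spillover_vec,
#                                          summary_fn = weighted_mean_fn)
```

Dividing by the total weight makes scores less sensitive to how many genes carry spillover, which may matter when timepoints differ in the size of the affected gene set.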