
This function calculates a spillover summary score for each sample, grouped by gene module, across a timecourse of expression matrices and spillover vectors. It supports optional absolute-value transformation of expression and z-score normalization of the final result.

Usage

summarize_spillover_by_sample_and_module(
  expr_list,
  spillover_list,
  module_membership_list,
  summary_func = NULL,
  genes_in_rows = FALSE,
  abs_expression = FALSE,
  normalize_scores = FALSE,
  verbose = TRUE
)

Arguments

expr_list

A list of gene expression matrices, one per timepoint (samples × genes by default; see `genes_in_rows`). Must be in the same order as `spillover_list`.

spillover_list

A list of named numeric vectors of spillover scores, one per timepoint. Names must match gene names in expression and module inputs.

module_membership_list

A list of named vectors mapping genes to modules (character or factor), one per timepoint.

summary_func

Optional. A function to compute the summary statistic. If NULL (the default), the score is sum(expression × spillover). The function must accept two numeric vectors, expression and spillover, in that order.

genes_in_rows

Logical. TRUE if rows are genes and columns are samples (default = FALSE).

abs_expression

Logical. If TRUE, take absolute value of expression before applying `summary_func` (default = FALSE).

normalize_scores

Logical. If TRUE, z-score standardize the summary scores across samples, separately for each module (default = FALSE).

verbose

Logical. If TRUE, print progress messages (default = TRUE).

Value

A list of data frames, one per timepoint. Each data frame is samples × modules, with spillover summary scores.

Details

This enables downstream analyses such as:

- Tracing how module-specific signal propagates through time
- Associating spillover magnitude with phenotypes
- Contrasting signal distribution across modules and samples

**Interpreting Spillover Summary Values**

This function calculates a summary score for each sample that reflects the *weighted influence* of genes receiving spillover signal, aggregated within each module. The default score is computed as:

$$ \text{score}_{is} = \sum_{g \in M_i} \text{expression}_{sg} \times \text{spillover}_g $$

where:

- \(M_i\) is the set of genes in module *i*
- \(\text{expression}_{sg}\) is the expression of gene *g* in sample *s*
- \(\text{spillover}_g\) is the spillover score assigned to gene *g* at that timepoint

The result reflects how strongly a sample expresses genes that were targeted by spillover, and thus may indicate whether the biological effects of an upstream perturbation are active in that sample.
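The default score can be reproduced by hand for a single timepoint, which may help when checking results. The following is a minimal sketch using base R only; the toy objects `expr`, `spillover`, and `modules` are hypothetical, and the code illustrates the formula above rather than the function's internal implementation.

```r
# Hypothetical toy inputs: 3 samples x 4 genes, samples in rows
expr <- matrix(
  c(1, 2, 0, 1,
    0, 1, 3, 2,
    2, 0, 1, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("s", 1:3), paste0("g", 1:4))
)
spillover <- c(g1 = 0.5, g2 = 1.0, g3 = 0.0, g4 = 2.0)
modules   <- c(g1 = "A", g2 = "A", g3 = "B", g4 = "B")

# Multiply each gene column by that gene's spillover score
weighted <- sweep(expr, 2, spillover[colnames(expr)], `*`)

# Sum the weighted values over the genes of each module, per sample
module_levels <- unique(modules)
scores <- sapply(module_levels, function(m) {
  rowSums(weighted[, names(modules)[modules == m], drop = FALSE])
})

scores
# Module A: s1 = 2.5, s2 = 1, s3 = 1
# Module B: s1 = 2,   s2 = 4, s3 = 2
```

Here `g3` contributes nothing to module B because its spillover score is zero, so only `g4` drives that module's score.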

**Z-score Option (`normalize_scores = TRUE`):** When enabled, scores are standardized *per module* (i.e., column-wise) across all samples:

$$ Z_{is} = \frac{\text{score}_{is} - \mu_i}{\sigma_i} $$

where \(\mu_i\) and \(\sigma_i\) are the mean and standard deviation of scores for module *i* across samples. Z-scoring is useful when comparing modules with different dynamic ranges, or when integrating scores across timepoints.
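As an illustration of the column-wise standardization, the sketch below applies base R's `scale()` to a hypothetical samples × modules score matrix. It assumes the sample standard deviation (n - 1 denominator), which is what `scale()` and `sd()` use; whether the function itself uses the same estimator is an assumption here.

```r
# Hypothetical samples x modules score matrix
scores <- matrix(
  c(2.5, 2,
    1.0, 4,
    1.0, 2),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("s", 1:3), c("A", "B"))
)

# scale() centers and scales each column (module) across rows (samples)
z <- scale(scores)

# The same result written out explicitly from the formula
mu       <- colMeans(scores)
sigma    <- apply(scores, 2, sd)
z_manual <- sweep(sweep(scores, 2, mu, `-`), 2, sigma, `/`)

# After standardization each module has mean 0 and standard deviation 1
colMeans(z)
apply(z, 2, sd)
```

Note that a module whose scores are identical in every sample has a standard deviation of zero, so this calculation would yield `NaN` for that column.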

**Absolute Expression Option (`abs_expression = TRUE`):** When this option is set, gene expression values are converted to absolute values before weighting. This can help avoid cancellation effects if the expression matrix has been centered or includes negative values (e.g., z-scores or residuals). Use this option when the magnitude of expression matters rather than its sign, for example when asking whether a module is perturbed at all rather than whether it is up- or down-regulated.
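The cancellation effect is easy to see on a small case. The sketch below uses hypothetical centered expression values for one sample and equal spillover weights; it only illustrates the arithmetic and does not call the function.

```r
# Hypothetical centered expression for one sample across four module genes
expr_centered <- c(g1 = -1.5, g2 = 1.5, g3 = -0.5, g4 = 0.5)
spillover     <- c(g1 = 1, g2 = 1, g3 = 1, g4 = 1)

# Signed values: positive and negative terms cancel
signed_score <- sum(expr_centered * spillover)

# Absolute values: magnitudes accumulate instead of cancelling
absolute_score <- sum(abs(expr_centered) * spillover)

signed_score    # 0
absolute_score  # 4
```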

Examples

if (FALSE) { # \dontrun{
# Basic usage with z-score
module_scores <- summarize_spillover_by_sample_and_module(expr_list, spillover_list, module_list,
                   normalize_scores = TRUE)

# Use absolute expression with custom summary
module_scores <- summarize_spillover_by_sample_and_module(expr_list, spillover_list, module_list,
                   abs_expression = TRUE,
                   summary_func = function(expr, spill) mean(expr * spill))
} # }
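
A custom `summary_func` can be tried on plain vectors before passing it to the function. The sketch below assumes the function is called with two equal-length numeric vectors (expression first, spillover second) and should return a single number; the helper names `default_like` and `weighted_mean` are hypothetical.

```r
# Mirrors the documented default: sum(expression x spillover)
default_like <- function(expr, spill) sum(expr * spill)

# Alternative: spillover-weighted mean, so module size matters less
weighted_mean <- function(expr, spill) {
  total_weight <- sum(abs(spill))
  if (total_weight == 0) return(NA_real_)  # no spillover in this module
  sum(expr * spill) / total_weight
}

# Hypothetical values for one sample and one module
expr  <- c(2, 4, 6)
spill <- c(1, 0.5, 0.5)

default_like(expr, spill)   # 2 + 2 + 3 = 7
weighted_mean(expr, spill)  # 7 / 2 = 3.5
```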