
Extract top marker genes from ranked marker tables
Source: R/method-L3_subcluster.R
Generates a list of marker genes per cluster using ranked marker tables
produced by rank_cluster_markers.
Usage
extract_top_markers(
  sce = NULL,
  ranked_markers_key = "broad_cluster_markers",
  ranked_markers = NULL,
  fdr_threshold = 0.05,
  effect_threshold = 0.6,
  target_n = 100L
)
Arguments
- sce
A SingleCellExperiment. Required only when ranked_markers is not supplied.
- ranked_markers_key
Character scalar. Metadata entry containing ranked marker tables. Defaults to "broad_cluster_markers".
- ranked_markers
List or NULL. Ranked marker result returned by rank_cluster_markers(return_list = TRUE). If supplied, this takes precedence over sce and ranked_markers_key.
- fdr_threshold
Numeric scalar. Maximum FDR threshold. Defaults to 0.05.
- effect_threshold
Numeric scalar. Minimum effect size threshold. Defaults to 0.6.
- target_n
Integer scalar. Number of genes to return per cluster. Defaults to 100L.
Value
A named list with one entry per cluster, each containing:
- top_n
Character vector of up to target_n genes: passing genes first, followed by any backfill genes.
- supplemented
Character vector of the backfill genes that did not meet the FDR/effect thresholds. Empty character vector if all target_n slots were filled by passing genes.
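To illustrate the return shape described above, the following sketch builds a mock result by hand and pulls out the pieces a downstream step would typically need. The cluster names and gene symbols are invented for illustration, and nothing here calls the package itself.

```r
# Hand-written mock of the documented return shape; cluster names and
# gene symbols are made up for illustration.
res <- list(
  cluster_1 = list(top_n = c("CD3E", "CD2", "IL7R"), supplemented = "IL7R"),
  cluster_2 = list(top_n = c("MS4A1", "CD79A"), supplemented = character(0))
)

# Per-cluster gene sets, e.g. as input to an enrichment tool.
gene_sets <- lapply(res, `[[`, "top_n")

# Number of backfilled genes per cluster.
n_backfill <- vapply(res, function(x) length(x$supplemented), integer(1))

# Clusters whose lists rely on backfill and may warrant closer inspection.
needs_review <- names(n_backfill)[n_backfill > 0L]
```

Checking supplemented before running enrichment is a cheap way to see which clusters had too few high-confidence markers to fill their list.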
Details
Ranked marker tables may be supplied either:
- directly via the ranked_markers argument, or
- indirectly via metadata(sce)[[ranked_markers_key]].
If ranked_markers is supplied, it takes precedence and sce is
only used for class consistency. In that case, ranked_markers_key is
ignored.
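The precedence rule above can be sketched as a small lookup helper. This is not the package's code: resolve_ranked_markers is a hypothetical name, and a plain list stands in for metadata(sce) so the example runs without a SingleCellExperiment.

```r
# Hypothetical helper mirroring the documented precedence. `meta` is a plain
# list standing in for metadata(sce), so no SingleCellExperiment is needed.
resolve_ranked_markers <- function(meta = NULL,
                                   ranked_markers_key = "broad_cluster_markers",
                                   ranked_markers = NULL) {
  # A directly supplied list wins; the key is then ignored.
  if (!is.null(ranked_markers)) {
    return(ranked_markers)
  }
  # Otherwise fall back to the metadata entry named by the key.
  found <- meta[[ranked_markers_key]]
  if (is.null(found)) {
    stop("No ranked markers found under key '", ranked_markers_key, "'.")
  }
  found
}

meta <- list(broad_cluster_markers = list(cluster_1 = "from metadata"))
direct <- list(cluster_1 = "supplied directly")

from_meta <- resolve_ranked_markers(meta = meta)
from_arg <- resolve_ranked_markers(meta = meta, ranked_markers = direct)
```

The error raised when neither source is available is an assumption of this sketch; the documentation does not state how the real function behaves in that case.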
Genes are first ordered by:
- ascending FDR, and
- descending effect size (AUC for Wilcoxon, logFC for t-tests).
Marker genes that pass both thresholds are prioritised:
- FDR <= fdr_threshold, and
- effect size >= effect_threshold.
The final output for each cluster consists of up to target_n genes:
passing genes are included first, then any remaining slots are filled
with the top-ranked genes that did not pass the thresholds.
This ensures a consistent gene set size across clusters while prioritising high-confidence markers for downstream enrichment analysis.
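The ordering, thresholding, and backfill steps described above can be sketched in base R for a single cluster. This is a minimal illustration, not the package's implementation: select_top is a hypothetical helper, the column names gene, FDR, and effect are assumptions about the table layout, and the values are made up.

```r
# Hypothetical ranked marker table for one cluster; the column names
# (gene, FDR, effect) are assumptions, not the package's actual schema.
markers <- data.frame(
  gene   = c("G1", "G2", "G3", "G4", "G5"),
  FDR    = c(0.20, 0.001, 0.01, 0.03, 0.001),
  effect = c(0.90, 0.70, 0.80, 0.40, 0.95),
  stringsAsFactors = FALSE
)

select_top <- function(tab, fdr_threshold = 0.05, effect_threshold = 0.6,
                       target_n = 100L) {
  # Order by ascending FDR, breaking ties by descending effect size.
  tab <- tab[order(tab$FDR, -tab$effect), , drop = FALSE]
  passing <- tab$FDR <= fdr_threshold & tab$effect >= effect_threshold
  # Passing genes first, then non-passing genes in ranked order as backfill.
  ranked <- c(tab$gene[passing], tab$gene[!passing])
  top_n <- head(ranked, target_n)
  # Backfill genes are those in top_n that did not pass the thresholds.
  list(top_n = top_n, supplemented = setdiff(top_n, tab$gene[passing]))
}

res <- select_top(markers, target_n = 4L)
res$top_n         # "G5" "G2" "G3" "G4"
res$supplemented  # "G4"
```

With target_n = 4L, three genes pass both thresholds and G4 (effect 0.40) is backfilled to reach four. With target_n = 2L, both slots are taken by passing genes and supplemented is an empty character vector.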