Skip to contents

Native R implementation of the WisecondorX newref pipeline. Takes a list of binned samples (as returned by bam_convert() or loaded from NPZ via reticulate) and builds a PCA-based reference suitable for rwisecondorx_predict().

Usage

rwisecondorx_newref(
  samples,
  binsize = 100000L,
  sample_binsizes = NULL,
  nipt = FALSE,
  refsize = 300L,
  yfrac = NULL,
  cpus = 1L
)

Arguments

samples

List of sample objects, each a named list of integer vectors keyed by chromosome ("1""24"), as returned by bam_convert(). At least 10 samples are required.

binsize

Integer; the target bin size in base pairs. All samples are rescaled to this size. Default 100000L.

sample_binsizes

Optional integer vector of per-sample bin sizes. If NULL (default), all samples are assumed to already be at binsize.

nipt

Logical; if TRUE, NIPT mode (no gender correction, no male gonosomal reference). Default FALSE.

refsize

Integer; number of reference bin locations per target bin. Default 300L.

yfrac

Optional numeric; manual Y-fraction cutoff for gender classification. If NULL (default), the cutoff is derived from a GMM.

cpus

Integer; number of threads for reference bin finding. Default 1L.

Value

A list (the reference object) with class "WisecondorXReference", containing autosomal and (optionally) gonosomal sub-references. See Details for the full structure.

Details

The pipeline trains a gender model (2-component GMM on Y-fractions), optionally applies gender correction for non-NIPT workflows, computes a global bin mask, then builds three sub-references: autosomes (A), female gonosomes (F), and male gonosomes (M). Each sub-reference includes PCA components, within-sample reference bin indices and distances, and null ratios for between-sample Z-scoring.

This is a faithful port of the upstream Python wisecondorx newref, crediting the original WisecondorX authors.

The returned reference object contains:

binsize

Integer; the reference bin size.

is_nipt

Logical; whether NIPT mode was used.

trained_cutoff

Numeric; Y-fraction gender cutoff.

has_female

Logical; whether a female gonosomal reference exists.

has_male

Logical; whether a male gonosomal reference exists.

mask, bins_per_chr, masked_bins_per_chr, masked_bins_per_chr_cum, pca_components, pca_mean, indexes, distances, null_ratios

Autosomal reference components.

mask.F, ..., null_ratios.F

Female gonosomal reference (if present).

mask.M, ..., null_ratios.M

Male gonosomal reference (if present).