Fits a two-component Gaussian mixture model (GMM) on Y-unique region read ratios, which are computed from BAM files by nipter_y_unique_ratio.

Usage

nipter_sex_model_y_unique(ratios)

Arguments

ratios

Named numeric vector of Y-unique ratios (one per sample). Typically obtained by calling nipter_y_unique_ratio on each BAM in the control cohort.

Value

An object of class "NIPTeRSexModel" with elements:

model

The mclust::Mclust fitted object.

method

"y_unique".

male_cluster

Integer (1 or 2): the index of the mixture component labelled male, i.e. the component with the higher median Y-unique ratio.

classifications

Named character vector of "male"/"female" labels, one for each input sample.

fractions

The input ratios vector (named).

Details

This is a companion to nipter_sex_model, which operates on binned NIPTeRSample fractions. The Y-unique model operates at the BAM level and does not require prior binning. The resulting model object is compatible with nipter_predict_sex for majority-vote consensus.

The algorithm:

  1. Fit a two-component Gaussian mixture with equal mixing proportions (mclust::Mclust(ratios, G = 2, control = mclust::emControl(equalPro = TRUE))).

  2. Identify the male cluster as the component with the higher median Y-unique ratio.
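The two steps above can be sketched in plain R as follows. This is a minimal illustration under stated assumptions, not the package's actual implementation: the simulated ratios, the sample names, and the variable names (cluster_medians, male_cluster, classifications) are hypothetical, and only the mclust::Mclust call is taken from the description above.

```r
library(mclust)

set.seed(1)

# Simulated Y-unique ratios (hypothetical values, for illustration only):
# 20 low-ratio samples followed by 20 high-ratio samples.
ratios <- c(rnorm(20, mean = 0.001, sd = 0.0003),
            rnorm(20, mean = 0.02,  sd = 0.002))
names(ratios) <- sprintf("sample_%02d", seq_along(ratios))

# Step 1: two-component Gaussian mixture with equal mixing proportions.
fit <- mclust::Mclust(ratios, G = 2,
                      control = mclust::emControl(equalPro = TRUE))

# Step 2: the male cluster is the component with the higher median ratio.
cluster_medians <- tapply(ratios, fit$classification, median)
male_cluster <- as.integer(names(which.max(cluster_medians)))

# Translate component indices into "male"/"female" labels per sample.
classifications <- ifelse(fit$classification == male_cluster,
                          "male", "female")
names(classifications) <- names(ratios)

table(classifications)
```

Using the median rather than the fitted component mean to pick the male cluster keeps the choice robust to a few outlying samples within a component.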

Examples

if (FALSE) { # \dontrun{
# Compute Y-unique ratios for a cohort
bams <- list.files("bams/", pattern = "\\.bam$", full.names = TRUE)
ratios <- vapply(bams, function(b) nipter_y_unique_ratio(b)$ratio,
                 numeric(1L))
names(ratios) <- basename(bams)

# Build sex model
model_yu <- nipter_sex_model_y_unique(ratios)
model_yu$classifications
} # }