Chi-squared correction for overdispersed bins
nipter_chi_correct.Rd
Identifies bins in the control group with excess variance (overdispersion) using a chi-squared test and downweights them. This reduces the influence of noisy bins on downstream Z-scores and NCV calculations.
Arguments
- sample
A NIPTeRSample object (the test sample).
- control_group
A NIPTeRControlGroup object.
- chi_cutoff
Normalised chi-squared threshold. Bins with (chi - df) / sqrt(2*df) > chi_cutoff are corrected. Default 3.5.
- include_sex
Logical; correct sex chromosomes as well? Default FALSE. Sex chromosome bins use the same chi-squared weights derived from autosomes.
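To make the chi_cutoff threshold concrete, the following is a minimal sketch in base R. The helper name normalised_chi and the chi-squared values are made up for illustration and are not part of the package; df is taken as the number of control samples minus one.

```r
# Hypothetical helper (not part of the package): normalised chi-squared score.
normalised_chi <- function(chi, df) (chi - df) / sqrt(2 * df)

df  <- 49                 # assumed: 50 control samples, so df = 50 - 1
chi <- c(45, 80, 120)     # made-up per-bin chi-squared values

z <- normalised_chi(chi, df)
flagged <- z > 3.5        # 3.5 is the documented default for chi_cutoff
print(round(z, 2))
print(flagged)            # FALSE FALSE TRUE
```

Only the third bin exceeds the cutoff, so only that bin would be corrected.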
Value
A list with two elements:
- sample
The corrected NIPTeRSample.
- control_group
The corrected NIPTeRControlGroup.
Details
The same per-bin correction is applied to the test sample and to every control group sample, so the two stay directly comparable.
The algorithm follows NIPTeR's chi-squared correction:
1. For each control sample, scale autosomal bin counts so that total reads match the overall mean across all control samples.
2. Compute the expected count per bin (mean of scaled counts).
3. Compute chi-squared per bin: \(\chi^2 = \sum_i (\text{expected} - \text{scaled}_i)^2 / \text{expected}\).
4. Normalise: \(z = (\chi^2 - df) / \sqrt{2 \cdot df}\) where \(df = n_{controls} - 1\).
5. For bins where \(z > \text{chi\_cutoff}\): divide reads by \(\chi^2 / df\).
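The steps above can be sketched in base R on plain matrices. This is an illustration under stated assumptions, not the package's implementation: chi_correct_sketch is a hypothetical name, controls is assumed to be a bins-by-samples matrix of autosomal counts, the divisor is assumed to apply to the unscaled counts, and bins with an expected count of zero are left uncorrected.

```r
# Hypothetical helper (not part of the package) following the five steps.
# controls: bins x samples matrix; sample_counts: bin counts of the test sample.
chi_correct_sketch <- function(sample_counts, controls, chi_cutoff = 3.5) {
  df <- ncol(controls) - 1

  # Step 1: scale each control so its total equals the mean total.
  totals <- colSums(controls)
  scaled <- sweep(controls, 2, mean(totals) / totals, `*`)

  # Step 2: expected count per bin.
  expected <- rowMeans(scaled)

  # Step 3: chi-squared per bin (expected is recycled down the rows).
  chi <- rowSums((expected - scaled)^2) / expected
  chi[expected == 0] <- 0   # assumption: empty bins are never corrected

  # Step 4: normalised score.
  z <- (chi - df) / sqrt(2 * df)

  # Step 5: divide flagged bins by chi / df, leave the rest unchanged.
  corrected <- z > chi_cutoff
  divisor <- ifelse(corrected, chi / df, 1)

  list(sample = sample_counts / divisor,
       control_group = controls / divisor,
       corrected = corrected)
}

# Made-up data: bins 1-2 are stable, bins 3-4 are overdispersed.
controls <- rbind(
  c(100, 102,  98, 101,  99, 100),
  c(100,  98, 102,  99, 101, 100),
  c(150,  50, 150,  50, 150,  50),
  c( 50, 150,  50, 150,  50, 150)
)
sample_counts <- c(100, 100, 120, 80)

res <- chi_correct_sketch(sample_counts, controls)
print(res$corrected)      # FALSE FALSE TRUE TRUE
print(res$sample)
```

In this toy data every control has the same total, so step 1 leaves the counts unchanged. Bins 3 and 4 have \(\chi^2 = 150\) with \(df = 5\), so their reads are divided by 30 in both the test sample and the controls.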