Load a NIPTeR control group from a directory of TSV.bgz files
Reads all .bed.gz (or .tsv.bgz) files in bed_dir using
rduckhts_tabix_multi(), a single multi-file DuckDB scan, and
constructs a NIPTeRControlGroup from the results. For large cohorts this is
much faster than lapply(files, bed_to_nipter_sample) because all
files are read in one pass.
Usage
nipter_control_group_from_beds(
bed_dir,
pattern = "*.bed.gz",
binsize = NULL,
description = "General control group",
con = NULL
)
Arguments
- bed_dir
Character; directory containing one .bed.gz or .tsv.bgz file per control sample, each produced by nipter_bin_bam_bed.
- pattern
Glob pattern for file discovery (default "*.bed.gz").
- binsize
Optional integer; bin size in base pairs. If NULL (default), inferred from the first row of the first file.
- description
Label for the resulting control group (default "General control group").
- con
Optional open DBI connection with duckhts loaded.
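The pattern and binsize arguments can be illustrated with a small base-R sketch. It is not the package's implementation: the helper names write_bins and discover are hypothetical, the column layout (chrom, start, end, count) is an assumption about the binned files, and plain gzip output stands in for real bgzip-compressed files. It shows how a glob such as "*.bed.gz" selects files, and how a bin size could be read off the first row of the first file as end minus start.

```r
# Base-R sketch only; write_bins() and discover() are hypothetical helpers.

# Throwaway directory standing in for bed_dir.
bed_dir <- tempfile("controls_")
dir.create(bed_dir)

# Write three bins of width `binsize` as a headerless, gzip-compressed TSV.
# Assumed column layout: chrom, start, end, count.
write_bins <- function(path, binsize) {
  bins <- data.frame(
    chrom = "1",
    start = (0:2) * binsize,   # integer arithmetic avoids scientific notation
    end   = (1:3) * binsize,
    count = c(10L, 12L, 9L)
  )
  con <- gzfile(path, "w")
  on.exit(close(con))
  write.table(bins, con, sep = "\t", quote = FALSE,
              row.names = FALSE, col.names = FALSE)
}

write_bins(file.path(bed_dir, "ctrl_a.bed.gz"), 50000L)
write_bins(file.path(bed_dir, "ctrl_b.bed.gz"), 50000L)
write_bins(file.path(bed_dir, "ctrl_c.tsv.bgz"), 50000L)

# Convert a glob to a regular expression and list the matching files.
discover <- function(dir, pattern) {
  sort(list.files(dir, pattern = utils::glob2rx(pattern), full.names = TRUE))
}

bed_files <- discover(bed_dir, "*.bed.gz")   # matches only the .bed.gz files
tsv_files <- discover(bed_dir, "*.tsv.bgz")  # matches only the .tsv.bgz file

# Infer the bin size from the first row of the first file: end minus start.
first_row <- utils::read.delim(gzfile(bed_files[1]), header = FALSE, nrows = 1)
inferred_binsize <- first_row[[3]] - first_row[[2]]
```

In this sketch the "*.bed.gz" glob does not pick up the .tsv.bgz file, so a directory of .tsv.bgz files would presumably need pattern = "*.tsv.bgz" passed explicitly.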
Examples
if (FALSE) { # \dontrun{
  # Bin each control BAM to BED once
  for (bam in bam_files) {
    # Escape the dot so only a literal ".bam" suffix is replaced
    out <- file.path("controls", sub("\\.bam$", ".bed.gz", basename(bam)))
    nipter_bin_bam_bed(bam, out)
  }
  # Load them all at once
  cg <- nipter_control_group_from_beds("controls/")
} # }