Read a WisecondorX-format BED file into a sample list

Reads a 4-column bgzipped BED file (as written by bam_convert_bed()) and returns a named list of integer vectors suitable for rwisecondorx_newref(), rwisecondorx_predict(), scale_sample(), or bam_convert_npz().

Usage

bed_to_sample(bed, binsize = NULL, con = NULL)

Arguments

bed: Path to a bgzipped (or plain) BED file with a .tbi index.
binsize: Optional integer; bin size in base pairs. If NULL (default), inferred from the first row of the BED file.
con: Optional open DBI connection with duckhts already loaded.

Value

A named list with one integer vector per chromosome key ("1"–"22", "23" for X, "24" for Y). Each vector has length max_bin + 1, where max_bin is the highest 0-based bin index present for that chromosome. Entries for chromosomes absent from the BED file are NULL. This is the same format returned by bam_convert().
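
As a rough illustration of the shape described above, the following base-R sketch builds a stand-in list by hand and summarizes it. The object sample_bins and its counts are made up for the example; it is not output from bed_to_sample().

```r
# Hand-built stand-in for a bed_to_sample() result (values are made up)
sample_bins <- setNames(vector("list", 24), as.character(1:24))
sample_bins[["1"]]  <- c(12L, 0L, 9L)   # three bins on chromosome 1
sample_bins[["23"]] <- c(4L, 5L)        # two bins on chromosome X

# Keys whose entry is not NULL, i.e. chromosomes present in the BED file
present <- names(sample_bins)[!vapply(sample_bins, is.null, logical(1))]

# Total read count and number of bins per present chromosome
total_reads    <- sum(vapply(sample_bins[present], sum, integer(1)))
bins_per_chrom <- lengths(sample_bins[present])
```

Filtering out NULL entries first keeps per-chromosome summaries from failing on absent chromosomes.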

Details

The BED file must have 4 tab-delimited columns: chrom, start, end, count (no header). Coordinates are 0-based half-open intervals. The bin size is inferred from the first row (end - start) unless explicitly provided.
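
The mapping from BED rows to the returned list can be sketched in base R as below. This is a minimal illustration under stated assumptions, not the package's implementation: sketch_bed_to_sample is a hypothetical name, the input is an in-memory data frame rather than a bgzipped file, the bin index is assumed to be start %/% binsize, and chromosome names are assumed to be "1"–"22", "X", and "Y" with no prefix.

```r
# Illustrative only: hypothetical helper, not part of this package
sketch_bed_to_sample <- function(bed_df, binsize = NULL) {
  # bed_df: data.frame with columns chrom, start, end, count
  # (0-based half-open intervals, as described in Details)
  if (is.null(binsize)) {
    binsize <- bed_df$end[1] - bed_df$start[1]  # infer from the first row
  }
  keys <- as.character(1:24)
  chrom <- as.character(bed_df$chrom)
  chrom[chrom == "X"] <- "23"   # assumed naming: X maps to key "23"
  chrom[chrom == "Y"] <- "24"   # assumed naming: Y maps to key "24"
  out <- setNames(vector("list", length(keys)), keys)
  for (k in keys) {
    rows <- bed_df[chrom == k, , drop = FALSE]
    if (nrow(rows) == 0L) next              # absent chromosomes stay NULL
    idx <- rows$start %/% binsize           # 0-based bin index (assumption)
    v <- integer(max(idx) + 1L)             # length max_bin + 1, zero-filled
    v[idx + 1L] <- as.integer(rows$count)   # R vectors are 1-based
    out[[k]] <- v
  }
  out
}

# Toy input: three bins on chromosome 1, one bin on X (values are made up)
toy <- data.frame(
  chrom = c("1", "1", "1", "X"),
  start = c(0L, 5000L, 10000L, 5000L),
  end   = c(5000L, 10000L, 15000L, 10000L),
  count = c(3L, 0L, 7L, 2L),
  stringsAsFactors = FALSE
)
bins <- sketch_bed_to_sample(toy)
```

In this sketch, bins with no row in the input are left at zero, which is why the X vector has a leading zero for the bin starting at position 0.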

Examples

if (FALSE) { # \dontrun{
# Write bin counts to BED, then read them back
bam_convert_bed("sample.bam", "sample.bed.gz", binsize = 5000L)
bins <- bed_to_sample("sample.bed.gz")

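# Use directly with the native WisecondorX pipeline
# bed_files: character vector of BED paths (the "beds" directory is hypothetical)
bed_files <- list.files("beds", pattern = "\\.bed\\.gz$", full.names = TRUE)
samples <- lapply(bed_files, bed_to_sample)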
ref <- rwisecondorx_newref(samples, binsize = 100000L, nipt = TRUE)
} # }