Search CRISPR guide targets — sassy

sassy_crispr() is an R-level equivalent of the upstream sassy crispr workflow for in-memory sequences. Each guide includes its PAM as a suffix. By default, the PAM must match exactly under IUPAC matching (for example, N in the guide matches any base), while the rest of the guide may differ by up to k edits.

Usage

sassy_crispr(
  guide,
  text,
  k,
  pam_length = 3L,
  allow_pam_edits = FALSE,
  max_n_frac = 0.2,
  rc = TRUE,
  threads = 1L,
  pattern_id = NULL,
  text_id = NULL
)

Arguments

guide: List of guide sequences including the PAM suffix. Each element must be a raw vector or non-missing character scalar.
text: List of text sequences to search. Each element must be a raw vector or non-missing character scalar.
k: Maximum edit distance for the guide. With allow_pam_edits = FALSE, the PAM must match exactly, so k is effectively the edit threshold for the part of the guide outside the PAM.
pam_length: Length of the PAM suffix, in bases.
allow_pam_edits: If TRUE, do not require an exact PAM match.
max_n_frac: Maximum allowed fraction of N bases in match_region.
rc: If TRUE, search reverse-complement targets as well.
threads: Number of worker threads to request.
pattern_id: Optional guide/pattern identifiers. If supplied, it must be a character vector with one entry per guide; the result then gains (or has replaced) a pattern_id column. Names on guide are not inspected.
text_id: Optional text identifiers. If supplied, it must be a character vector with one entry per text; the result then gains (or has replaced) a text_id column. Names on text are not inspected.
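
As a rough illustration of the quantity max_n_frac is compared against, the base R sketch below computes the fraction of N bases in a candidate match_region. The helper n_frac() is hypothetical and not part of sassy. It assumes the denominator is the full length of match_region, and whether the package compares with > or >= is not stated here.

```r
# Hypothetical helper (not part of sassy): fraction of N bases in a region.
# Assumption: the fraction is taken over the whole match_region string.
n_frac <- function(match_region) {
  bases <- strsplit(toupper(match_region), "")[[1]]
  mean(bases == "N")
}

n_frac("ACGTAGG")   # no N bases
n_frac("ACNNNAGG")  # 3 of 8 bases are N, above the default max_n_frac = 0.2
```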

Value

A data frame with CLI-style columns: guide, cost, strand, start, end, match_region, and cigar. If pattern_id or text_id is supplied, the corresponding identifier column is included as well.

Examples

sassy_crispr(list("ACGTNGG"), list("TTTACGTAGGTTT"), k = 0, rc = FALSE, text_id = "chr1")
#>     guide text_id cost strand start end match_region cigar
#> 1 ACGTNGG    chr1    0      +     3  10      ACGTAGG    7=
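
The exact-PAM rule can be sketched in base R without the package. The snippet below is a minimal illustration, not the sassy implementation: the iupac table and the pam_matches() helper are hypothetical names, and it assumes uppercase input with only A, C, G, and T in the matched region.

```r
# Hypothetical lookup (not part of sassy): IUPAC code -> bases it allows.
iupac <- c(
  A = "A", C = "C", G = "G", T = "T",
  R = "AG", Y = "CT", S = "CG", W = "AT", K = "GT", M = "AC",
  B = "CGT", D = "AGT", H = "ACT", V = "ACG", N = "ACGT"
)

# Hypothetical helper: TRUE when the last pam_length bases of `region`
# are all allowed by the corresponding IUPAC codes at the end of `guide`.
pam_matches <- function(guide, region, pam_length = 3L) {
  pam  <- substring(guide,  nchar(guide)  - pam_length + 1L)
  site <- substring(region, nchar(region) - pam_length + 1L)
  if (nchar(pam) != nchar(site)) return(FALSE)
  p <- strsplit(pam, "")[[1]]
  s <- strsplit(site, "")[[1]]
  all(vapply(
    seq_along(p),
    function(i) grepl(s[i], iupac[[p[i]]], fixed = TRUE),
    logical(1)
  ))
}

pam_matches("ACGTNGG", "ACGTAGG")  # TRUE: "AGG" is allowed by "NGG"
pam_matches("ACGTNGG", "ACGTAGA")  # FALSE: final A is not allowed by G
```

The first call mirrors the example above, where the guide PAM NGG matches AGG in the text at cost 0.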