This article is about what actually crosses the R/C boundary in Rtinycc:

  • when values are copied
  • when they are borrowed
  • when they stay as raw addresses
  • when wrappers allocate temporary storage

The statements below are based on the implemented wrapper generator and runtime helpers.

Scalar Inputs Are Converted

Scalar inputs are converted at the boundary. For example:

  • i8, i16, i32, u8, u16 use integer coercion plus range checks
  • i64, u32, u64 use numeric coercion plus integer-value checks
  • bool rejects NA
  • f32 and f64 are read from R numerics

So scalar arguments are not zero-copy views into R objects. They become C scalars inside the wrapper.
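The checks listed above can be sketched in plain C. This is a minimal illustration, not the code Rtinycc generates: the helper names to_i8(), to_i64(), and to_bool() and the SKETCH_NA_LOGICAL macro are hypothetical, and the sketch assumes the R value has already been coerced to a C int or double. It relies on the fact that R encodes a logical NA as INT_MIN.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* R stores a logical NA as INT_MIN; the macro name is hypothetical. */
#define SKETCH_NA_LOGICAL INT_MIN

/* Integer coercion plus range check (the i8 case).
   Returns 1 on success, 0 if the value does not fit. */
int to_i8(int value, int8_t *out) {
    if (value < INT8_MIN || value > INT8_MAX) return 0;
    *out = (int8_t)value;
    return 1;
}

/* Numeric coercion plus integer-value check (the i64 case).
   Returns 1 on success, 0 for NaN, infinities, fractions, or overflow. */
int to_i64(double value, int64_t *out) {
    const double lo = -9223372036854775808.0; /* -2^63, exact as a double */
    const double hi = 9223372036854775808.0;  /*  2^63, exact as a double */
    /* Written so that NaN fails the test as well. */
    if (!(value >= lo && value < hi)) return 0;
    int64_t truncated = (int64_t)value;       /* safe: value is in range */
    if ((double)truncated != value) return 0; /* reject fractional values */
    *out = truncated;
    return 1;
}

/* bool rejects NA. Returns 1 on success, 0 for NA. */
int to_bool(int r_logical, bool *out) {
    if (r_logical == SKETCH_NA_LOGICAL) return 0;
    *out = (r_logical != 0);
    return 1;
}
```

In each case the result is a fresh C scalar written to *out, which is why nothing the callee does to it can reach the original R object.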

Vector Inputs Are Usually Borrowed

The array input types:

  • raw
  • integer_array
  • numeric_array
  • logical_array

are passed as writable direct pointers into the underlying R vector storage. For ordinary already-materialized vectors, no extra buffer is allocated by the wrapper.

That means:

  • C sees the existing vector data
  • mutation from C writes into the same memory region
  • the current array input types intentionally use R’s writable pointer access path because their C signatures receive mutable pointers

This is the main zero-copy part of the FFI boundary.

One important R-specific caveat is ALTREP: if an input vector is an ALTREP object, asking R for a writable C data pointer can materialize that vector. Generated C could check ALTREP(x) and choose a different policy, but copying through *_GET_REGION() into a temporary buffer would change the contract for these mutable types: the wrapper would need copy-back behavior, and pointer aliasing would no longer be preserved automatically when the same R vector is passed to multiple C arguments. Read-only, ALTREP-friendly array types would be a cleaner separate API.
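From the C side, a borrowed array argument is just a mutable pointer plus a length. The sketch below shows the kind of function a numeric_array argument could be bound to. The function names are hypothetical, and the R-side binding is deliberately not shown.

```c
#include <stddef.h>

/* Writes go straight into the caller's storage; nothing is copied.
   With a borrowed numeric_array, that storage is the R vector's data. */
void scale_in_place(double *x, size_t n, double factor) {
    for (size_t i = 0; i < n; i++) {
        x[i] *= factor;
    }
}

/* When the same vector is passed for two arguments, both pointers are
   expected to refer to the same storage. A copy-based scheme would
   hand the callee two distinct buffers instead. */
int same_storage(const double *a, const double *b) {
    return a == b;
}
```

Because the callee receives the original storage, any code that relies on aliasing (for example an in-place update where input and output are the same vector) behaves as it would in ordinary C.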

tcc_call_symbol() Uses .C()-Style Copy-In/Copy-Out

When tcc_call_symbol() is called with extra arguments, it follows the argument-type mapping of R’s .C() interface rather than the zero-copy array contract used by tcc_ffi() wrappers.

For atomic vectors and character vectors, Rtinycc copies inputs into guarded mutable call storage and returns a list containing the copied-back values. This means C mutations are visible in the returned list, not by mutating the original R objects. Numeric vectors with attr(x, "Csingle") are converted through a temporary float * buffer, matching R’s legacy .C() convention.

Lists and other non-atomic R objects follow the legacy .C() read-only paths: lists are exposed as SEXP *, while functions, environments, and other R objects are exposed as SEXP. These values are borrowed only for the duration of the call. C code must not mutate them through tcc_call_symbol(), and if it stores a SEXP beyond the call it must preserve and later release it with the R C API. For ordinary lists, Rtinycc can pass the existing SEXP * element storage read-only; for ALTREP or otherwise opaque list-like vectors, it rebuilds a temporary call-lifetime SEXP * view with VECTOR_ELT() so the path remains ALTREP-aware without forcing writable data access.

Unlike R’s optional options(CBoundsCheck = TRUE), tcc_call_symbol() checks guard bytes around its copied atomic and character buffers by default. This can catch simple underwrites and overwrites, but it is not a sandbox: far out-of-bounds native writes are still bugs in the called C code. For character arguments, C may edit the contents of each copied string buffer in place, but it must not replace the char * elements in the char ** array.
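The guarded copy-in/copy-out idea can be illustrated with a small standalone sketch. It is not Rtinycc's implementation: the guarded_buf type, the guard width, the guard byte value, and all function names are assumptions chosen for the example, and malloc() stands in for whatever allocator the package actually uses.

```c
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN 8     /* assumed guard width in bytes */
#define GUARD_BYTE 0xA5 /* assumed fill pattern */

typedef struct {
    unsigned char *base; /* start of the whole allocation */
    size_t len;          /* payload size in bytes */
} guarded_buf;

/* Copy-in: allocate guard + payload + guard and copy the input. */
int guarded_copy_in(guarded_buf *g, const void *src, size_t len) {
    g->base = malloc(len + 2 * GUARD_LEN);
    if (g->base == NULL) return 0;
    g->len = len;
    memset(g->base, GUARD_BYTE, GUARD_LEN);
    memcpy(g->base + GUARD_LEN, src, len);
    memset(g->base + GUARD_LEN + len, GUARD_BYTE, GUARD_LEN);
    return 1;
}

/* The pointer that would be handed to the called C function. */
void *guarded_payload(guarded_buf *g) {
    return g->base + GUARD_LEN;
}

/* Returns 1 if both guard regions still hold the fill pattern. */
int guarded_intact(const guarded_buf *g) {
    for (size_t i = 0; i < GUARD_LEN; i++) {
        if (g->base[i] != GUARD_BYTE) return 0;
        if (g->base[GUARD_LEN + g->len + i] != GUARD_BYTE) return 0;
    }
    return 1;
}

/* Copy-out: check the guards, copy the payload back, release storage.
   Returns 0 (and copies nothing) if a guard was overwritten. */
int guarded_copy_out(guarded_buf *g, void *dst) {
    int ok = guarded_intact(g);
    if (ok) memcpy(dst, g->base + GUARD_LEN, g->len);
    free(g->base);
    g->base = NULL;
    return ok;
}
```

A check like this only notices writes that land inside the guard regions. A write far past the buffer skips the guards entirely, which is why the scheme is a debugging aid rather than a sandbox.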

cstring_array Is Rebuilt Per Call

cstring_array is different. The wrapper allocates a temporary const char ** with R_alloc() and fills it by translating each R string element.

So:

  • the pointer array itself is allocated for the call
  • each element points at translated string data
  • this is not the same as passing a pre-existing C array through unchanged
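The shape of that per-call rebuild can be sketched without the R API. In the sketch below, malloc() stands in for R_alloc() and the translation step is omitted, so each slot simply points at existing string data. Both function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Builds a fresh pointer array for one call. Only the array of
   pointers is new; the characters themselves are not duplicated here.
   The caller releases the array with free() after the call. */
const char **build_cstring_view(const char *const *strings, size_t n) {
    const char **view = malloc((n ? n : 1) * sizeof *view);
    if (view == NULL) return NULL;
    for (size_t i = 0; i < n; i++) {
        view[i] = strings[i];
    }
    return view;
}

/* Example callee: reads the strings through the temporary view. */
size_t total_length(const char **strings, size_t n) {
    size_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += strlen(strings[i]);
    }
    return total;
}
```

The callee sees a valid const char ** for the duration of the call, but the array has no identity beyond it, so C code should not keep the pointer after returning.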

Returned Arrays Are Copied into Fresh R Vectors

Array returns are always copied into a newly allocated R vector. The wrapper uses the declared length_arg to size the R result, then memcpy() copies the returned C buffer into that vector.

If free = TRUE, the wrapper also frees the original returned buffer after the copy.

So array returns are not borrowed views into C memory.
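The copy-then-optionally-free sequence looks roughly like the sketch below. In the real wrapper the destination is a newly allocated R vector. Here malloc() stands in for that allocation so the example stays self-contained, and the function name is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Copies `len` ints from the buffer a C function returned into fresh
   storage, then frees the original if requested (the free = TRUE case).
   Returns NULL, leaving the original untouched, if allocation fails. */
int *copy_returned_array(int *returned, size_t len, int free_original) {
    int *fresh = malloc((len ? len : 1) * sizeof *fresh);
    if (fresh == NULL) return NULL;
    if (len > 0) {
        memcpy(fresh, returned, len * sizeof *fresh);
    }
    if (free_original) {
        free(returned);
    }
    return fresh;
}
```

After the copy, later changes to the original C buffer cannot affect the result, and with free_original set the original buffer no longer exists at all.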

Returned cstring Values Are Copied

For cstring returns, the wrapper creates an R string with mkString() when the returned pointer is non-NULL.

That means the resulting R value is a copy in R-managed memory, not a retained external pointer to the original C string.

Returned ptr Values Stay as Pointers

For ptr returns, the wrapper constructs an external pointer around the raw address.

That means:

  • no pointee copy is made
  • ownership is not implied
  • the pointer may dangle if the underlying C storage goes away

The same distinction between a copied value and a raw address matters for globals and struct fields.

sexp Passes Through Directly

sexp is the most direct boundary mode:

  • input sexp arguments are passed through as SEXP
  • returned sexp values are returned directly

This is useful when you want the R C API contract rather than the stricter FFI conversion layer.

Owned vs Borrowed Helper Pointers

At the helper level:

  • tcc_malloc() and tcc_cstring() create owned external pointers
  • tcc_data_ptr() and tcc_read_ptr() return borrowed external pointers
  • struct field address helpers and many raw pointer returns are borrowed views
  • named nested struct getters such as struct_outer_get_child() return borrowed nested views into the owning struct storage

Use tcc_ptr_is_owned() when you need to distinguish these cases in R code.

Bitfields Are Scalar Helpers, Not Addressable Views

Bitfield helpers behave like scalar getter/setter helpers at the R boundary, but that does not make them ordinary addressable fields.

In particular:

  • bitfield getters return copied scalar values
  • bitfield setters write scalar values back through the compiler-managed bitfield storage
  • tcc_field_addr() and tcc_container_of() reject bitfield members

So bitfields are intentionally excluded from the borrowed-address helper model.
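The underlying reason is a C language rule: a bitfield member has no address, so neither the & operator nor offsetof() can be applied to it. The sketch below contrasts scalar access to a bitfield with address-based access to an ordinary field. The struct and the helper names are hypothetical and are not the names Rtinycc generates.

```c
#include <stddef.h>

struct flags {
    unsigned int ready : 1;
    unsigned int level : 3; /* bitfield: values 0..7 */
    int payload;            /* ordinary addressable field */
};

/* Getter: returns a copy of the bitfield value. */
unsigned int flags_get_level(const struct flags *f) {
    return f->level;
}

/* Setter: the compiler emits the masking and shifting. Values wider
   than 3 bits are reduced modulo 8 on assignment. */
void flags_set_level(struct flags *f, unsigned int value) {
    f->level = value;
}

/* Address helper: valid only for the ordinary field. */
int *flags_payload_addr(struct flags *f) {
    return &f->payload;
}

/* container_of-style recovery of the owning struct from a field
   address. It depends on offsetof(), which is undefined for bitfields. */
struct flags *flags_from_payload(int *payload) {
    return (struct flags *)((char *)payload - offsetof(struct flags, payload));
}

/* Neither of these compiles, which is what the helpers reflect:
     unsigned int *p = &f->level;
     size_t off = offsetof(struct flags, level);
*/
```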

Serialization Boundary

tcc_compiled objects store enough recipe information to recompile after a serialize() / unserialize() round trip or a readRDS().

Raw pointers and raw tcc_state objects do not gain that behavior. After serialization they are just dead addresses or invalid states, not auto-reconstructed resources. The same applies to callback tokens, struct/union external pointers, and helper allocations from tcc_malloc() or tcc_cstring(): they do not serialize as live native resources.