FFI Boundary Semantics

This article is about what actually crosses the R/C boundary in Rtinycc:

when values are copied
when they are borrowed
when they stay as raw addresses
when wrappers allocate temporary storage

The statements below are based on the implemented wrapper generator and runtime helpers.

Scalar Inputs Are Converted

Scalar inputs are converted at the boundary. For example:

i8, i16, i32, u8, u16 use integer coercion plus range checks
i64, u32, u64 use numeric coercion plus integer-value checks
bool rejects NA
f32 and f64 are read from R numerics

So scalar arguments are not zero-copy views into R objects. They become C scalars inside the wrapper.
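
The conversion rules above can be sketched in plain C. This is a minimal illustration, not the generated wrapper code: the helper names i8_from_int, i64_from_double, and bool_from_logical are hypothetical, and the sketch assumes a failed check is reported by returning false rather than by raising an R error.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers; names and error handling are illustrative only. */

/* i8-style path: integer input plus a range check. */
bool i8_from_int(int value, int8_t *out) {
    if (value < INT8_MIN || value > INT8_MAX) return false;
    *out = (int8_t)value;
    return true;
}

/* i64-style path: numeric (double) input plus an integer-value check. */
bool i64_from_double(double value, int64_t *out) {
    /* The negated range test also rejects NaN and infinities. */
    if (!(value >= -9223372036854775808.0 && value < 9223372036854775808.0))
        return false;
    int64_t truncated = (int64_t)value;
    if ((double)truncated != value) return false;  /* fractional part present */
    *out = truncated;
    return true;
}

/* bool path: R encodes a logical NA as INT_MIN, which is rejected. */
bool bool_from_logical(int value, bool *out) {
    if (value == INT_MIN) return false;
    *out = (value != 0);
    return true;
}
```

In each case the C function receives a converted scalar, so nothing it does to that value can reach back into the original R object.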

Vector Inputs Are Usually Borrowed

The array input types:

raw
integer_array
numeric_array
logical_array

are passed as writable direct pointers into the underlying R vector storage. For ordinary already-materialized vectors, no extra buffer is allocated by the wrapper.

That means:

C sees the existing vector data
mutation from C writes into the same memory region
the current array input types intentionally use R’s writable pointer access path because their C signatures receive mutable pointers

This is the main zero-copy part of the FFI boundary.

One important R-specific caveat is ALTREP: if an input vector is an ALTREP object, asking R for a writable C data pointer can materialize that vector. Generated C could check ALTREP(x) and choose a different policy, but copying through *_GET_REGION() into a temporary buffer would change the contract for these mutable types: it would need copy-back behavior, and it would not automatically preserve pointer aliasing when the same R vector is passed to multiple C arguments. Read-only, ALTREP-friendly array types would be cleaner as a separate API.
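
The aliasing point can be seen in a small, self-contained C sketch. The function chain() is a hypothetical C target, not part of Rtinycc; it is written so that its result depends on whether its two pointer arguments refer to the same storage.

```c
#include <stddef.h>

/* Hypothetical C target with two mutable-pointer style arguments.
   Each element of dst is set from the previous element of src. */
void chain(double *dst, const double *src, size_t n) {
    for (size_t i = 1; i < n; i++) {
        dst[i] = src[i - 1] + 1.0;
    }
}
```

When both arguments point at the same three zeros, the writes feed later reads and the buffer ends as 0, 1, 2. With two separate copies of the same data, dst ends as 0, 1, 1. A borrowed-pointer contract gives the first behavior when one R vector is passed twice; a copy-into-temporary policy would silently give the second.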

tcc_call_symbol() Uses .C()-Style Copy-In/Copy-Out

When tcc_call_symbol() is called with extra arguments, it follows the argument-type mapping of R’s .C() interface rather than the zero-copy array contract used by tcc_ffi() wrappers.

For atomic vectors and character vectors, Rtinycc copies inputs into guarded mutable call storage and returns a list containing the copied-back values. This means C mutations are visible in the returned list, not by mutating the original R objects. Numeric vectors with attr(x, "Csingle") are converted through a temporary float * buffer, matching R’s legacy .C() convention.

Lists and other non-atomic R objects follow the legacy .C() read-only paths: lists are exposed as SEXP *, while functions, environments, and other R objects are exposed as SEXP. These values are borrowed only for the duration of the call. C code must not mutate them through tcc_call_symbol(), and if it stores a SEXP beyond the call it must preserve and later release it with the R C API. For ordinary lists, Rtinycc can pass the existing SEXP * element storage read-only; for ALTREP or otherwise opaque list-like vectors, it rebuilds a temporary call-lifetime SEXP * view with VECTOR_ELT() so the path remains ALTREP-aware without forcing writable data access.

Unlike R’s optional options(CBoundsCheck = TRUE), tcc_call_symbol() checks guard bytes around its copied atomic and character buffers by default. This can catch simple underwrites and overwrites, but it is not a sandbox: far out-of-bounds native writes are still bugs in the called C code. For character arguments, C may edit the contents of each copied string buffer in place, but it must not replace the char * elements in the char ** array.
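
The general guard-byte technique can be sketched as follows. This is an illustration only, not Rtinycc's implementation: the guard width, the fill pattern, and the names guarded_buf, guarded_copy_in, guarded_data, and guards_intact are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN 8      /* assumed guard width, for illustration only */
#define GUARD_BYTE 0xA5  /* assumed fill pattern */

typedef struct {
    unsigned char *base; /* start of allocation, including the leading guard */
    size_t len;          /* payload size in bytes */
} guarded_buf;

/* Copy the input into fresh storage with guard bytes on both sides. */
guarded_buf guarded_copy_in(const void *src, size_t len) {
    guarded_buf g = { malloc(len + 2 * GUARD_LEN), len };
    if (g.base == NULL) { g.len = 0; return g; }
    memset(g.base, GUARD_BYTE, GUARD_LEN);
    memcpy(g.base + GUARD_LEN, src, len);
    memset(g.base + GUARD_LEN + len, GUARD_BYTE, GUARD_LEN);
    return g;
}

/* Pointer handed to the called C code. */
void *guarded_data(guarded_buf g) { return g.base + GUARD_LEN; }

/* After the call: detect writes just before or just after the payload. */
bool guards_intact(guarded_buf g) {
    for (size_t i = 0; i < GUARD_LEN; i++) {
        if (g.base[i] != GUARD_BYTE) return false;
        if (g.base[GUARD_LEN + g.len + i] != GUARD_BYTE) return false;
    }
    return true;
}
```

A check like this only sees damage inside the guard regions, which is why writes that land far outside the buffer go undetected.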

cstring_array Is Rebuilt Per Call

cstring_array is different. The wrapper allocates a temporary const char ** with R_alloc() and fills it by translating each R string element.

So:

the pointer array itself is allocated for the call
each element points at translated string data
this is not the same as passing a pre-existing C array through unchanged
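
A rough C analogue of that per-call rebuild is shown below. It is a sketch under stated assumptions: malloc() stands in for R_alloc(), plain C strings stand in for translated R string elements, and make_call_view and total_length are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for the wrapper step: allocate a fresh pointer array for one call.
   The real wrapper uses R_alloc() and translated R strings; malloc() and
   plain C strings are used here so the sketch compiles without R. */
const char **make_call_view(const char *const *elems, size_t n) {
    const char **view = malloc((n ? n : 1) * sizeof *view);
    if (view == NULL) return NULL;
    for (size_t i = 0; i < n; i++) view[i] = elems[i];
    return view;
}

/* Hypothetical C target taking a cstring_array-style argument. */
size_t total_length(const char **strs, size_t n) {
    size_t total = 0;
    for (size_t i = 0; i < n; i++) total += strlen(strs[i]);
    return total;
}
```

The pointer array seen by the C target is new on every call, even though each element still refers to existing string data.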

Returned Arrays Are Copied into Fresh R Vectors

Array returns are always copied into a newly allocated R vector. The wrapper uses the declared length_arg to size the R result, then copies the returned C buffer into that vector with memcpy().

If free = TRUE, the wrapper also frees the original returned buffer after the copy.

So array returns are not borrowed views into C memory.
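
The copy-then-free sequence can be sketched in plain C. This is not the generated wrapper: a malloc()ed buffer stands in for the fresh R vector, and make_squares and copy_out are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical C function returning a heap buffer the caller may free. */
int *make_squares(int n) {
    int *out = malloc((n > 0 ? (size_t)n : 1) * sizeof *out);
    if (out == NULL) return NULL;
    for (int i = 0; i < n; i++) out[i] = i * i;
    return out;
}

/* Stand-in for the wrapper's return path: copy len elements into a freshly
   allocated result (an R vector in the real wrapper), then optionally free
   the source buffer, mirroring free = TRUE. */
int *copy_out(int *returned, int len, int free_source) {
    int *result = malloc((len > 0 ? (size_t)len : 1) * sizeof *result);
    if (result != NULL && returned != NULL && len > 0) {
        memcpy(result, returned, (size_t)len * sizeof *result);
    }
    if (free_source) free(returned);
    return result;
}
```

After copy_out() returns, the result no longer depends on the source buffer, so freeing that buffer is safe.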

Returned `cstring` Values Are Copied

For cstring returns, the wrapper creates an R string with mkString() when the returned pointer is non-NULL.

That means the resulting R value is a copy in R-managed memory, not a retained external pointer to the original C string.

Returned `ptr` Values Stay as Pointers

For ptr returns, the wrapper constructs an external pointer around the raw address.

That means:

no pointee copy is made
ownership is not implied
the pointer may dangle if the underlying C storage goes away

The same distinction between copied values and raw addresses applies to globals and struct fields.

sexp Passes Through Directly

sexp is the most direct boundary mode:

input sexp arguments are passed through as SEXP
returned sexp values are returned directly

This is useful when you want the R C API contract rather than the stricter FFI conversion layer.

Owned vs Borrowed Helper Pointers

At the helper level:

tcc_malloc() and tcc_cstring() create owned external pointers
tcc_data_ptr() and tcc_read_ptr() return borrowed external pointers
struct field address helpers and many raw pointer returns are borrowed views
named nested struct getters such as struct_outer_get_child() return borrowed nested views into the owning struct storage

Use tcc_ptr_is_owned() when you need to distinguish these cases in R code.

Bitfields Are Scalar Helpers, Not Addressable Views

Bitfield helpers behave like scalar getter/setter helpers at the R boundary, but that does not make them ordinary addressable fields.

In particular:

bitfield getters return copied scalar values
bitfield setters write scalar values back through the compiler-managed bitfield storage
tcc_field_addr() and tcc_container_of() reject bitfield members

So bitfields are intentionally excluded from the borrowed-address helper model.
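
The underlying C restriction is easy to show. In the sketch below, struct flags and its accessor functions are hypothetical and are not helpers generated by Rtinycc; they only illustrate why a bitfield can have a getter and a setter but no address.

```c
#include <stddef.h>

struct flags {
    unsigned int mode : 3;   /* bitfield: cannot be the operand of & */
    unsigned int ready : 1;
    int count;               /* ordinary field: addressable */
};

/* Getter returns a copied scalar value. */
unsigned int flags_get_mode(const struct flags *f) { return f->mode; }

/* Setter writes through the compiler-managed bitfield storage.
   The mask keeps the value within the 3-bit width. */
void flags_set_mode(struct flags *f, unsigned int v) { f->mode = v & 0x7u; }

/* Ordinary fields can expose a borrowed address; &f->mode would not compile. */
int *flags_count_addr(struct flags *f) { return &f->count; }
```

Because C defines no address for mode, an address-based helper has nothing valid to return for it, which is consistent with rejecting bitfield members outright.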

Serialization Boundary

tcc_compiled objects store enough recipe information to be recompiled after a serialize() / unserialize() round trip or after readRDS().

Raw pointers and raw tcc_state objects do not gain that behavior. After serialization they are just dead addresses or invalid states, not auto-reconstructed resources. The same applies to callback tokens, struct/union external pointers, and helper allocations from tcc_malloc() or tcc_cstring(): they do not serialize as live native resources.
