This article is about what actually crosses the R/C boundary in
Rtinycc:
- when values are copied
- when they are borrowed
- when they stay as raw addresses
- when wrappers allocate temporary storage
The statements below are based on the implemented wrapper generator and runtime helpers.
Scalar Inputs Are Converted
Scalar inputs are converted at the boundary. For example:
- i8, i16, i32, u8, u16 use integer coercion plus range checks
- i64, u32, u64 use numeric coercion plus integer-value checks
- bool rejects NA
- f32 and f64 are read from R numerics
So scalar arguments are not zero-copy views into R objects. They become C scalars inside the wrapper.
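The two kinds of check can be sketched in plain C. This is a minimal illustration, not Rtinycc's generated wrapper code: the helper names sketch_to_i8() and sketch_to_i64() and their return codes are hypothetical, and the inputs are assumed to have already been coerced to a C int or double.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: integer coercion result -> i8 with a range check.
 * R's NA_INTEGER is INT_MIN, so it also falls outside the i8 range here.
 * Returns 0 on success, -1 if the value does not fit. */
int sketch_to_i8(int value, int8_t *out) {
    if (value < INT8_MIN || value > INT8_MAX) return -1;
    *out = (int8_t)value;   /* a plain C scalar, detached from the R object */
    return 0;
}

/* Hypothetical helper: numeric coercion result -> i64 with an
 * integer-value check. Returns 0 on success, -1 otherwise. */
int sketch_to_i64(double value, int64_t *out) {
    if (value != value) return -1;                 /* NaN, including NA_real_ */
    if (value < -9223372036854775808.0 ||          /* below -2^63 */
        value >= 9223372036854775808.0) return -1; /* at or above 2^63 */
    int64_t as_int = (int64_t)value;               /* in range, so defined */
    if ((double)as_int != value) return -1;        /* fractional part present */
    *out = as_int;
    return 0;
}
```

In this sketch, sketch_to_i8(128, &x) and sketch_to_i64(1.5, &y) both fail, while 127 and 42.0 convert cleanly.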
Vector Inputs Are Usually Borrowed
The array input types:
- raw
- integer_array
- numeric_array
- logical_array
are passed as writable direct pointers into the underlying R vector storage. For ordinary already-materialized vectors, no extra buffer is allocated by the wrapper.
That means:
- C sees the existing vector data
- mutation from C writes into the same memory region
- the current array input types intentionally use R’s writable pointer access path because their C signatures receive mutable pointers
This is the main zero-copy part of the FFI boundary. One important
R-specific caveat is ALTREP: if an input vector is an ALTREP object,
asking R for a writable C data pointer can materialize that vector.
Checking ALTREP(x) in generated C could choose a different
policy, but using *_GET_REGION() into a temporary buffer
would be a different contract for these mutable types: it would need
copy-back behavior and would not automatically preserve pointer aliasing
when the same R vector is passed to multiple C arguments. Read-only
ALTREP-friendly array types would be a cleaner separate API.
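The aliasing point is easier to see in a small standalone C sketch. Nothing here uses the R API: the plain C array stands in for an R vector's storage, and all function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Borrowed path: the callee writes straight into the caller's storage. */
void sketch_double_in_place(double *x, size_t n) {
    for (size_t i = 0; i < n; i++) x[i] *= 2.0;
}

/* Reports whether two array arguments refer to the same storage. */
int sketch_same_storage(const double *a, const double *b) {
    return a == b;
}

/* Copy-based alternative: each argument gets its own temporary buffer,
 * so the callee can no longer see that both came from one vector.
 * The fixed capacity of 8 is an arbitrary limit for this sketch. */
int sketch_same_storage_via_copies(const double *v, size_t n) {
    double tmp_a[8], tmp_b[8];
    if (n > 8) n = 8;
    memcpy(tmp_a, v, n * sizeof(double));
    memcpy(tmp_b, v, n * sizeof(double));
    return sketch_same_storage(tmp_a, tmp_b);
}
```

Passing the same array twice to sketch_same_storage() yields 1, while the copy-based variant yields 0. That is the aliasing a temporary-buffer policy would lose.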
tcc_call_symbol() Uses .C()-Style Copy-In/Copy-Out
When tcc_call_symbol() is called with extra arguments,
it follows the argument-type mapping of R’s .C() interface
rather than the zero-copy array contract used by tcc_ffi()
wrappers.
For atomic vectors and character vectors, Rtinycc copies inputs into
guarded mutable call storage and returns a list containing the
copied-back values. This means C mutations are visible in the returned
list, not by mutating the original R objects. Numeric vectors with
attr(x, "Csingle") are converted through a temporary
float * buffer, matching R’s legacy .C()
convention.
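A rough picture of the Csingle round trip, written as standalone C. It shows only the general copy-in/copy-out shape through a temporary float buffer. The names sketch_call_single() and sketch_halve() are hypothetical, and malloc() stands in for whatever call storage Rtinycc actually uses.

```c
#include <stddef.h>
#include <stdlib.h>

typedef void (*sketch_float_fn)(float *x, int n);

/* Copy doubles into a temporary float buffer, call fn on it, then copy
 * the (possibly modified) floats back out as doubles. The input array
 * itself is never written. Returns 0 on success, -1 on allocation failure. */
int sketch_call_single(sketch_float_fn fn, const double *in, double *out, int n) {
    float *tmp = malloc((size_t)(n > 0 ? n : 1) * sizeof *tmp);
    if (tmp == NULL) return -1;
    for (int i = 0; i < n; i++) tmp[i] = (float)in[i];   /* copy-in, narrowing */
    fn(tmp, n);
    for (int i = 0; i < n; i++) out[i] = (double)tmp[i]; /* copy-out, widening */
    free(tmp);
    return 0;
}

/* Example callee that mutates its float argument in place. */
void sketch_halve(float *x, int n) {
    for (int i = 0; i < n; i++) x[i] *= 0.5f;
}
```

The caller sees the halved values only in out, which mirrors how C mutations appear in the returned list rather than in the original R object.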
Lists and other non-atomic R objects follow the legacy
.C() read-only paths: lists are exposed as
SEXP *, while functions, environments, and other R objects
are exposed as SEXP. These values are borrowed only for the
duration of the call. C code must not mutate them through
tcc_call_symbol(), and if it stores a SEXP
beyond the call it must preserve and later release it with the R C API.
For ordinary lists, Rtinycc can pass the existing SEXP *
element storage read-only; for ALTREP or otherwise opaque list-like
vectors, it rebuilds a temporary call-lifetime SEXP * view
with VECTOR_ELT() so the path remains ALTREP-aware without
forcing writable data access.
Unlike R’s optional options(CBoundsCheck = TRUE),
tcc_call_symbol() checks guard bytes around its copied
atomic and character buffers by default. This can catch simple
underwrites and overwrites, but it is not a sandbox: far out-of-bounds
native writes are still bugs in the called C code. For character
arguments, C may edit the contents of each copied string buffer in
place, but it must not replace the char * elements in the
char ** array.
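The guard-byte idea can be sketched as follows. The guard width, fill value, and function names are assumptions made for illustration, not Rtinycc's actual layout.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_GUARD 8      /* guard bytes on each side (assumed width) */
#define SKETCH_FILL  0xA5   /* arbitrary sentinel value */

typedef void (*sketch_bytes_fn)(unsigned char *data, size_t n);

/* Copy `in` into a guarded block, call fn on the copy, verify both guards,
 * and copy the result to `out` only if the guards are intact.
 * Returns 0 if clean, 1 if a guard was overwritten, -1 on allocation failure. */
int sketch_guarded_call(sketch_bytes_fn fn, const unsigned char *in,
                        unsigned char *out, size_t n) {
    size_t total = n + 2 * SKETCH_GUARD;
    unsigned char *block = malloc(total);
    if (block == NULL) return -1;
    memset(block, SKETCH_FILL, total);
    unsigned char *data = block + SKETCH_GUARD;
    memcpy(data, in, n);                          /* copy-in */
    fn(data, n);
    int damaged = 0;
    for (size_t i = 0; i < SKETCH_GUARD; i++) {
        if (block[i] != SKETCH_FILL) damaged = 1;     /* underwrite */
        if (data[n + i] != SKETCH_FILL) damaged = 1;  /* overwrite */
    }
    if (!damaged) memcpy(out, data, n);           /* copy-out */
    free(block);
    return damaged;
}

/* Stays inside its buffer. */
void sketch_well_behaved(unsigned char *d, size_t n) {
    for (size_t i = 0; i < n; i++) d[i] += 1;
}

/* Writes one byte past the end, which lands in the trailing guard. */
void sketch_off_by_one(unsigned char *d, size_t n) {
    for (size_t i = 0; i <= n; i++) d[i] = 0;
}
```

A write that jumps past the guard region entirely would go unnoticed by a check like this, which is why the guards catch only simple overruns.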
cstring_array Is Rebuilt Per Call
cstring_array is different. The wrapper allocates a
temporary const char ** with R_alloc() and
fills it by translating each R string element.
So:
- the pointer array itself is allocated for the call
- each element points at translated string data
- this is not the same as passing a pre-existing C array through unchanged
Returned Arrays Are Copied into Fresh R Vectors
Array returns are always copied into a newly allocated R vector. The
wrapper uses the declared length_arg to size the R result,
then memcpy() copies the returned C buffer into that
vector.
If free = TRUE, the wrapper also frees the original
returned buffer after the copy.
So array returns are not borrowed views into C memory.
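As a rough standalone analogy, the copy-then-optionally-free sequence looks like this in C. The malloc() call stands in for allocating the new R vector, and sketch_make_squares() and sketch_copy_result() are hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical C function that returns a heap buffer of n ints. */
int *sketch_make_squares(int n) {
    int *buf = malloc((size_t)(n > 0 ? n : 1) * sizeof *buf);
    if (buf == NULL) return NULL;
    for (int i = 0; i < n; i++) buf[i] = i * i;
    return buf;
}

/* Copy `n` ints from the returned buffer into fresh storage, then free
 * the original buffer if requested. The caller owns the fresh copy. */
int *sketch_copy_result(int *returned, int n, int free_original) {
    if (n < 0) n = 0;
    int *fresh = malloc((size_t)(n > 0 ? n : 1) * sizeof *fresh);
    if (fresh != NULL && returned != NULL)
        memcpy(fresh, returned, (size_t)n * sizeof *fresh);
    if (free_original) free(returned);   /* original buffer no longer needed */
    return fresh;
}
```

After the copy, the fresh storage stays valid whether or not the original buffer is released.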
Returned cstring Values Are Copied
For cstring returns, the wrapper creates an R string
with mkString() when the returned pointer is non-NULL.
That means the resulting R value is a copy in R-managed memory, not a retained external pointer to the original C string.
Returned ptr Values Stay as Pointers
For ptr returns, the wrapper constructs an external
pointer around the raw address.
That means:
- no pointee copy is made
- ownership is not implied
- the pointer may dangle if the underlying C storage goes away
The same distinction matters for globals and struct fields.
sexp Passes Through Directly
sexp is the most direct boundary mode:
- input sexp arguments are passed through as SEXP
- returned sexp values are returned directly
This is useful when you want the R C API contract rather than the stricter FFI conversion layer.
Owned vs Borrowed Helper Pointers
At the helper level:
- tcc_malloc() and tcc_cstring() create owned external pointers
- tcc_data_ptr() and tcc_read_ptr() return borrowed external pointers
- struct field address helpers and many raw pointer returns are borrowed views
- named nested struct getters such as struct_outer_get_child() return borrowed nested views into the owning struct storage
Use tcc_ptr_is_owned() when you need to distinguish
these cases in R code.
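One way to picture the owned/borrowed split is a handle that carries an ownership flag. This is only a conceptual sketch in plain C: the sketch_ptr type and its functions are hypothetical and say nothing about how Rtinycc tracks ownership internally.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    void *addr;
    int owned;   /* 1: this handle must free addr; 0: borrowed view */
} sketch_ptr;

/* Owned handle: allocation and responsibility for release travel together. */
sketch_ptr sketch_owned_alloc(size_t n) {
    sketch_ptr p = { malloc(n), 1 };
    return p;
}

/* Borrowed handle: just an address into storage someone else owns. */
sketch_ptr sketch_borrow(void *addr) {
    sketch_ptr p = { addr, 0 };
    return p;
}

int sketch_is_owned(sketch_ptr p) {
    return p.owned;
}

/* Releasing a borrowed handle only forgets the address. */
void sketch_release(sketch_ptr *p) {
    if (p->owned) free(p->addr);
    p->addr = NULL;
    p->owned = 0;
}
```

A borrowed handle pointing into an owned allocation becomes dangling once the owner is released, so in this sketch borrowed views should be dropped first.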
Bitfields Are Scalar Helpers, Not Addressable Views
Bitfield helpers behave like scalar getter/setter helpers at the R boundary, but that does not make them ordinary addressable fields.
In particular:
- bitfield getters return copied scalar values
- bitfield setters write scalar values back through the compiler-managed bitfield storage
- tcc_field_addr() and tcc_container_of() reject bitfield members
So bitfields are intentionally excluded from the borrowed-address helper model.
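The underlying C restriction can be shown with a small struct. The struct and helper names below are hypothetical. They illustrate why scalar getters and setters work for bitfields while address-based helpers cannot.

```c
#include <stddef.h>

struct sketch_flags {
    unsigned int mode  : 3;  /* bitfield: has no address and no offsetof */
    unsigned int ready : 1;
    int count;               /* ordinary field: addressable */
};

/* Scalar getter: returns a copy of the bitfield value. */
unsigned int sketch_get_mode(const struct sketch_flags *s) {
    return s->mode;
}

/* Scalar setter: the compiler packs the value into the 3-bit field,
 * reducing it modulo 8. */
void sketch_set_mode(struct sketch_flags *s, unsigned int v) {
    s->mode = v;
}

/* Address-based helpers are only possible for the ordinary field. */
int *sketch_count_addr(struct sketch_flags *s) {
    return &s->count;
}

struct sketch_flags *sketch_container_of_count(int *count) {
    return (struct sketch_flags *)
        ((char *)count - offsetof(struct sketch_flags, count));
}

/* Both &s->mode and offsetof(struct sketch_flags, mode) are rejected by
 * the compiler, so no equivalent helpers can exist for the bitfields. */
```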
Serialization Boundary
Compiled tcc_compiled objects store enough recipe
information to recompile after serialize() /
unserialize() or readRDS().
Raw pointers and raw tcc_state objects do not gain that
behavior. After serialization they are just dead addresses or invalid
states, not auto-reconstructed resources. The same applies to callback
tokens, struct/union external pointers, and helper allocations from
tcc_malloc() or tcc_cstring(): they do not
serialize as live native resources.