Architecture

Compiler-guided runtime patching

SPSLR is a pipeline. Compiler instrumentation preserves field-offset dependencies, patchcompile turns them into runtime descriptors, and selfpatch applies a randomized layout to code and data at startup.

Terminology

Subject

A binary entity participating in layout randomization and SPSLR patching.

Host Subject

The primary SPSLR subject that owns the runtime state and target namespace.

Module Subject

A dynamically loaded subject that attaches to an already initialized host subject.

Target

A structure type selected for layout randomization.

Instruction Pin

A patch site for a machine instruction immediate that encodes a field offset.

Data Pin

A patch site for static data in a subject's image that contains an instance of a target.

Pipeline

The SPSLR pipeline starts with ordinary C sources that annotate target structures with the attribute __attribute__((spslr)). Fields that must not be randomized are marked with __attribute__((spslr_field_fixed)). The pinpoint compiler plugin identifies and instruments target field accesses, as well as static occurrences of target instances, in each compilation unit. In each case, it logs metadata, such as which field an instruction accesses, and writes it to an accompanying [source].c.spslr file.
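As a minimal sketch of the annotation step, a target could be declared as shown below. Only the two attribute names come from the text above. The struct, its fields, the SPSLR_TARGET and SPSLR_FIXED macros, and the SPSLR_BUILD guard are assumptions made for illustration, and the attribute placement that pinpoint accepts may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical guard: expand to the SPSLR attributes only in SPSLR builds,
 * so the same source still compiles cleanly with an unmodified compiler. */
#ifdef SPSLR_BUILD
#define SPSLR_TARGET __attribute__((spslr))
#define SPSLR_FIXED  __attribute__((spslr_field_fixed))
#else
#define SPSLR_TARGET
#define SPSLR_FIXED
#endif

/* Illustrative target: uid and gid may be reordered, abi_version may not. */
struct SPSLR_TARGET credentials {
    unsigned int abi_version SPSLR_FIXED;
    unsigned int uid;
    unsigned int gid;
};

/* Ordinary field accesses; under SPSLR each one would become a patch site. */
static int credentials_is_root(const struct credentials *c)
{
    assert(c != NULL);
    return c->uid == 0 && c->gid == 0;
}
```

The source code itself does not change between an SPSLR build and a plain build; only the annotations tell pinpoint which structures to treat as targets.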

After all compilation units have been compiled and thus all patch sites have been pinpointed, the patchcompile command line tool collects the SPSLR metadata from all produced .spslr files. It generates a globally unified set of targets and maps patch site information to an efficient runtime representation. The results are compiled into a single [subject]_spslr_section.S assembly file which can further be assembled into [subject]_spslr_section.o.

The runtime patcher, selfpatch, comes as its own set of C source files. For simplicity, this documentation refers to them collectively as selfpatch.c. It is compiled without the pinpoint plugin and linked into the host subject binary together with all subject objects and [subject]_spslr_section.o. A small public API provides hooks for randomizing the target layouts and patching host or module subject images in memory.

SPSLR currently requires a small GCC patch. The pinpoint plugin must observe certain offsetof-like field-offset expressions before the GCC C parser folds them into compile-time constants. Because the earliest mainline plugin hooks execute after this folding step, SPSLR adds a lightweight parser hook that exposes these expressions to pinpoint before they are simplified.

Aside from this parser hook, SPSLR remains implemented as ordinary tooling around GCC. Instructions for building the SPSLR toolchain, applying the GCC patch, and compiling SPSLR subjects are provided in the project README.

Build-flow overview:

host subject build
    selfpatch.c (no pinpoint) → selfpatch.o
    a.c (struct A + ipin) → a.o, a.c.spslr
    b.c (struct A + dpin) → b.o, b.c.spslr
    main.c (struct B + main) → main.o, main.c.spslr
    patchcompile (host metadata) → host_spslr_section.S → host_spslr_section.o
    host.spslr_targets
    link → host subject binary

module subject build
    module.c (uses struct A) → module.o, module.c.spslr
    patchcompile (module metadata) → module_spslr_section.S → module_spslr_section.o
    link → module.so (module subject)

Pinpoint: compiler instrumentation

pinpoint is the compiler instrumentation component of SPSLR. Its task is to identify where generated code or static data depends on structure field offsets and to expose those dependencies as SPSLR patch metadata.

During compilation, pinpoint discovers randomized targets and constructs a compilation-unit-local target model. Each target receives a local target ID and a description of its field layout, including byte offsets, sizes, alignment, and field attributes such as spslr_field_fixed.
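The target model described above can be pictured with a small data structure. The following is a sketch under assumptions: the type names field_desc and target_desc, the FIELD macro, and the example struct are hypothetical and do not reflect pinpoint's internal representation. It relies on the GCC extension __typeof__.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one field in a target's compile-time layout. */
struct field_desc {
    const char *name;
    size_t offset;  /* byte offset within the target */
    size_t size;    /* field size in bytes */
    size_t align;   /* required alignment in bytes */
    int fixed;      /* nonzero if marked spslr_field_fixed */
};

/* Hypothetical compilation-unit-local target record. */
struct target_desc {
    unsigned local_id;
    const char *name;
    size_t size;
    size_t field_count;
    const struct field_desc *fields;
};

/* Example target; tag is treated as a fixed field here. */
struct sample { char tag; long value; int count; };

/* Build a field_desc from the compiler's own layout knowledge. */
#define FIELD(type, member, is_fixed)                           \
    { #member, offsetof(type, member),                          \
      sizeof(((type *)0)->member),                              \
      _Alignof(__typeof__(((type *)0)->member)), (is_fixed) }

static const struct field_desc sample_fields[] = {
    FIELD(struct sample, tag, 1),
    FIELD(struct sample, value, 0),
    FIELD(struct sample, count, 0),
};

static const struct target_desc sample_target = {
    0, "sample", sizeof(struct sample), 3, sample_fields
};

/* Look up a field's compile-time offset by name; (size_t)-1 if absent. */
static size_t target_field_offset(const struct target_desc *t, const char *name)
{
    assert(t != NULL && name != NULL);
    for (size_t i = 0; i < t->field_count; i++) {
        if (strcmp(t->fields[i].name, name) == 0)
            return t->fields[i].offset;
    }
    return (size_t)-1;
}
```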

Instruction pins

An instruction pin (ipin) represents a machine instruction whose immediate operand encodes a structure field offset. Pinpoint transforms field accesses so the original target and field-offset relationship remains visible throughout compilation and can later be rewritten after runtime layout randomization.

Conceptually, a field access:

obj->field

becomes an access using an SPSLR separator value:

*(typeof(field) *)
((char *)obj + separator(target_id, field_offset))

Later, the separator is lowered into a real machine instruction whose immediate bytes are named by an assembler symbol and represent the field offset. Those bytes become the actual instruction pin patched by selfpatch.
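The conceptual transformation can be imitated in plain C. In this sketch, separator() is a stand-in that simply returns the compile-time offset, whereas the real separator is a compiler-internal value that ends up as a patchable instruction immediate. The SEPARATED_FIELD macro and struct point are hypothetical names, and __typeof__ is a GCC extension.

```c
#include <assert.h>
#include <stddef.h>

struct point { int x; int y; };

/* Stand-in for the separator: the target ID is carried as metadata only and
 * the offset is returned unchanged. Under SPSLR this value would instead be
 * materialized by an instruction whose immediate selfpatch can rewrite. */
static size_t separator(unsigned target_id, size_t field_offset)
{
    (void)target_id;
    return field_offset;
}

/* obj->member, rewritten as base address plus separated offset. */
#define SEPARATED_FIELD(obj, type, member, target_id)            \
    (*(__typeof__(((type *)0)->member) *)                        \
        ((char *)(obj) + separator((target_id), offsetof(type, member))))

static int point_sum(struct point *p)
{
    assert(p != NULL);
    return SEPARATED_FIELD(p, struct point, x, 0)
         + SEPARATED_FIELD(p, struct point, y, 0);
}
```

Because the macro yields an lvalue, it works for both reads and writes, just like the original obj->field expression.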

For each instruction pin, pinpoint records:

  • the assembler symbol naming the instruction immediate bytes,
  • the local target ID,
  • the original field offset inside the target,
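  • and the immediate width to be patched.

For example, an instruction pin for a 4-byte immediate inside a movq instruction may be emitted as:

.globl spslr_a_ipin0
.hidden spslr_a_ipin0

/* movq $imm32, %rax: REX.W prefix, opcode, ModRM */
.byte 0x48, 0xc7, 0xc0

/* the labeled imm32 value */
spslr_a_ipin0:
.type spslr_a_ipin0, @object
.size spslr_a_ipin0, 4
.long 0x8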

Conceptually, SPSLR aims to preserve field-offset dependencies during compilation without imposing runtime overhead after patching has completed. Once selfpatch rewrites instruction pin immediates to reflect the selected randomized layout, field accesses execute as ordinary machine instructions using concrete offsets.

The current implementation favors simplicity over perfectly zero-cost code generation. Rather than encoding field offsets directly inside memory-access instructions, pinpoint lowers instruction pins into a dedicated offset materialization step followed by the actual access instruction:

mov $field_offset, %rax
mov (%rdi,%rax,1), %rax

This introduces one additional instruction compared to a direct displacement-based field access. The separation keeps instruction pin patch sites simple and stable, because selfpatch only needs to rewrite the immediate bytes of a dedicated materialization instruction rather than reason about architecture-specific addressing forms.

A future pinpoint constant-folding pass may eliminate this overhead by folding patched offsets back into surrounding instructions, restoring the conceptual zero-runtime-overhead execution model in which randomized offsets behave as if they had been compiled into the program directly.

Data pins

A data pin (dpin) represents existing storage that already contains one or more target instances in their original layout. Pinpoint discovers such objects during compilation and records their location and structure.

Static objects may contain targets directly, arrays of targets, or nested target instances embedded inside larger aggregates. For each discovered target instance, pinpoint records:

  • (an alias for) the containing object symbol,
  • the byte offset of the target instance,
  • the local target ID,
  • and structural nesting information.

These descriptors later allow selfpatch to rewrite static storage from the original layout into the randomized layout chosen at runtime.

struct Outer {
    int flags;
    struct Target t;
};

static struct Outer obj;

In this example, pinpoint emits a data pin describing that obj contains a target instance at offsetof(struct Outer, t). Structural nesting information is preserved for later runtime patch ordering.
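The same idea extends to arrays of containing objects. The sketch below enumerates one record per embedded target instance. The dpin_record type, its field names, and the interpretation of nesting as depth below an enclosing target instance are assumptions, not the actual .spslr format.

```c
#include <assert.h>
#include <stddef.h>

struct Target { int a; long b; };
struct Outer { int flags; struct Target t; };

/* Hypothetical data pin record. */
struct dpin_record {
    const char *symbol;     /* containing object symbol */
    size_t offset;          /* byte offset of the target instance */
    unsigned local_target;  /* compilation-unit-local target ID */
    unsigned nesting;       /* assumed: depth below an enclosing target */
};

/* Emit one record per Target embedded in an array of `count` Outer objects.
 * Returns the number of records written, which is at most `cap`. */
static size_t collect_outer_array_pins(const char *symbol, size_t count,
                                       struct dpin_record *out, size_t cap)
{
    size_t n = 0;

    assert(out != NULL || cap == 0);
    for (size_t i = 0; i < count && n < cap; i++) {
        out[n].symbol = symbol;
        out[n].offset = i * sizeof(struct Outer) + offsetof(struct Outer, t);
        out[n].local_target = 0;  /* Target is local target 0 in this sketch */
        out[n].nesting = 0;       /* Outer itself is not a target */
        n++;
    }
    return n;
}
```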

Compilation-unit metadata

At the end of compilation, pinpoint emits all collected target layouts, instruction pins, and data pins into a textual .spslr metadata file associated with the compilation unit. Target IDs remain local to the compilation unit and are unified later by patchcompile.

Patchcompile: metadata consolidation

patchcompile transforms compilation-unit-local SPSLR metadata into one unified runtime representation. It consumes all .spslr files produced by pinpoint and produces a compact descriptor section that can later be linked into a subject binary.

The most important task of patchcompile is constructing a global target namespace. Each compilation unit uses its own local target IDs, so patchcompile compares discovered target layouts and merges compatible targets into one shared runtime representation.

Conceptually, several compilation units:

foo.c.spslr:
target 0 = struct Credentials

bar.c.spslr:
target 0 = struct Credentials

become one shared runtime target:

global target 4 = struct Credentials

Instruction pins and data pins are rewritten to reference this global target namespace.
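A minimal sketch of this unification step follows. It assumes that two local targets are compatible when their names and a layout hash match; the actual compatibility check in patchcompile is not specified here, and all type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_GLOBAL_TARGETS 16

/* Hypothetical summary of a target as seen by one compilation unit. */
struct local_target {
    const char *name;
    unsigned long layout_hash;  /* assumed digest of offsets, sizes, flags */
};

struct global_table {
    struct local_target entries[MAX_GLOBAL_TARGETS];
    unsigned count;
};

/* Return the global ID for a local target, adding a new global target if no
 * compatible one exists yet. Returns -1 if the table is full. */
static int unify_target(struct global_table *g, const struct local_target *t)
{
    assert(g != NULL && t != NULL);
    for (unsigned i = 0; i < g->count; i++) {
        if (strcmp(g->entries[i].name, t->name) == 0 &&
            g->entries[i].layout_hash == t->layout_hash)
            return (int)i;
    }
    if (g->count == MAX_GLOBAL_TARGETS)
        return -1;
    g->entries[g->count] = *t;
    return (int)g->count++;
}
```

Each pin's local target ID would then be replaced by the value returned for its compilation unit's target.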

Instruction pin programs

Patchcompile converts pinpoint instruction pins into a compact runtime representation called an instruction pin program. Rather than storing a single replacement offset, the program describes how the final instruction immediate should be computed after runtime layout randomization.

The common case is a simple one-field dependency:

patch immediate =
      randomized_offset(target, field)

However, instruction pin programs are intentionally more general than this canonical case. A future implementation may produce instruction pins whose immediate values are folded expressions that depend on several randomized field offsets. The program abstraction allows arbitrary patch-value computations to be represented without changing the runtime model.
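To illustrate the program abstraction, the sketch below evaluates a tiny stack-based program. The opcode set, the encoding, and the layout table are invented for this example and are not the actual descriptor format produced by patchcompile.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for an instruction pin program. */
enum ipin_op { IPIN_PUSH_FIELD, IPIN_PUSH_CONST, IPIN_ADD, IPIN_END };

struct ipin_insn {
    enum ipin_op op;
    uint32_t a;  /* PUSH_FIELD: target ID, PUSH_CONST: value */
    uint32_t b;  /* PUSH_FIELD: field index */
};

#define MAX_FIELDS 8
#define STACK_DEPTH 8

/* Randomized field offsets of one target, indexed by field. */
struct layout { uint32_t offset[MAX_FIELDS]; };

/* Evaluate a program against the randomized layouts. Returns 0 and stores
 * the patch value in *out, or -1 if the program is malformed. */
static int ipin_eval(const struct ipin_insn *prog, const struct layout *targets,
                     size_t target_count, uint32_t *out)
{
    uint32_t stack[STACK_DEPTH];
    size_t sp = 0;

    assert(prog != NULL && targets != NULL && out != NULL);
    for (;; prog++) {
        switch (prog->op) {
        case IPIN_PUSH_FIELD:
            if (sp == STACK_DEPTH || prog->a >= target_count ||
                prog->b >= MAX_FIELDS)
                return -1;
            stack[sp++] = targets[prog->a].offset[prog->b];
            break;
        case IPIN_PUSH_CONST:
            if (sp == STACK_DEPTH)
                return -1;
            stack[sp++] = prog->a;
            break;
        case IPIN_ADD:
            if (sp < 2)
                return -1;
            stack[sp - 2] += stack[sp - 1];
            sp--;
            break;
        case IPIN_END:
            if (sp != 1)
                return -1;
            *out = stack[0];
            return 0;
        default:
            return -1;
        }
    }
}
```

The canonical one-field case is the two-instruction program PUSH_FIELD, END. A folded expression such as the sum of two field offsets only needs more instructions, not a different runtime model.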

Data pin descriptors

Patchcompile also transforms data pins into runtime descriptors for static storage rewriting. Each descriptor identifies one target instance already present in a subject image.

Nested target instances are sorted by nesting level. Embedded targets appear before their containing objects so that rewriting proceeds from inner objects outward.

struct Outer {
    int flags;
    struct Target t;
};

static struct Outer obj;

Here, patchcompile emits a descriptor for obj + offsetof(struct Outer, t). If nested targets exist, deeper nesting levels appear first.
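The ordering rule can be sketched with a comparator for the standard qsort function. The dpin_desc type and its depth field are hypothetical, and the tie-break by offset is an added assumption that only makes the order deterministic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical runtime data pin descriptor. */
struct dpin_desc {
    size_t offset;    /* location of the target instance in the image */
    unsigned target;  /* global target ID */
    unsigned depth;   /* nesting level, 0 = outermost */
};

/* Order descriptors so that deeper (inner) instances come first. */
static int dpin_deeper_first(const void *pa, const void *pb)
{
    const struct dpin_desc *a = pa;
    const struct dpin_desc *b = pb;

    if (a->depth != b->depth)
        return a->depth > b->depth ? -1 : 1;
    if (a->offset != b->offset)
        return a->offset < b->offset ? -1 : 1;
    return 0;
}

static void dpin_sort(struct dpin_desc *pins, size_t count)
{
    assert(pins != NULL || count == 0);
    if (count > 1)
        qsort(pins, count, sizeof(*pins), dpin_deeper_first);
}
```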

Runtime descriptor section

The final output of patchcompile is an assembly source file containing SPSLR runtime descriptors. After assembly and linking, the subject binary contains a metadata section describing targets, instruction pins, instruction pin programs, and data pins.

[subject]_spslr_section.S
        ↓
[subject]_spslr_section.o
        ↓
linked into subject image

Selfpatch: runtime application

selfpatch applies SPSLR at runtime. It consumes the runtime descriptor section produced by patchcompile, chooses randomized target layouts, computes new field offsets, rewrites static storage, and patches instruction immediates.

Runtime layout randomization is performed per target. For each randomized target, selfpatch computes a new field order while respecting field size, alignment, and fixed-field constraints.

The resulting runtime layout becomes authoritative for later patching. Instruction pin programs and data pins both resolve through the same randomized target description.
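The following sketch shows one deliberately conservative way to satisfy these constraints, not the algorithm selfpatch uses: offsets are shuffled only among non-fixed fields that share size and alignment, so no swap can break alignment, create overlap, or change the structure size. The rfield type is hypothetical, and rand() stands in for whatever entropy source a real implementation would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-field layout record. */
struct rfield {
    size_t offset;
    size_t size;
    size_t align;
    int fixed;  /* nonzero: the offset must not change */
};

/* Fisher-Yates shuffle of offsets within each (size, align) class.
 * rand() is not a secure random source; it only keeps the sketch short. */
static void randomize_layout(struct rfield *f, size_t n)
{
    assert(f != NULL || n == 0);
    for (size_t i = n; i-- > 1; ) {
        if (f[i].fixed)
            continue;

        /* Pick uniformly among class members at index <= i. */
        size_t candidates = 0;
        size_t pick = i;
        for (size_t j = 0; j <= i; j++) {
            if (!f[j].fixed && f[j].size == f[i].size &&
                f[j].align == f[i].align) {
                candidates++;
                if ((size_t)rand() % candidates == 0)
                    pick = j;
            }
        }

        size_t tmp = f[i].offset;
        f[i].offset = f[pick].offset;
        f[pick].offset = tmp;
    }
}
```

A more aggressive scheme could permute fields of different sizes and recompute padding, at the cost of having to prove that the result still fits the original structure size.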

Public runtime API

The public selfpatch API consists of three entry points: spslr_init(), spslr_selfpatch(), and spslr_patch_module(). Together, these functions separate layout selection from image rewriting.

struct spslr_status spslr_init(void);
struct spslr_status spslr_selfpatch(void);
struct spslr_status spslr_patch_module(const struct spslr_module *m);

spslr_init() initializes the SPSLR runtime and chooses the randomized layouts for the host subject. At this point, the randomized layouts are known to the runtime, but the executable image has not yet been rewritten. Code and static data still refer to the original compile-time layout.

spslr_selfpatch() applies the selected layouts to the host executable. It patches the host's data pins and instruction pins and must be called after spslr_init(), but before ordinary program execution begins to rely on randomized target objects.

The intended startup sequence is:

int main(void) {
    struct spslr_status st;

    st = spslr_init();
    if (st.error != SPSLR_OK)
        return 1;

    st = spslr_selfpatch();
    if (st.error != SPSLR_OK)
        return 1;

    /* Normal program execution starts here. */
    return real_main();
}

spslr_selfpatch() need not immediately follow spslr_init(); the image patching step may be deferred as long as no target instance crosses the patching boundary in the wrong layout. Dynamically created target instances may exist before spslr_selfpatch() only if they are fully dead before the patching boundary. They must not be used after patching, because stack and heap instances are not represented by data pins and are therefore not transformed into the randomized layout.

int main(void) {
    spslr_init();

    {
        /* Ok: this instance does not cross the patching boundary. */
        struct target temporary_target;
        temporary_target.field = 42;
    }

    /* Ok: this instance does not cross the patching boundary. */
    struct target *heap_target = malloc(sizeof(*heap_target));
    heap_target->field = 42;
    free(heap_target);

    spslr_selfpatch();

    /* Ok: new dynamic target instances are constructed after patching. */
    struct target patched_stack_target;
    patched_stack_target.field = 42;
}

By contrast, the following sequence is invalid:

int main(void) {
    spslr_init();

    /* Bad: this target instance survives across the patching boundary. */
    struct target stack_target;
    stack_target.field = 42;

    struct target *heap_target = malloc(sizeof(*heap_target));
    heap_target->field = 42;

    spslr_selfpatch();

    /*
     * Bad: neither instance was shuffled by selfpatch, so these accesses
     * interpret old-layout storage using the new field offsets.
     */
    int a = stack_target.field;
    int b = heap_target->field;

    free(heap_target);
}

Static target instances are different. Because they are part of the program image, SPSLR rewrites them during selfpatch through data pin patching. Reads and writes performed before and after patching are therefore permitted, provided that no pointers materialized from compile-time field offsets are preserved across the patching boundary.

static struct target obj;

int main(void) {
    spslr_init();

    /* Ok: static target instances may be accessed before patching. */
    obj.fieldA = 42;

    /* Ok: pointer usage does not cross patching boundary. */
    int *fieldB_ptr = &obj.fieldB;
    *fieldB_ptr = 41;

    spslr_selfpatch();

    /* Bad: fieldB_ptr still points to the field's pre-patch location. */
    *fieldB_ptr = 42;

    /* Ok: pointer assignment now goes through patched code and randomized offsets. */
    fieldB_ptr = &obj.fieldB;
    *fieldB_ptr = 42;
}

spslr_patch_module() patches a separately loaded module against the layouts already selected by the host. Modules do not choose independent layouts for shared SPSLR targets. Instead, a module provides its local SPSLR descriptor symbols, and selfpatch rewrites that module's data pins and instruction pins to match the host layout.

A module must be patched after it has been loaded and relocated, but before any SPSLR-instrumented code or randomized static target objects from that module are used.

All three functions return a struct spslr_status. The error field describes the immediate failure reason, while the viability field indicates whether the affected executable or module may continue running safely. If patching fails with a non-viable status, the image may have been partially modified and must not continue execution.
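How a caller might act on the two fields can be sketched as follows. Everything in this block is a stand-in: the demo_* types and enumerators are hypothetical and mirror only the error and viability fields named above, not the real definitions in the selfpatch headers.

```c
#include <assert.h>

/* Hypothetical stand-ins for the real status types. */
enum demo_error { DEMO_OK = 0, DEMO_ERR_PATCH_FAILED };
enum demo_viability { DEMO_VIABLE = 0, DEMO_NOT_VIABLE };

struct demo_status {
    enum demo_error error;
    enum demo_viability viability;
};

enum demo_action {
    DEMO_PROCEED,  /* no error: continue normally */
    DEMO_RECOVER,  /* error, but the image is still safe to run */
    DEMO_ABORT     /* error and the image may be partially patched */
};

/* Map a status to the action a caller should take. */
static enum demo_action demo_decide(struct demo_status st)
{
    assert(st.viability == DEMO_VIABLE || st.viability == DEMO_NOT_VIABLE);
    if (st.error == DEMO_OK)
        return DEMO_PROCEED;
    if (st.viability == DEMO_VIABLE)
        return DEMO_RECOVER;
    return DEMO_ABORT;
}
```

For a module, the recover case could mean rejecting the module while the host keeps running; for the host, the abort case should terminate the process before any instrumented code executes.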

Instruction pin patching

Instruction pin patching updates machine instructions whose immediates encode original structure offsets. For each instruction pin, selfpatch evaluates the associated instruction pin program and overwrites the immediate bytes with the newly computed offset.

before:
mov $0x18, %rax
mov (%rdi,%rax,1), %rax

after:
mov $0x30, %rax
mov (%rdi,%rax,1), %rax

Only the immediate field changes. The surrounding instruction bytes remain fixed.

Because instruction pins commonly reside in executable .text mappings, selfpatch must temporarily obtain write permission for the corresponding memory region before applying the patch and later restore the original protection state.
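The permission handling can be sketched with the POSIX mprotect call. This is an illustration under assumptions, not the selfpatch implementation: patch_imm32 and demo_map_readonly are hypothetical helpers, the demo page is read-only rather than executable, and the caller is assumed to know the original protection. Systems that enforce strict W^X policies, or architectures that need an instruction cache flush, require additional steps.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Overwrite the 32-bit immediate at `site`, temporarily making the
 * surrounding page(s) writable and then restoring `orig_prot`.
 * The value is stored in host byte order. Returns 0 on success. */
static int patch_imm32(void *site, uint32_t value, int orig_prot)
{
    assert(site != NULL);
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    /* Round down to the page start; the length covers the immediate even
     * if it straddles a page boundary. */
    uintptr_t start = (uintptr_t)site & ~((uintptr_t)page - 1);
    size_t len = ((uintptr_t)site - start) + sizeof(value);

    if (mprotect((void *)start, len, PROT_READ | PROT_WRITE) != 0)
        return -1;
    memcpy(site, &value, sizeof(value));
    return mprotect((void *)start, len, orig_prot);
}

/* Demo helper: map one page, copy `n` bytes in, then drop write access. */
static unsigned char *demo_map_readonly(const unsigned char *bytes, size_t n)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || n > (size_t)page)
        return NULL;

    unsigned char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    memcpy(p, bytes, n);
    if (mprotect(p, (size_t)page, PROT_READ) != 0) {
        munmap(p, (size_t)page);
        return NULL;
    }
    return p;
}
```

Applied to the bytes of the mov $0x18, %rax example, patching the immediate at byte offset 3 changes only those four bytes and leaves the prefix, opcode, and ModRM byte untouched.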

Data pin patching

Data pin patching rewrites existing static storage from the original layout into the randomized runtime layout. A temporary buffer is used while copying fields so that overlapping source and destination regions remain safe.
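A minimal sketch of the buffered rewrite is shown below. The field_move type, the fixed buffer size, and the choice to zero padding bytes in the new layout are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TARGET_SIZE 256

/* Hypothetical per-field move: where a field was and where it goes. */
struct field_move {
    size_t old_offset;
    size_t new_offset;
    size_t size;
};

/* Rewrite one target instance in place. All fields are first gathered into
 * a temporary buffer in their new positions, then copied back, so a field's
 * new location may overlap another field's old location. Returns 0 on
 * success, -1 if the instance is too large or a move is out of bounds. */
static int rewrite_instance(void *instance, size_t target_size,
                            const struct field_move *moves, size_t count)
{
    unsigned char tmp[MAX_TARGET_SIZE];

    assert(instance != NULL && (moves != NULL || count == 0));
    if (target_size > sizeof(tmp))
        return -1;

    memset(tmp, 0, target_size);  /* padding in the new layout is zeroed */
    for (size_t i = 0; i < count; i++) {
        if (moves[i].size > target_size ||
            moves[i].old_offset > target_size - moves[i].size ||
            moves[i].new_offset > target_size - moves[i].size)
            return -1;
        memcpy(tmp + moves[i].new_offset,
               (const unsigned char *)instance + moves[i].old_offset,
               moves[i].size);
    }
    memcpy(instance, tmp, target_size);
    return 0;
}
```

Swapping two 4-byte fields directly in place would overwrite one of them before it is read; going through the buffer avoids that ordering problem entirely.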

Depending on placement, data pins may also require writable access to memory that is normally read-only, for example statically initialized objects placed in read-only sections. Selfpatch therefore requires sufficient permissions to temporarily modify the corresponding storage.

Host and module subjects

SPSLR distinguishes between a host subject and module subjects.

The host subject owns the SPSLR runtime implementation. It contains the selfpatch code, initializes runtime randomization, owns the global target namespace, and stores the randomized target layouts used for patching.

Module subjects are intentionally smaller. Rather than carrying their own target descriptions or randomization implementation, modules only contain the reduced metadata required to patch their own image, such as instruction pins, corresponding instruction pin programs, and data pins. Their pins refer to global targets already owned by the host subject.

host subject
    ├── selfpatch runtime
    ├── global targets
    ├── randomized layouts
    ├── host-image data pins
    ├── host-image instruction pins
    └── host-image instruction pin programs

module subject
    ├── module-image data pins
    ├── module-image instruction pins
    └── module-image instruction pin programs

When a module subject is loaded, selfpatch reuses the already initialized host state and applies SPSLR patching using the host subject's randomized layouts.

Limitations and assumptions

SPSLR is currently a research prototype and operates under explicit assumptions about compiler behavior, runtime patchability, and layout ownership. These assumptions define the correctness boundary of the implementation and document the cases that currently require exclusion or additional care.

Compiler and code-generation assumptions

  • Field accesses must remain observable and transformable by pinpoint. Code generation patterns not recognized by pinpoint cannot be randomized safely.
  • Inline assembly must not encode assumptions about randomized target layouts unless explicitly excluded from SPSLR or manually synchronized with runtime patching.
  • The current implementation assumes predictable instruction pin lowering so that runtime patch sites remain identifiable and patchable.

Current pinpoint implementation limitations

The current pinpoint implementation intentionally preserves randomized field-offset dependencies until runtime patching. As a consequence, randomized field offsets cease to behave as compile-time knowledge in some contexts. Language constructs that require compile-time field offsets may therefore become invalid under SPSLR instrumentation.

In particular, static initialization involving pointers to randomized fields is currently unsupported:

struct target obj;

/* Fails to compile under SPSLR instrumentation (expression is non-const). */
int *field_ptr = &obj.field;

Fields that trigger compiler errors because of this restriction should currently be marked with __attribute__((spslr_field_fixed)). Pinpoint does not apply its separation transformation to references to fixed fields, so their offsets remain compile-time constants.

Runtime and patching assumptions

  • SPSLR assumes writable access can temporarily be obtained for executable and read-only mappings that contain instruction pins or static target instances.
  • Dynamically created target instances must not survive across the patching boundary. Stack and heap target instances are not rewritten by selfpatch and must therefore either be dead before patching or be created only after spslr_selfpatch() has completed.
  • Pointers materialized from compile-time field offsets must not survive across the patching boundary. Addresses into target objects may become invalid after randomized layouts are applied.

Layout and ABI assumptions

  • Layout-sensitive language constructs such as bitfields, compiler-specific packing behavior, or unusual layout rules may require exclusion from SPSLR.
  • Externally visible binary interfaces that expose randomized target layouts require special care. Shared ABI structures may require fixed fields or exclusion from layout randomization.
  • Targets shared across host and module subjects are assumed to describe compatible semantic layouts so patchcompile can unify them into one global runtime target namespace.

These limitations are intentional constraints of the current prototype and document the assumptions under which SPSLR remains correct.