NomiScript Design Notes

Goals

  • Build a Lisp dialect closer to Scheme/Common Lisp:
    • high allocation rate (pairs/lists), closures, symbols, dynamic types
    • proper tail calls (language-level), macros (eventually), exceptions/conditions (undecided)
  • Primary deployment:
    • run as a script language embedded into nomisync
    • execution target is WebAssembly running under wasmtime
  • FFI requirement (primary):
    • pass lists of strings between Rust (host) and Lisp (guest)
    • ideally: Rust uses Vec<String>, Lisp uses (list-of string) or (vector-of string)
  • Use `WasmGC` so the engine provides GC-managed heap objects rather than linear-memory GC.
  • Try to maintain compatibility with the nomisync script interface.

Non-goals (initially)

  • Full Common Lisp MOP
  • Native codegen (keep architecture compatible if possible)
  • Perfect cross-language object identity across the boundary (use handles)

Key Constraints & Reality Check

  • WasmGC is supported, but interop conventions are still evolving:
    • Design the FFI boundary carefully to avoid depending on unstable GC-interop details
  • Immediate practical implication:
    • Keep the exported ABI conservative and stable:
      • lists of strings, records of strings/numbers, byte arrays
      • avoid exporting raw GC references across the host boundary in v1
  • Support types from the finance crate across the host boundary

High-level Architecture

Components

  • Frontend (new crate)
    • reader/parser -> AST
    • macro expansion (optional in v1; can start with minimal macros)
    • compiler to a compact internal IR/bytecode
  • Execution Engine in scripting crate
    • bytecode VM (recommended for v1)
    • supports closures, lexical environments, tail calls
  • Runtime Object Model
    • implemented using WasmGC types (struct/array)
    • major types:
      • Pair/Cons cell
      • String (backed by wasm string or array of bytes depending on chosen string representation)
      • Symbol
      • Vector
      • Closure
      • Numbers (immediate tagged or boxed)
        • Integers
        • Fractions
        • Decimals as syntax sugar, converted to fractions
  • Host Integration
    • provide imports (host functions) for I/O, logging
    • provide an API for finance entities
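
The "decimals as syntax sugar" item above can be illustrated with a minimal reader-side sketch. It assumes decimal literals are plain digit strings with an optional sign and a single dot; the names gcd and decimal_to_ratio are hypothetical and not part of the existing code.

```rust
/// Greatest common divisor (Euclid). Hypothetical helper for this sketch.
fn gcd(mut a: i64, mut b: i64) -> i64 {
    while b != 0 {
        let t = a % b;
        a = b;
        b = t;
    }
    a.abs()
}

/// Convert a decimal literal such as "12.50" into a reduced
/// (numerator, denominator) pair. Returns None on malformed input
/// or when the value does not fit into i64.
fn decimal_to_ratio(text: &str) -> Option<(i64, i64)> {
    let (negative, body) = match text.strip_prefix('-') {
        Some(rest) => (true, rest),
        None => (false, text),
    };
    let (int_part, frac_part) = match body.split_once('.') {
        Some((i, f)) => (i, f),
        None => (body, ""),
    };
    if int_part.is_empty() && frac_part.is_empty() {
        return None;
    }
    if !int_part.bytes().chain(frac_part.bytes()).all(|b| b.is_ascii_digit()) {
        return None;
    }
    let mut num: i64 = 0;
    let mut den: i64 = 1;
    for b in int_part.bytes() {
        num = num.checked_mul(10)?.checked_add((b - b'0') as i64)?;
    }
    // Each fractional digit scales both numerator and denominator by 10.
    for b in frac_part.bytes() {
        num = num.checked_mul(10)?.checked_add((b - b'0') as i64)?;
        den = den.checked_mul(10)?;
    }
    let g = gcd(num, den); // den >= 1, so g >= 1
    let num = if negative { -num } else { num };
    Some((num / g, den / g))
}
```

With this scheme the literal 12.50 reads as 25/2, so no precision is lost between the source text and the stored fraction.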

Wasm Packaging Choice

  • Check Wasmtime's component model support for typed interop
  • If component model + WasmGC typed interop is not stable:
    • fallback to a minimal core-wasm ABI using explicit lowering into linear memory
    • keep the design such that we can switch to component model canonical ABI later without breaking the Lisp part
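
As an illustration of the fallback path, here is a minimal sketch of lowering a list of strings into a flat byte buffer and lifting it back. The wire layout (little-endian u32 count, then a u32 byte length plus UTF-8 bytes per string) and the names lower_strings, lift_strings, and read_u32 are assumptions made for this sketch, not an agreed ABI. A real implementation would write the buffer into guest linear memory instead of returning a Vec.

```rust
/// Lower a list of strings into a flat buffer.
/// Assumed layout: u32 count, then per string: u32 byte length + UTF-8 bytes.
/// Lengths are truncated to u32; a real ABI would reject oversized input.
fn lower_strings(items: &[String]) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&(items.len() as u32).to_le_bytes());
    for s in items {
        out.extend_from_slice(&(s.len() as u32).to_le_bytes());
        out.extend_from_slice(s.as_bytes());
    }
    out
}

/// Read a little-endian u32 at `pos` and advance the cursor.
fn read_u32(buf: &[u8], pos: &mut usize) -> Option<u32> {
    let end = pos.checked_add(4)?;
    let s = buf.get(*pos..end)?;
    *pos = end;
    Some(u32::from_le_bytes([s[0], s[1], s[2], s[3]]))
}

/// Lift a buffer produced by `lower_strings` back into owned strings.
/// Returns None on truncated input, invalid UTF-8, or trailing bytes.
fn lift_strings(buf: &[u8]) -> Option<Vec<String>> {
    let mut pos = 0usize;
    let count = read_u32(buf, &mut pos)?;
    let mut items = Vec::new();
    for _ in 0..count {
        let len = read_u32(buf, &mut pos)? as usize;
        let end = pos.checked_add(len)?;
        let bytes = buf.get(pos..end)?;
        items.push(String::from_utf8(bytes.to_vec()).ok()?);
        pos = end;
    }
    if pos == buf.len() {
        Some(items)
    } else {
        None
    }
}
```

Because both sides only agree on bytes, the Lisp object model can change freely behind this boundary, and the same pair of functions could later be replaced by component-model lowering.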

Object Representation with WasmGC

Primary design

  • Use GC-managed references for heap objects:
    • (ref null $Pair), (ref null $Vector), (ref null $Closure), etc.
  • Define GC struct types:
    • Pair: fields = car: ValueRef, cdr: ValueRef
    • Closure: code-id + env ref
    • Symbol: name string + interned id
  • Define a unified Value representation:
    • A variant-like tagged union with a small number of ref types plus immediates
    • If that does not work, box everything
  • Immediates:
    • fixnum as i32/i64 with tagging if possible
    • native fractions support
    • booleans/nil as singletons
  • Strings:
    • Try wasm stringref
    • If that does not work, represent strings as GC array<u8> + length
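
To make the "immediates plus references" idea concrete, the following sketch models one possible tagging scheme in plain Rust. It is an assumption for discussion only: the bit layout, the constants, and the names encode_fixnum, encode_heap, and decode are hypothetical, and in an actual WasmGC module small integers would more likely use the engine's own i31ref than hand-rolled tag bits.

```rust
/// Decoded view of a tagged 64-bit word (hypothetical layout).
#[derive(Debug, PartialEq, Eq)]
enum Decoded {
    Fixnum(i64),
    Nil,
    False,
    True,
    Heap(u32), // index of a heap object; stands in for a GC reference
    Invalid,
}

// Assumed layout: lowest bit 1 = fixnum (63-bit payload),
// a few even constants = singletons, low bits 110 = heap index.
const NIL: u64 = 0b000;
const FALSE: u64 = 0b010;
const TRUE: u64 = 0b100;
const HEAP_TAG: u64 = 0b110;

const FIXNUM_MIN: i64 = -(1 << 62);
const FIXNUM_MAX: i64 = (1 << 62) - 1;

/// Tag an integer as an immediate; None if it needs boxing instead.
fn encode_fixnum(n: i64) -> Option<u64> {
    if n < FIXNUM_MIN || n > FIXNUM_MAX {
        return None;
    }
    Some(((n << 1) | 1) as u64)
}

/// Tag a heap object index.
fn encode_heap(index: u32) -> u64 {
    ((index as u64) << 3) | HEAP_TAG
}

/// Inspect the tag bits and recover the payload.
fn decode(w: u64) -> Decoded {
    if w & 1 == 1 {
        // Arithmetic shift restores the sign of the payload.
        return Decoded::Fixnum((w as i64) >> 1);
    }
    match w {
        NIL => Decoded::Nil,
        FALSE => Decoded::False,
        TRUE => Decoded::True,
        _ if w & 0b111 == HEAP_TAG => Decoded::Heap((w >> 3) as u32),
        _ => Decoded::Invalid,
    }
}
```

Integers outside the 63-bit range return None from encode_fixnum, which corresponds to the "box everything" fallback for those values.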

Invariants

  • Guest never exposes internal GC object references directly to the host boundary.
  • Guest can safely move/collect objects without affecting host, since host only sees value-level data (strings, lists of strings, DTOs).

Runtime Codegen (Dual-Mode Compilation)

The compiler defaults to runtime WASM codegen with compile-time constant folding as an optimization. Every expression follows:

  1. Try to evaluate all operands at compile time (eval_value)
  2. If all succeed -> constant-fold (emit constant WASM values)
  3. If any operand is runtime -> emit WASM instructions
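
The three steps above can be sketched on a toy expression type. Only eval_value is a name taken from these notes; Expr, Instr, compile, and compile_to_vec are hypothetical, and the instruction set is reduced to what the example needs.

```rust
/// Toy expression tree: an I32 literal, a runtime-only accessor,
/// and a signed less-than comparison.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Int(i32),
    EntityCount, // only known at runtime
    Lt(Box<Expr>, Box<Expr>),
}

/// Toy instruction stream standing in for emitted Wasm.
#[derive(Debug, PartialEq)]
enum Instr {
    I32Const(i32),
    CallEntityCount,
    I32LtS,
}

/// Step 1: try to evaluate at compile time; None means "runtime operand".
fn eval_value(e: &Expr) -> Option<i32> {
    match e {
        Expr::Int(v) => Some(*v),
        Expr::EntityCount => None,
        Expr::Lt(a, b) => Some((eval_value(a)? < eval_value(b)?) as i32),
    }
}

/// Steps 2 and 3: fold to a constant if possible, otherwise emit instructions.
fn compile(e: &Expr, out: &mut Vec<Instr>) {
    if let Some(v) = eval_value(e) {
        out.push(Instr::I32Const(v));
        return;
    }
    match e {
        Expr::Int(v) => out.push(Instr::I32Const(*v)),
        Expr::EntityCount => out.push(Instr::CallEntityCount),
        Expr::Lt(a, b) => {
            compile(a, out);
            compile(b, out);
            out.push(Instr::I32LtS);
        }
    }
}

/// Convenience wrapper returning the emitted instructions.
fn compile_to_vec(e: &Expr) -> Vec<Instr> {
    let mut out = Vec::new();
    compile(e, &mut out);
    out
}
```

In this sketch (< 1 2) folds to a single constant, while (< (entity-count) 3) emits a call, a constant, and a comparison. Folding is retried at every subexpression, so constant subtrees inside a runtime expression still collapse.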

Type System

Two strict runtime types, no mixing:

  • I32: entity indices, booleans, counts, flags. Used by structural accessors (entity-count, entity-type, primary-entity-idx).
  • Ratio: (ref $ratio) WasmGC struct with i64 numerator + i64 denominator. CL-style auto-GCD-reduced. Used by all financial values (split-value, timestamps, arithmetic results).

No floats. Arithmetic operates only on Ratio; passing I32 to arithmetic is a compile error. Comparisons are type-aware: Ratio uses cross-multiplication, I32 uses i32.eq/i32.lt_s.
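
The Ratio semantics described above can be modeled on the host side as follows. This is a sketch under assumptions: the Rust struct and its methods (from_wide, new, checked_add, lt) are hypothetical stand-ins for the emitted Wasm helpers, and overflow is reported as None here, which the notes do not specify.

```rust
/// Host-side model of the guest Ratio: always reduced, denominator > 0.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Ratio {
    num: i64,
    den: i64,
}

/// Euclid's algorithm on widened values.
fn gcd(mut a: i128, mut b: i128) -> i128 {
    while b != 0 {
        let t = a % b;
        a = b;
        b = t;
    }
    a.abs()
}

impl Ratio {
    /// Reduce a widened fraction and narrow it back to i64 fields.
    fn from_wide(num: i128, den: i128) -> Option<Ratio> {
        if den == 0 {
            return None;
        }
        let g = gcd(num, den); // g >= 1 because den != 0
        let (mut n, mut d) = (num / g, den / g);
        if d < 0 {
            n = -n;
            d = -d;
        }
        if n < i64::MIN as i128 || n > i64::MAX as i128 || d > i64::MAX as i128 {
            return None; // result does not fit (assumed behavior)
        }
        Some(Ratio { num: n as i64, den: d as i64 })
    }

    /// Build a reduced ratio; None for a zero denominator.
    fn new(num: i64, den: i64) -> Option<Ratio> {
        Ratio::from_wide(num as i128, den as i128)
    }

    /// a/b + c/d = (a*d + c*b) / (b*d), reduced afterwards.
    fn checked_add(self, o: Ratio) -> Option<Ratio> {
        let n = (self.num as i128) * (o.den as i128) + (o.num as i128) * (self.den as i128);
        let d = (self.den as i128) * (o.den as i128);
        Ratio::from_wide(n, d)
    }

    /// Cross-multiplication: a/b < c/d iff a*d < c*b when b, d > 0.
    fn lt(self, o: Ratio) -> bool {
        (self.num as i128) * (o.den as i128) < (o.num as i128) * (self.den as i128)
    }
}
```

Because every value is stored reduced with a positive denominator, structural equality on the two fields coincides with numeric equality, so 2/4 and 1/2 compare equal without extra work.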

WasmGC Types

Emitted into every module:

  • $ratio — struct with i64 num + i64 denom, plus helper functions ($gcd, $ratio_new, $ratio_add/sub/mul/div, $ratio_eq/lt/gt/le/ge)
  • $cons — struct with i32 car + (ref null $cons) cdr, for runtime lists
  • $i8_array — byte array for strings (GC-managed)

Entity API

Context queries, entity accessors, and output functions compile to WASM memory reads/writes against the binary format. The OutputSerializer tracks cursor position, entity count, and string offsets with anodized design-by-contract specs.

should-apply

User-defined trigger function compiled to a separate WASM export. If defined via (defun should-apply () ...), the body is compiled to WASM; otherwise defaults to i32.const 1.

Tail Calls & Scheme Semantics

  • Strategy:
    • Attempt to reuse wasmtime tail-call support
    • If that does not work, implement tail calls in the VM by reusing the current frame (trampoline / loop-based dispatch)
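
The fallback strategy can be sketched as a trampoline: a function in tail position returns a description of the next call instead of making it, and a driver loop performs the dispatch. The names Func, Step, step, and run are hypothetical, and the mutually recursive even/odd pair is only a stand-in for compiled Lisp functions.

```rust
/// Identifiers of the (toy) compiled functions.
#[derive(Debug, Clone, Copy)]
enum Func {
    IsEven,
    IsOdd,
}

/// Result of running one function body up to its tail position.
enum Step {
    Return(bool),
    TailCall(Func, u64), // callee and its argument
}

/// Execute a single function body without growing the native stack.
fn step(f: Func, n: u64) -> Step {
    match f {
        Func::IsEven => {
            if n == 0 {
                Step::Return(true)
            } else {
                Step::TailCall(Func::IsOdd, n - 1)
            }
        }
        Func::IsOdd => {
            if n == 0 {
                Step::Return(false)
            } else {
                Step::TailCall(Func::IsEven, n - 1)
            }
        }
    }
}

/// Driver loop: the "frame" is just (f, n) and is overwritten on each tail call.
fn run(mut f: Func, mut n: u64) -> bool {
    loop {
        match step(f, n) {
            Step::Return(v) => return v,
            Step::TailCall(next_f, next_n) => {
                f = next_f;
                n = next_n;
            }
        }
    }
}
```

A call such as run(Func::IsEven, 1_000_000) completes in constant stack space, whereas the same mutual recursion written as direct calls would nest one native frame per step.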