This document describes an intermediate representation format for uFork modules. It encodes an abstract syntax tree (AST) that describes the logical structure of a uFork module. Symbolic named-references abstract away specific quad-memory addresses, allowing the linker/loader to layout the memory image however it sees fit. Module dependencies are loaded recursively. Symbolic names designate references to values from imported modules. Only exported symbols are visible outside of a module.
Each CRLF object describes a single module.
The top-level ast
is a module object.
{
"lang": "uFork",
"ast": <module>
}
A module describes its imports, its definitions, and its exports.
{
"kind": "module",
"import": {
<name>: <src>,
<name>: <src>,
...
},
"define": {
<name>: <value>,
<name>: <value>,
...
},
"export": [
<name>,
<name>,
...
]
}
Each import consists of a name and src string. The import's name is private to the module, and can be any string whatsoever. The src tells the loader where to find the module. The format of src strings depends on the system.
Each definition consists of a name and a value. The name can be any string. The value is a literal, fixnum, type, pair, dict, instr, or ref.
Definitions are private by default. Only definitions whose names are included in
the export
array will be accessible by other modules.
In the following example, URLs and relative paths are used as src strings. The
apple
definition remains private whereas the orange
definition is
exported.
{
"kind": "module",
"import": {
"std": "https://ufork.org/lib/std.json",
"pear": "./pear.json"
},
"define": {
"apple": ...,
"orange": ...
},
"export": [
"orange"
]
}
Anywhere a value is expected, a ref may be provided instead. A ref refers to a value by name.
{
"kind": "ref",
"module": <name>,
"name": <name>
}
If the module
property is specified, the ref refers to an imported value.
Otherwise it refers to one of the definitions. For example,
{
"kind": "ref",
"module": "std",
"name": "commit"
}
refers to the "std" module's commit
export, and
{
"kind": "ref",
"name": "apple"
}
refers to the apple
definition.
There are four literal values: undef, nil, true, and false.
{"kind": "literal", "value": "undef"}
undef represents the absence of a value.
{"kind": "literal", "value": "nil"}
nil is used to terminate recursive data structures, for example a list of pairs.
{"kind": "literal", "value": "true"}
{"kind": "literal", "value": "false"}
The boolean values are true and false.
A fixnum is a signed integer that fits in a machine word. Some valid fixnum
values are 200
, -3
and 0
.
A pair is a data structure containing two values.
{
"kind": "pair",
"head": <value>,
"tail": <value>
}
Lists can be built out of pairs. For example,
{
"kind": "pair",
"head": 4,
"tail": {
"kind": "pair",
"head": 5,
"tail": {
"kind": "literal",
"value": "nil"
}
}
}
is a list containing 4 and 5.
A dict is a data structure used to build dictionaries. Each dict has a key, a value, and another dict (or nil).
{
"kind": "dict",
"key": <value>,
"value": <value>,
"next": <dict/nil>
}
For example,
{
"kind": "dict",
"key": 0,
"value": {
"kind": "literal",
"value": "false"
},
"next": {
"kind": "dict",
"key": {
"kind": "literal",
"value": "true"
},
"value": 1,
"next": {
"kind": "literal",
"value": "nil"
}
}
}
is a dictionary with two entries: 0: false
and true: 1
.
An instr represents a single machine instruction. Instructions are linked together to form instruction streams.
{
"kind": "instr",
"op": <string>,
"imm": <value>,
"k": <instr>,
"debug": <debug>
}
The op
is the name of the instruction. Most instructions take an immediate
imm
value: either a fixnum, a string label, or some other value.
The k
value points to the next instruction,
and is not present on "jump" or "end" instructions.
op |
imm |
---|---|
"typeq" | type |
"quad" | fixnum |
"dict" | "has", "get", "add", "set", "del" |
"deque" | "new", "empty", "push", "pop", "put", "pull", "len" |
"pair" | fixnum |
"part" | fixnum |
"nth" | fixnum |
"push" | value |
"jump" | undefined |
"drop" | fixnum |
"pick" | fixnum |
"dup" | fixnum |
"roll" | fixnum |
"alu" | "not", "and", "or", "xor", "add", "sub", "mul" |
"eq" | value |
"cmp" | "eq", "ge", "gt", "lt", "le", "ne" |
"msg" | fixnum |
"state" | fixnum |
"actor" | "send", "post", "create", "become", "self" |
"end" | "abort", "stop", "commit" |
"sponsor" | "new", "memory", "events", "cycles", "reclaim", "start", "stop" |
"assert" | value |
"debug" | undefined |
In addition, there is an "if" instruction. Rather than having a single
continuation field (k
), it has two: the consequent (t
) and the alternative
(f
).
{
"kind": "instr",
"op": "if",
"t": <instr>,
"f": <instr>
}
For example, the following instruction stream examines the first element of a message. If it is 42, the transaction is committed. Otherwise the transaction is aborted.
{
"kind": "instr",
"op": "msg",
"imm": 1,
"k": {
"kind": "instr",
"op": "eq",
"imm": 42,
"k": {
"kind": "instr",
"op": "if",
"t": {"kind": "instr", "op": "end", "imm": "commit"},
"f": {"kind": "instr", "op": "end", "imm": "abort"}
}
}
}
There are several built-in types. These can be used as immediate values for the "typeq" instruction.
{"kind": "type", "name": "fixnum"}
{"kind": "type", "name": "type"}
{"kind": "type", "name": "pair"}
{"kind": "type", "name": "dict"}
{"kind": "type", "name": "instr"}
{"kind": "type", "name": "actor"}
There are also custom types that can appear in the t field of a quad.
{
"kind": "type",
"arity": <fixnum>
}
A custom type's arity controls the number of quad fields expected:
Arity | Quad fields |
---|---|
0 | |
1 | x |
2 | x, y |
3 | x, y, z |
Pairs, dicts and instructions are all instances of a quad. A quad has four
fields, t, x, y, and z. It is possible to specify arbitrary quads,
where the t field is a type (either built in, like #pair
, or a custom
type).
{
"kind": "quad",
"t": <type>,
"x": <value>,
"y": <value>,
"z": <value>
}
The type's arity controls which of x, y, and z are required. For example,
the #pair
type has an arity of 2, so both the x and y fields are required,
and a z field is not allowed:
{
"kind": "quad",
"t": {"kind": "type", "name": "pair"},
"x": 1,
"y": -1
}
An optional debug
property may appear on any kind of object. Its purpose it to
provide debugging information.
The optional src string locates the source text.
The optional start and end positions define a range, where start is inclusive and end is exclusive. A position counts the number of Unicode code points from the beginning of the source.
{
"kind": "debug",
"src": <string>,
"start": <number>,
"end": <number>
}