
Datamodel

Louis Goessling edited this page Jul 24, 2017 · 2 revisions

The orth datamodel is very simple: the only first-class types are integral (scalar) types. These are:

  • Signed integers: int, long, short, byte, xlong
  • Unsigned integers: uint, ulong, ushort, ubyte, uxlong
  • Booleans: bool
  • Pointer types: cstr and ptr
  • Pointers to user-defined types

It may seem odd to think of pointer types as scalars, but that is what they are. When you declare a variable whose type is a user-defined record type, the variable itself is a pointer to that record. For example, with a type

type Person is
    cstr first_name
    cstr last_name
    int age
endtype

declaring a local Person bob does not allocate space for a Person, just the pointer-to-Person bob. This is transparent, and as a result the code looks somewhat like C code that uses struct values - the dot operator in orth is the equivalent of C's ->.
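For readers who think in C, the pointer semantics described above can be sketched roughly as follows. This is only an analogy, not orth code: the helper person_new is hypothetical, and malloc stands in for whatever allocation mechanism an orth program would actually use.

```c
#include <stdlib.h>

/* C analogue of the orth Person record. */
typedef struct Person {
    const char *first_name;
    const char *last_name;
    int age;
} Person;

/*
 * Hypothetical helper (not part of orth): allocates storage for the record.
 * Declaring "Person *bob;" on its own reserves only a pointer, which mirrors
 * what "Person bob" does in orth.
 */
Person *person_new(const char *first, const char *last, int age) {
    Person *p = malloc(sizeof *p);
    if (p == NULL) {
        return NULL;
    }
    p->first_name = first;  /* orth would write p.first_name */
    p->last_name = last;
    p->age = age;
    return p;
}
```

In this analogy, an orth expression such as bob.age corresponds to bob->age in C, and the caller is responsible for releasing the record with free.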

The OOP tools in orth are minimal - member functions via call rewriting. If we define a method

function(Person self) Person:say_hi -> void does
    printf("Hi! I am %s %s, I am %i years old.\n", self.first_name, self.last_name, self.age)
return

then we can call that on a Person bob via bob:say_hi(). At compile time the symbol name for Person:say_hi is rewritten as Person$say_hi, and calling bob:say_hi() is rewritten to calling Person$say_hi(bob). The method syntax is pure sugar: the rewrite is literally performed on the AST!
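The effect of that rewrite can be approximated in C as a plain function that takes the receiver as its first argument. This is a sketch under stated assumptions: '$' is not portable in C identifiers, so Person_say_hi stands in for Person$say_hi, and Person_format_hi is a hypothetical extra that writes into a buffer so the output can be inspected.

```c
#include <stdio.h>

typedef struct Person {
    const char *first_name;
    const char *last_name;
    int age;
} Person;

/*
 * Hypothetical variant (not in the original page): formats the greeting into
 * a caller-supplied buffer and returns snprintf's result.
 */
int Person_format_hi(const Person *self, char *buf, size_t len) {
    return snprintf(buf, len, "Hi! I am %s %s, I am %i years old.\n",
                    self->first_name, self->last_name, self->age);
}

/*
 * Stand-in for the rewritten symbol Person$say_hi. The orth call
 * bob:say_hi() corresponds to the ordinary call Person_say_hi(bob).
 */
void Person_say_hi(const Person *self) {
    printf("Hi! I am %s %s, I am %i years old.\n",
           self->first_name, self->last_name, self->age);
}
```

Nothing here is looked up at run time: the receiver's declared type picks the function when the call is rewritten, which is why the facility stays so small.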

This simple facility allows code that looks something like what a more fully featured OOP language would provide. It can be combined with casting and function pointers/typeclasses for runtime polymorphism and dynamic dispatch if you should so desire. This approach is used throughout shoc.
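One common way to combine casting with function pointers for dynamic dispatch looks like this in C. It is a generic illustration of the technique and makes no claim about how shoc is actually structured; the Shape, Rect and Square names are hypothetical.

```c
/* "Base" record: carries a function pointer that acts as a one-entry vtable. */
typedef struct Shape Shape;
struct Shape {
    int (*area)(const Shape *self);
};

/* Concrete records embed the base as their first member, so a pointer to
 * the record can be cast to a pointer to the base and back. */
typedef struct Rect {
    Shape base;
    int w, h;
} Rect;

typedef struct Square {
    Shape base;
    int side;
} Square;

int Rect_area(const Shape *self) {
    const Rect *r = (const Rect *)self;   /* cast back to the concrete type */
    return r->w * r->h;
}

int Square_area(const Shape *self) {
    const Square *s = (const Square *)self;
    return s->side * s->side;
}

/* Dispatches at run time through the stored function pointer. */
int Shape_area(const Shape *self) {
    return self->area(self);
}
```

A caller holding only a Shape pointer can invoke Shape_area without knowing the concrete record type, which is the kind of runtime polymorphism the paragraph above alludes to.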

cstr and ptr are both internally equivalent to the C type char*, but semantically different. ptr is the base pointer type - use it as a pointer-to-byte or as an intermediate result in an offset calculation. cstr should always represent something that you could call puts on and get sane results, but this is not enforced. It exists mainly for ease of use in string handling functions (e.g. + is implemented as strcat, and * repeats a string, Python-style 😄).
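To make the distinction concrete, here is a rough C sketch of what these operations amount to. All three helper names are hypothetical, and returning a freshly allocated buffer is an assumption made for safety in the example; the page does not say how orth manages the memory behind + and *.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* ptr-style use: byte-granular offset calculation on a char*. */
char *ptr_offset(char *base, size_t nbytes) {
    return base + nbytes;
}

/* cstr "+" analogue (assumed to allocate): returns a new string a followed by b. */
char *cstr_concat(const char *a, const char *b) {
    size_t la = strlen(a), lb = strlen(b);
    char *out = malloc(la + lb + 1);
    if (out == NULL) {
        return NULL;
    }
    memcpy(out, a, la);
    memcpy(out + la, b, lb + 1);  /* also copies b's terminating NUL */
    return out;
}

/* cstr "*" analogue: returns s repeated n times, like "ab" * 3 in Python. */
char *cstr_repeat(const char *s, size_t n) {
    size_t ls = strlen(s);
    if (ls != 0 && n > (SIZE_MAX - 1) / ls) {
        return NULL;              /* length would overflow size_t */
    }
    char *out = malloc(ls * n + 1);
    if (out == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        memcpy(out + i * ls, s, ls);
    }
    out[ls * n] = '\0';
    return out;
}
```

Both string helpers return NUL-terminated buffers that are safe to pass to puts, which is the informal contract the cstr type is meant to signal; the caller frees the results.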
