Skip to content

Data Types

Andrew Johnson edited this page Jun 2, 2025 · 13 revisions

The LM type system is fairly complex due to the language support for features from functional programming, logic programming, as well as low-level C-like memory configuration.

What is a Datatype?

To start to understand the LM type system, we can start with a single atom: A. Because this type is uppercase we know that it corresponds to a named property; It says something.

# this term a, has the type A
a :: A

# we can also rephrase this as
# we know A about a

Types can say many different things about a term, but first we will introduce datatype concepts. A datatype is a type that defines what data is available from a term. In LM datatypes must have a representation in finite memory. Mathematical concepts are available in theory, but if they have no defined representation then we do not call them datatypes.

Datatypes can be either simple or complex. A simple datatype represents a sized memory region where the size is derived by direct definition. A complex datatype represents a sized memory region where the size is derived by structural inference. This sounds complex but is much easier to understand by example:

# An 8-byte unsigned integer
type U64 size 8_B implies Number;

# A two dimensional point
type Point<x: Number, y: Number> = { x: x, y: y };

Here we have introduced some new notation for the definition of the Point data type. First we say that U64 is an 8 byte Number. Second, the Point datatype is defined as two numbers: x and y. U64 is said to be simple because its size is directly declared. Point is defined with a complex structure.

What is a Representation?

Both simple and complex datatypes need a representation before we can run them on a computer. An 8-byte integer can be directly represented with 8 sequential bytes. A single Boolean true or false might be defined as a bit, but usually that will get promoted to at least one byte.

The representation of the complex Point type is inferred by the compiler. This layout can be specified with layout annotations. After representation selection, complex types have a single unambiguous size, just like simple types. Representation information can be accessed with helpers such as sizeof or alignof.

Representation selection happens during type inference and can be affected by many different factors. Closures require one of the most complex forms of representation selection where captured variables define a struct that gets attached to the simple function that will be called with closure calling convention. Typically the user will only see simple arrow types A → B but behind the scenes there are many moving parts.

What is an Interface?

Interfaces are the method by which programmers can access existential quantifiers within the type system natively. The basic pattern is that for a given type that implements an interface, there must exist some matching definition in scope.

interface self implements ToString {
   let .into(self: self, into: Type<String>): String;
};

type Point<x: Number, y: Number> implements ToString = { x: x, y: y };

This definition asserts that the Point type has an acceptable function that can turn it into a String.

What is a Type Alias?

LM supports two different types of type aliases: normal and opaque.

A normal alias will immediately rewrite all occurrences of its left-hand-type with its right-hand-type. An opaque alias will wait until after typechecking to rewrite its aliases.

type alias A = B;
# A looks like B to the type system

type opaque alias B = C;
# B looks like C to the backend code generator

What is a Strong Implication?

LM types are allowed to imply other types. When a given precondition is matched, the type system can automatically infer something else. This is implemented as a quick prop. A common use for this feature is to create type hierarchies such as numerical pyramids:

type U8 implies U16;
type U16 implies U32;
type U32 implies U64;

What is a Tagged Union?

LM types natively support tagged unions. They are declared with a similar syntax to normal structs, just separated by case.

# ABC can be an A or a B or a C
type ABC = A { a: U8 } | B { b: U16 } | C { c: U32 };

let abc = A(1);

# in order to access case fields, the type tag must be known
print( (abc as Tag::A).a )

# if you don't know what tag is active, then you can check the discriminator
print( abc.discriminator-case-tag );

if abc.discriminator-case-tag == (abc as Tag::A).discriminator-case-tag then {
   # this will only execute if the discriminator is set for case A
   print("Case A");
};
Clone this wiki locally