Files
beginner-friendly-lang/spec.md
2025-09-15 21:42:29 +02:00

19 KiB

Contribution Guidelines

  • explain why design decisions are taken
  • put in TODOs into the specification for things that should be re-visited in the future

Goals

  • functional programming: everything is pure, and only a function
  • easy to learn, even as first programming language
  • only one way of doing things. not 20. Eg: onle list type, instead of 20
  • pretty easy to implement

Data types

One of:

  • t -> t: pure function taking t as argument, and returning t
  • { field1: xt, field2: yt }: non-nominal records
  • 'Case1 t | 'Case2 t2 | 'Case3 t3: tagged unions
  • with myGenericName: x: introduce "generic" type myGenericName, which can be used in type x
  • &a 'EOL | 'Cons {v: t, next: a}: recursive type
  • ?Identifier: the partial type Identifier
  • 'Case1 t1 | 'Case2 t2 | ...?Identifier: partial tagged union type, Identifier
  • {field1: xt, field2: yt, ...?Identifier}: partial non-nominal record type, Identifier

Generic types

Means that the structure of that value is not available at compile time. This implies that you can dispatch on those values. This means that you can not write a magic List.toString function, because you would have to dispatch on toString, so you have to do something like this:

def List.toString =
  toStr: (t -> Char List) ->
  list: t List ->
  ...

This is also the case for equality, and List.contains:

def List.any =
  fn: (t -> Bool) ->
  list: t List ->
  ...

List.any(x -> x = y, list)

Recursive types

Recursive / self-referential types

These are not allowed to have an infinite size, for example these are illegal:

type Illegal1 = &a a

type Illegal2 = &a {x: a, y: Int8}

The definition of "infinite size" is:

  • for records: if any of the field types has "infinite size"
  • for unions: if all of the cases has "infinite size"
  • for functions: never
  • for recursive types:
    • if the type was already visited: yes
    • otherwise: if the inner type has "infinite size"

Partial types

These are types where the full definition is somewhere else (eg another compilation unit)

This type naturally adds limitations:

  • for partial non-nominal records: these can not be constructed with the record construct expression
  • for partial unions: they can not be pattern matched exhaustively

these limitations are removed, if the full definition is in the current scope

Example

Compilation unit 1:

type ?Uint8 = {field:type}
type Uint8 = {...?Uint8}

# this compilation does not have the limitations of partial types, because the definition for `?Uint8` is available

Compilation unit 2:

type Uint8 = {...?Uint8}

Compatible-with

Check if type is compatible-with / assignable-to type requirements.

  • a type is compatible with itself ofcourse
  • a tagged union is compatible with anothr, if the other one contains at least all cases from this union, and those cases are compatible
  • a non-nominal record is compatbile with anohter, if the other one contains a subset of the fields of this one, and the field types are comptible.
  • a type is compatible with a partial type ?Ty, if the definition of the partial type is in the current scope, and the type is compatible with the type inside the definition
  • partial records and unions are similar to the above.

TODO: need more details?

Phi unification

This process is performed on the resulting types of two merging branches

Tries, in order:

  • if one of the types is a tagged union, which contains only one case with an identical type, and the other is not a tagged union, it will "attach" that case onto the other type, like this: 'N Num | 'I Int and Num -> 'N Num | 'I Int
  • if one of the types is a tagged union with exactly one case, and the other one is not an union, it will put that tag onto it, and phi-merge the inner types. example: 'N Num and Num -> 'N Num
  • if both types are tagged unions, the result is an tagged union contain the csaes of both tagged unions. union cases that are in both branches, will have their types phi-unified too. example: 'Ok ('N Num | 'I Int) | 'SomeErr and 'Ok Num | 'AnotherErr -> 'Ok ('N Num) | 'SomeErr | 'AnotherErr
  • if both types are non-nominal records, the result will contain only fields that are present in both types. The types of the fields will get phi-unified too example: {a:Num,b:Num,c:Num} and {a:Num,c:Num,e:Num} -> {a:Num, e:Num}

"Platform library"

Depends on on the core library.

Platform / language implementation / target depedent.

Signature
############  Primitive Data types  ############
type Uint8

def Uint8.wrapping_add : Uint8 -> Uint8 -> Uint8
def Uint8.bitwise_not  : Uint8 -> Uint8
def Uint8.less_than    : Uint8 -> Uint8 -> Bool
def Uint8.bits         : Uint8 -> {7:Bool,6:Bool,5:Bool,4:Bool,3:Bool,2:Bool,1:Bool,0:Bool}

"Core library"

Depend on the platform library.

Completely platform, language implementation, and target independent.

Signature
type Unit = {}


#########  Primitive storage-only types   ##########
type Int8   = Uint8
type Int16  = {hi:Int8,  lo:Int8}
type Int32  = {hi:Int16, lo:Int16}
type Int64  = {hi:Int32, lo:Int32}
type Int128 = {hi:Int64, lo:Int64}



############  Bool   ############
type Bool = 'True | 'False
def Bool.not : Bool -> Bool
def Bool.and : Bool -> Bool -> Bool
def Bool.or : Bool -> Bool -> Bool
def Bool.nor : Bool -> Bool -> Bool
def Bool.xor : Bool -> Bool -> Bool


############  Num   ############
## arbitrary-precision signed decimal number
type Num

def Num.toStr : Num -> Char List

# consider using locale aware, and space ignoring parsing instead
# parses of format 123  /  -123  /  123.456
def Num.parseLit : Char List -> Num Option

def Num.zero : Num
def Num.one  : Num

def `a+b` : Num -> Num -> Num
def `a-b` : Num -> Num -> Num
def `a*b` : Num -> Num -> Num
def `a/b` : Num -> Num -> Num

def Num.neg : Num -> Num


############  Char  ############
# one unicode codepoint
type Char
def Char.toAscii    : Char  -> Uint8 Option
def Char.fromAscii  : Uint8 -> Char Option
def Char.encodeUtf8 : Char  -> Uint8 List
def Char.decodeUtf8 : Uint8 List  -> 'Cons {v: Char, next: Uint8 List}


############  String  ############
def String.encodeUtf8 : Char List -> Utf8 List
def String.decodeUtf8 : Uint8 List -> 'Ok Char List | 'Err {at: Num}


############  t List  ############
## A generic linked-list type
type t List = 'EOL | 'Cons {v: t, next: t List}

## List concatenate
def `a++b` : t List -> t List -> t List

## Get nth element
def List.nth : t List -> Num -> t Option


############  t Option  ############
type t Option = 'None | 'Some t
Reference Implementation

TODO

"Standard library"

Depend on the platform library and core library.

Responsible for IO

Signature
############  IO  ############
type t IO
def IO.just : t -> t IO
def IO.chain : t IO -> (t -> r IO) -> r IO


############  Stream  ############
# represents a non-seekable t-stream, for example a tcp stream
type t io.Stream

TODO: networking api: stream and datagram protocols


############  Path  ############
# paths are not absolute, but rather relative to any reference point 
type io.Path = {parent: io.Path Option, ...?io.Path}
def io.Path.eq    : io.Path -> io.Path -> Bool
def io.Path.root  : io.Path -> io.Path
def io.Path.child : io.Path -> Char List -> io.Path Option

TODO: links, permissions, ... etc


############  File  ############
type io.ReadFile
type io.WriteFile

type t io.FileOpenResult = 'Ok t | 'CantReopenThis | 'FileNotFound | 'PermissionError | 'OtherErr
def io.file.open.r  : io.Path -> io.ReadFile  io.FileOpenResult
def io.file.open.w  : io.Path -> io.WriteFile io.FileOpenResult
def io.file.open.rw : io.Path -> {r: io.ReadFile, w: io.WriteFile} io.FileOpenResult

TODO: read, write, extend, truncate
Implementation snippets

TODO: example definition of IO

(top-level) declarations

TODO

TODO: add either attribute system, or comptime exec

Mostly forward type inference

Exception 1:

def List.map : t List -> (t -> r) -> r List

# doesn't require specifying types in lambda: simple one-step backwards type inference
List.map([1,2,3], x -> x * 2)

Exception 2:

def add : Num -> Num -> Num =
  a -> b -> a + b
# ^^^^^^
# does not require specification of those types, because already specified in function signature

# this also applies to:

def add : Num -> Num -> Num
def add = a -> b -> a + b

Expressions

  • expr:field access field from non-nominal record
  • let varName = value exprUsingTheVar
  • 'a' unicode codepoint literal, aka "char" literal. returns a Char
  • [1,2,3] list literal: creates a t List, where t is equal to the type of all the wrapped expressions
  • "Hello\\nworld" string literal with escape characters. same behaviour as list literal of the chars
  • 12.345, or 12, or -12: same behaviour as Num.parseLit on the value (as string)
  • arg -> value: one-step backwards inferrable lambda
  • arg: Type -> value
  • func(arg1, arg2): function application. requires at least one argument. partial function applications are allowed too
  • expr :: type: down-cast type
  • recExpr with fieldname: newFieldValue: overwrites or adds a field to a record type. type checking: identical to recExpr and {fieldname: newFieldValue}
  • recExpr and otherRecExpr: "sum" fields together of both record expressions. type checking: phi-unify recExpr with otherRecExpr, and require that both are non-nominal record types
  • if cond then a else b
  • {field1: val1, field2: val2} field construction
  • match expr with <match cases>: pattern matching
  • a = b the equality operator
  • a + b: identical to \a+b`(a, b)` name: "sum"
  • a - b: identical to \a-b`(a, b)` name: "difference"
  • a * b: identical to \a*b`(a, b)` name: "times"
  • a / b: identical to \a/b`(a, b)` name: "over"
  • a ++ b: identical to \a++b`(a,b)` name: "list concatenate"
  • a => b: identical to \a=>b`(a,b)` name: "lens compose"

pattern matching

TODO

equality operator

The only operator with type overloading

TODO

(for users) programming patterns

labelled arguments

def List.remove_prefix(prefix: t List, list: 'from t List)

List.remove_prefix([1,2], 'from [1,2,3,4])

# but this also works most of the times:
List.remove_prefix([1,2], [1,2,3,4])

promising a list of at least one element

Returning a 'Cons {v: Int, next: Int List} indicates that the list contains at least one element. And this is even assignable to Int List

(for implementors) recommendations

Don't do deep-copies of type objects

types could have hundreds of enum cases

Store original type alias names in types, as hints

To give more information to the user in error messages

Which of these is more readable and user friendly...

  • &a {v: 'None | 'Some {high:Int16,low:Int16}, next: a}
  • Int32 Option List

Perform fuzzy match to give suggestions for correcting errors

Non-nominal record types have the issue of typos propagating really far trough the code, making them hard to debug.

Give fuzzy-matched suggestions for at the following things:

  • function names
  • local names
  • non-nominal record fields
  • union tags

Store as much debug in fromation as possible until type checking is done

Syntax

Note that the order of syntax cases matter!

Identifier = { (* ( a-z | A-Z | 0-9 | _ | - | . | # )    TODO: cvt to ebnf *) };
Partial Type = '?', Identifier;
Union Type Case = '\'', Identifier, [ Type (*default to Unit*) ];
Union Type = Union Type Case, { '|', Union Type Case};
Record Type Field = Identifier, ':',  Type;
Record Type = '{', [Record Type Field, {',' Record Type Field }], [ '...', Partial Type ], '}';
Type = Partial Type | Identifier | Union Type | Record Type | TODO;

OLD SPECIFICATION STARTING HERE

automatic return types

def add(a: Num, b: Num) -> _ {
  a + b
}

pattern matching

type Option[t] = 'None | 'Some t

def [t] Match.`a++b`(
  # matching against this value
  value: List[t],
  # left hand side of operator
  l: List[t],
  # right hand side of operator
  r: MatchUtil.Var[List[t]]
) -> Option[{ r: List[t] }] {
  match List.remove_prefix(l, 'from value) {
    'Some rem -> 'Some { r: rem }
    'None -> 'None
  }
}

then you can do:

type Token = 'Public | 'Private | 'Err
def example(li: List[Char]) -> {t:Token,rem:List[Char]} {
  match li {
    "public" ++ rem  -> {t: 'Public Unit, rem:rem}
    "private" ++ rem -> {t: 'Private Unit, rem:rem}
    _ -> {t: 'Err Unit, rem: li}
  }
}

recursive data types

type List[t] = 'End | 'Cons {head:t, tail:List[t]}
# now you might notice an issue with this
# `type` defines non-distinct type alisases
# so what is the type of this...
# Introducing: type self references
# the above example is the same as this:
type List[t] = &a ('End | 'Cons {head:t, tail:a})

# example 2:
# a List[List[t]] is just:
&b ('End | 'Cons {head: &a ('End | 'Cons {head:t, tail:a}), tail: b})

Infinitely sized types are not allowed:

&a {x:Num, y:a}

However, infinite types without size are allowed:

&a {x:a}

This is not allowed:

&a a

module system

Each file is a "compilation unit"

When compiling a compilation unit, the following inputs have to be provided:

  • any amount of files containing signatures of exported definitions. only definitions of the compilation unit that are in one of the signature files will get exported.
  • any amount of other files containing imported definitions

Note that there is no practical difference between signature and source files.

Example

Export signature file List.li:

type List[t] = 'End | 'Cons {head:t, tail:List[t]}

# not providing a function body makes it a function signature definition
def [t] `a++b`(a: List[t], b: List[t]) -> List[t]

def [t] Match.`a++b`(
  value: List[t],
  l: List[t],
  r: MatchUtil.Var[List[t]]
) -> Option[{ r: List[t] }]

Import signature file Option.li:

type Option[t] = 'None | 'Some t

Compilation unit List.lu:

def [t] `a++b`(a: List[t], b: List[t]) -> List[t] {
  # ...
}

def [t] Match.`a++b`(
  value: List[t],
  l: List[t],
  r: MatchUtil.Var[List[t]]
) -> Option[{ r: List[t] }] {
  # ...
}

Notes

Each compilation unit gets compiled to implementation-specific bytecode.

Templated functions can only be compiled partially during a compilation unit. This will impact compile speeds.

Avoid templated functions wherever possible.

Hide record fields in module signatures

Signatue:

type User = {name: List[Char], ...}
  # the ... is used to indicate that this is a partial type definition
  # User, as given here, can not be constructed, but name can be accessed

Compilation unit:

type User = {name: List[Char], password: List[Char]}

Hide union variants in module signatures

Signature:

type DType = 'Int | 'UInt | 'Byte | ...
  # users of this can never do exhaustive pattern matching on this

Compilation unit:

type DType = 'Int | 'UInt | 'Byte
  # this compilation unit can actually do exhaustive pattern matching on this

Note on hidden union variants / record fields

To make these work, the following is legal:

def example() -> 'Int | 'UInt | ...

def example() -> 'Int | 'UInt | 'Byte | 'Char {
  # ...
}

Extensible unions

extensible union Plan

extend Plan with 'ReadlnPlan Unit
extend Plan with 'WritelnPlan Unit

# pattern matching against these is always non-exhaustive.
# can only pattern match with the imported extensions 

Any type

in the stdlib:

extensible union Any

type Any.LambdaCalc = 'Apply {fn: Any.LambdaCalc, arg: Any.LambdaCalc}
                    | 'Scope {idx: Uint}
                    | 'Abstr {inner: Any.LambdaCalc}
def Any.toLambda(a: Any) -> Any.LambdaCalc

It gets automatically extended with every type ever used.

Lenses

In the stdlib:

type Lens[t,f] = {get: t -> f, set: t -> f -> t}

def [a,b,c] `a=>b`(x: Lens[a,b], y: Lens[b,c]) -> Lens[a,c] {
  {
    get: t:a -> y:get(x:get(t)),
    set: t:a -> f:c -> x:set(t, y:set(x:get(t), f))
  }
}

Since a:f1:f2 is the field access syntax, the lens creation syntax is similar: &Type:field1:field2

So you can do:

type Header = {text: String, x: Num}
type Meta = {header: Header, name: String}

&Header:header:text (myHeader, "new meta:header:text value")
# which is identical to:
(&Header:header => &Meta:text) (myHeader, "new meta:header:text value")

However, it is cleaner to use with:

myHeader with header:text: "new meta:header:text value"

Pure IO

the IO[t] type contains IO "plans" which will be executed by the runtime, if they are returned by the main function.

def main() -> IO[Unit] {
  print("hey")    # warning: result (of type IO[Unit]) was ignored. expression can be removed 
  print("hello")
}
# this will only print "hello"

To make the above example print both "hey" and "hello", we need to chain the two IO types:

def main() -> IO[Unit] {
  await _ = print("hey")
  await print("hello")
}

# or just remove the _
def main() -> IO[Unit] {
  await print("hey")
  await print("hello")
}

# if you don't put in the second await:
def main() -> IO[Unit] {
  await print("hey")
  print("hello")
  # error: expected IO[Unit], got IO[IO[Unit]]; did you forget an `await`?
}

await is kinda weird. here is the syntax:

# ( await x: a = ( expr1::IO[a] ) await (expr2::IO[b]) ) :: IO[b]
expr |= 'await', (identifier, '='), expr, 'await', expr

# ( await x: a = ( expr1::IO[a] ) ( expr2::b ) ) :: IO[b]
expr |= 'await', (identifier, '='), expr, expr

# ( await (expr1::a) ) :: a
expr |= 'await', expr

Pure IO implementation

Something like this is done in the stdlib:

extensible union IO.Plan[r]
type IO[t] = 'Just {value: t}
           | 'Map template r: {of: IO[r], map: r -> IO[t]}
           | 'More template r: {plan: IO.Plan[r], then: r -> IO[t]}

def [a,b] `await a (a->b)`(io: IO[a], then: a -> b) -> IO[b] {
  'Map {of: io, map: r -> 'Just {value: then(r)}}
}

def [a,b] `await a (a->await b)`(io: IO[a], then: a -> IO[b]) -> IO[b] {
  'Map {of: io, map: r -> then(r)}
}

# in stdio:
extend IO.Plan[Uint8] with 'stdio.ReadByte {stream: Int32}
def stdio.getchar : IO[Uint8] = 'More {plan: 'stdio.ReadByte {stream: 0}, then: by -> 'Just by[0]}

def main() -> IO[Unit]

the runtime does something like this:

def [a] RUNTIME_EVAL(io: IO[a]) -> a {
  match io {
    'Just {value} -> value
    'Map {of, map} -> RUNTIME_EVAL(map(RUNTIME_EVAL(of)))
    'More {plan, then, finally} -> RUNTIME_EVAL(then(match plan {
      'ReadStream {stream} -> 'ReadStreamIOResult {data: impure perform the io here lol}
      _ -> impure error here "this runtime doesn't support this kind of IO" or sth
    }))
  }
}

def RUNTIME_ENTRY() {
  RUNTIME_EVAL ( main() )
}