Files
beginner-friendly-lang/spec.md
2025-09-21 22:14:29 +02:00

665 lines
18 KiB
Markdown

# Contribution Guidelines
- explain why design decisions are taken
- put in `TODO`s into the specification for things that should be re-visited in the future
TODO: t Await type wtih special unify
# Goals
- functional programming: everything is pure, and only a function
- easy to learn, even as first programming language
- only one way of doing things. not 20. Eg: onle list type, instead of 20
- pretty easy to implement
# Data types
One of:
- `t -> t`: pure function taking `t` as argument, and returning `t`
- `{ field1: xt, field2: yt }`: non-nominal records
- `'Case1 t | 'Case2 t2 | 'Case3 t3`: tagged unions
- `with myGenericName: x`: introduce "generic" type `myGenericName`, which can be used in type `x`
- `&a 'EOL | 'Cons {v: t, next: a}`: [recursive type](##recursive-types)
- `?Identifier`: the partial type `Identifier`
- `'Case1 t1 | 'Case2 t2 | ...?Identifier`: partial tagged union type, `Identifier`
- `{field1: xt, field2: yt, ...?Identifier}`: partial non-nominal record type, `Identifier`
## Generic types
Means that the structure of that value is not available at compile time.
This implies that you can dispatch on those values.
This means that you can not write a magic `List.toString` function, because you would have to dispatch on `toString`,
so you have to do something like this:
```
def List.toString =
toStr: (t -> Char List) ->
list: t List ->
...
```
This is also the case for equality, and `List.contains`:
```
def List.any =
fn: (t -> Bool) ->
list: t List ->
...
List.any(x -> x = y, list)
```
## Recursive types
Recursive / self-referential types
These are not allowed to have an infinite size, for example these are illegal:
```
type Illegal1 = &a a
type Illegal2 = &a {x: a, y: Int8}
```
The definition of "infinite size" is:
- for records: if _any_ of the field types has "infinite size"
- for unions: if _all_ of the cases has "infinite size"
- for functions: never
- for recursive types:
- if the type was already visited: yes
- otherwise: if the inner type has "infinite size"
## Partial types
These are types where the full definition is somewhere else (eg another compilation unit)
This type naturally adds limitations:
- for partial non-nominal records: these can not be constructed with the record construct expression
- for partial unions: they can not be pattern matched exhaustively
these limitations are removed, if the full definition is in the current scope
<details>
<summary>Example</summary>
Compilation unit 1:
```
type ?Uint8 = {field:type}
type Uint8 = {...?Uint8}
# this compilation does not have the limitations of partial types, because the definition for `?Uint8` is available
```
Compilation unit 2:
```
type Uint8 = {...?Uint8}
```
</details>
## Extensible unions
The difference between these and partial unions is that many compilation units can extend one extensible union.
This means that no compilation unit can do exhaustive pattern matching
```
extensible union Plan
extend Plan with 'ReadlnPlan Unit
extend Plan with 'WritelnPlan Unit
# pattern matching against these is always non-exhaustive.
# can only pattern match against the imported extensions
```
## Compatible-with
Check if type is compatible-with / assignable-to type requirements.
- a type is compatible with itself ofcourse
- a tagged union is compatible with anothr,
if the other one contains at least all cases from this union,
and those cases are compatible
- a non-nominal record is compatbile with anohter,
if the other one contains a _subset_ of the fields of this one,
and the field types are comptible.
- a type is compatible with a partial type `?Ty`,
if the definition of the partial type is in the current scope,
and the type is compatible with the type inside the definition
- partial records and unions are similar to the above.
TODO: need more details?
## Phi unification
This process is performed on the resulting types of two merging branches
Tries, in order:
- if one of the types is a tagged union,
which contains only one case with an identical type,
and the other is not a tagged union,
it will "attach" that case onto the other type, like this:
`'N Num | 'I Int` and `Num` -> `'N Num | 'I Int`
- if one of the types is a tagged union with exactly one case,
and the other one is not an union,
it will put that tag onto it, and phi-merge the inner types.
example: `'N Num` and `Num` -> `'N Num`
- if both types are tagged unions,
the result is an tagged union contain the csaes of both tagged unions.
union cases that are in both branches, will have their types phi-unified too.
example: `'Ok ('N Num | 'I Int) | 'SomeErr` and `'Ok Num | 'AnotherErr` -> `'Ok ('N Num) | 'SomeErr | 'AnotherErr`
- if both types are non-nominal records,
the result will contain only fields that are present in both types.
The types of the fields will get phi-unified too
example: `{a:Num,b:Num,c:Num}` and `{a:Num,c:Num,e:Num}` -> `{a:Num, e:Num}`
# "Platform library"
Depends on on the [core library](#core-library).
Platform / language implementation / target depedent.
<details>
<summary>Signature</summary>
```
############ Primitive Data types ############
type Uint8
def Uint8.wrapping_add : Uint8 -> Uint8 -> Uint8
def Uint8.bitwise_not : Uint8 -> Uint8
def Uint8.less_than : Uint8 -> Uint8 -> Bool
def Uint8.bits : Uint8 -> {7:Bool,6:Bool,5:Bool,4:Bool,3:Bool,2:Bool,1:Bool,0:Bool}
```
</details>
# "Core library"
Depend on the [platform library](#platform-library).
Completely platform, language implementation, and target independent.
<details>
<summary>Signature</summary>
```
type Unit = {}
######### Primitive storage-only types ##########
type Int8 = Uint8
type Int16 = {hi:Int8, lo:Int8}
type Int32 = {hi:Int16, lo:Int16}
type Int64 = {hi:Int32, lo:Int32}
type Int128 = {hi:Int64, lo:Int64}
############ Bool ############
type Bool = 'True | 'False
def Bool.not : Bool -> Bool
def Bool.and : Bool -> Bool -> Bool
def Bool.or : Bool -> Bool -> Bool
def Bool.nor : Bool -> Bool -> Bool
def Bool.xor : Bool -> Bool -> Bool
############ Num ############
## arbitrary-precision signed decimal number
type Num
def Num.toStr : Num -> Char List
# consider using locale aware, and space ignoring parsing instead
# parses of format 123 / -123 / 123.456
def Num.parseLit : Char List -> Num Option
def Num.zero : Num
def Num.one : Num
def `a+b` : Num -> Num -> Num
def `a-b` : Num -> Num -> Num
def `a*b` : Num -> Num -> Num
def `a/b` : Num -> Num -> Num
def Num.neg : Num -> Num
############ Char ############
# one unicode codepoint
type Char
def Char.toAscii : Char -> Uint8 Option
def Char.fromAscii : Uint8 -> Char Option
def Char.encodeUtf8 : Char -> Uint8 List
def Char.decodeUtf8 : Uint8 List -> 'Cons {v: Char, next: Uint8 List}
############ String ############
def String.encodeUtf8 : Char List -> Utf8 List
def String.decodeUtf8 : Uint8 List -> 'Ok Char List | 'Err {at: Num}
############ t List ############
## A generic linked-list type
type t List = 'EOL | 'Cons {v: t, next: t List}
## List concatenate
def `a++b` : t List -> t List -> t List
## Get nth element
def List.nth : t List -> Num -> t Option
############ t Option ############
type t Option = 'None | 'Some t
```
</details>
<details>
<summary>Reference Implementation</summary>
TODO
</details>
# "Standard library"
Depend on the [platform library](#platform-library) and [core library](#core-library).
Responsible for IO
<details>
<summary>Signature</summary>
```
############ IO ############
type t IO
def IO.just : t -> t IO
def IO.chain : t IO -> (t -> r IO) -> r IO
############ Stream ############
# represents a non-seekable t-stream, for example a tcp stream
type t io.Stream
TODO: networking api: stream and datagram protocols
############ Path ############
# paths are not absolute, but rather relative to any reference point
type io.Path = {parent: io.Path Option, ...?io.Path}
def io.Path.eq : io.Path -> io.Path -> Bool
def io.Path.root : io.Path -> io.Path
def io.Path.child : io.Path -> Char List -> io.Path Option
TODO: links, permissions, ... etc
############ File ############
type io.ReadFile
type io.WriteFile
type t io.FileOpenResult = 'Ok t | 'CantReopenThis | 'FileNotFound | 'PermissionError | 'OtherErr
def io.file.open.r : io.Path -> io.ReadFile io.FileOpenResult
def io.file.open.w : io.Path -> io.WriteFile io.FileOpenResult
def io.file.open.rw : io.Path -> {r: io.ReadFile, w: io.WriteFile} io.FileOpenResult
TODO: read, write, extend, truncate
```
</details>
<details>
<summary>Implementation snippets</summary>
TODO: example definition of IO
</details>
# (top-level) declarations
TODO
TODO: add either attribute system, or comptime exec
# Mostly forward type inference
Exception 1:
```
def List.map : t List -> (t -> r) -> r List
# doesn't require specifying types in lambda: simple one-step backwards type inference
List.map([1,2,3], x -> x * 2)
```
Exception 2:
```
def add : Num -> Num -> Num =
a -> b -> a + b
# ^^^^^^
# does not require specification of those types, because already specified in function signature
# this also applies to:
def add : Num -> Num -> Num
def add = a -> b -> a + b
```
# Expressions
- `expr:field` access field from non-nominal record
- `let varName = value exprUsingTheVar`
- `'a'` unicode codepoint literal, aka "char" literal. returns a `Char`
- `[1,2,3]` list literal: creates a `t List`, where `t` is equal to the type of all the wrapped expressions
- `"Hello\\nworld"` string literal with escape characters. same behaviour as list literal of the chars
- `12.345`, or `12`, or `-12`: same behaviour as `Num.parseLit` on the value (as string)
- `arg -> value`: one-step backwards inferrable lambda
- `arg: Type -> value`
- `func(arg1, arg2)`: function application. requires at least one argument. partial function applications are allowed too
- `func(_, arg)`: bind some args from function, return funciton taking in the args with `_`, or the non-specified ones
- `expr :: type`: down-cast type
- `recExpr and otherRecExpr`:
if both types are record types, or tagged record types:
- "sum" fields together of both record expressions.
type checking: phi-unify `recExpr` with `otherRecExpr`, and require that both are non-nominal record types
- else: identical to `\`a and b\`(a, b)`
- `if cond then a else b`
- `{field1: val1, field2: val2}` field construction
- `match expr with <match cases>`: [pattern matching](##pattern-matching)
- `'Tag expr`
- `a = b` the [equality operator](##equality-operator)
- `-a`: identical to `\`-a\`(a)` name: "negate"
- `a or b`: identical to `\`a or b\`(a, b)` name: "or"
- `a ^ b`: identical to `\`a^b\`(a, b)` name: "exponent"
- `a + b`: identical to `\`a+b\`(a, b)` name: "sum"
- `a - b`: identical to `\`a-b\`(a, b)` name: "difference"
- `a * b`: identical to `\`a*b\`(a, b)` name: "times"
- `a / b`: identical to `\`a/b\`(a, b)` name: "over"
- `a ++ b`: identical to `\`a++b\`(a,b)` name: "list concatenate"
- `a => b`: identical to `\`a=>b\`(a,b)` name: "lens compose"
## pattern matching
TODO
## equality operator
The only operator with type overloading
Compares any two types that can be [phi unified](#phi-unification), returning
either `'True Unit` `'False Unit` (which maps to `Bool`)
# (for users) programming patterns
## labelled arguments
```
def List.remove_prefix(prefix: t List, list: 'from t List)
List.remove_prefix([1,2], 'from [1,2,3,4])
# but this also works most of the times:
List.remove_prefix([1,2], [1,2,3,4])
```
## promising a list of at least one element
Returning a `'Cons {v: Int, next: Int List}`
indicates that the list contains at least one element.
And this is even assignable to `Int List`
# (for implementors) recommendations
## Don't do deep-copies of type objects
types could have hundreds of enum cases
## Store original type alias names in types, as hints
To give more information to the user in error messages
Which of these is more readable and user friendly...
- `&a {v: 'None | 'Some {high:Int16,low:Int16}, next: a}`
- `Int32 Option List`
## Perform fuzzy match to give suggestions for correcting errors
Non-nominal record types have the issue of typos propagating really far trough the code,
making them hard to debug.
Give fuzzy-matched suggestions for at the following things:
- function names
- local names
- non-nominal record fields
- union tags
## Store as much debug in fromation as possible until type checking is done
# Compilation units
Each file is a "compilation unit"
When compiling a compilation unit, the following inputs have to be provided:
- any amount of files containing signatures of exported definitions.
only definitions of the compilation unit that are in one of the signature files will get exported.
- any amount of other files containing imported definitions
Note that there is no practical difference between signature and source files.
<details>
<summary>Example</summary>
Export signature file `List.li`:
```
type List[t] = 'End | 'Cons {head:t, tail:List[t]}
# not providing a function body makes it a function signature definition
def [t] `a++b`(a: List[t], b: List[t]) -> List[t]
```
Import signature file `Option.li`:
```
type Option[t] = 'None | 'Some t
```
Compilation unit `List.lu`:
```
def [t] `a++b`(a: List[t], b: List[t]) -> List[t] {
# ...
}
```
</details>
# Syntax
Note that the order of syntax cases matter!
```
Identifier = { (* ( a-z | A-Z | 0-9 | _ | - | . | # ) TODO: cvt to ebnf *) };
Partial Type = '?', Identifier;
Union Type Case = '\'', Identifier, [ Type (*default to Unit*) ];
Union Type = Union Type Case, { '|', Union Type Case};
Record Type Field = Identifier, ':', Type;
Record Type = '{', [Record Type Field, {',' Record Type Field }], [ '...', Partial Type ], '}';
With Type = 'with', Identifier, {',', Identifier}, ':', Type;
Recursive Type = '&', Identifier, Type;
Nested Type = '(', Type, ')';
Multi Arg Type = '[', Type, {',', Type} ']', Type;
Auto Type = '_'; (* this only works as return type of signatures of func def, that also contain the impl *)
Type Part = Auto Type | Identifier | Partial Type | Union Type | Record Type | With Type | Recursive Type | Nested Type;
Type = {Type Part (* args *)}, Type Part;
```
# OLD SPECIFICATION STARTING HERE
## pattern matching
```
type Option[t] = 'None | 'Some t
def [t] Match.`a++b`(
# matching against this value
value: List[t],
# left hand side of operator
l: List[t],
# right hand side of operator
r: MatchUtil.Var[List[t]]
) -> Option[{ r: List[t] }] {
match List.remove_prefix(l, 'from value) {
'Some rem -> 'Some { r: rem }
'None -> 'None
}
}
```
then you can do:
```
type Token = 'Public | 'Private | 'Err
def example(li: List[Char]) -> {t:Token,rem:List[Char]} {
match li {
"public" ++ rem -> {t: 'Public Unit, rem:rem}
"private" ++ rem -> {t: 'Private Unit, rem:rem}
_ -> {t: 'Err Unit, rem: li}
}
}
```
## Lenses
In the stdlib:
```
type Lens[t,f] = {get: t -> f, set: t -> f -> t}
def [a,b,c] `a=>b`(x: Lens[a,b], y: Lens[b,c]) -> Lens[a,c] {
{
get: t:a -> y:get(x:get(t)),
set: t:a -> f:c -> x:set(t, y:set(x:get(t), f))
}
}
```
Since `a:f1:f2` is the field access syntax, the lens creation syntax is similar: `&Type:field1:field2`
So you can do:
```
type Header = {text: String, x: Num}
type Meta = {header: Header, name: String}
&Header:header:text (myHeader, "new meta:header:text value")
# which is identical to:
(&Header:header => &Meta:text) (myHeader, "new meta:header:text value")
```
However, it is cleaner to use `with`:
```
myHeader with header:text: "new meta:header:text value"
```
## Pure IO
the `IO[t]` type contains IO "plans" which will be executed by the runtime, if they are returned by the main function.
```
def main() -> IO[Unit] {
print("hey") # warning: result (of type IO[Unit]) was ignored. expression can be removed
print("hello")
}
# this will only print "hello"
```
To make the above example print both "hey" and "hello", we need to chain the two IO types:
```
def main() -> IO[Unit] {
await _ = print("hey")
await print("hello")
}
# or just remove the _
def main() -> IO[Unit] {
await print("hey")
await print("hello")
}
# if you don't put in the second await:
def main() -> IO[Unit] {
await print("hey")
print("hello")
# error: expected IO[Unit], got IO[IO[Unit]]; did you forget an `await`?
}
```
await is kinda weird. here is the syntax:
```
# ( await x: a = ( expr1::IO[a] ) await (expr2::IO[b]) ) :: IO[b]
expr |= 'await', (identifier, '='), expr, 'await', expr
# ( await x: a = ( expr1::IO[a] ) ( expr2::b ) ) :: IO[b]
expr |= 'await', (identifier, '='), expr, expr
# ( await (expr1::a) ) :: a
expr |= 'await', expr
```
## Pure IO implementation
Something like this is done in the stdlib:
```
extensible union IO.Plan[r]
type IO[t] = 'Just {value: t}
| 'Map template r: {of: IO[r], map: r -> IO[t]}
| 'More template r: {plan: IO.Plan[r], then: r -> IO[t]}
def [a,b] `await a (a->b)`(io: IO[a], then: a -> b) -> IO[b] {
'Map {of: io, map: r -> 'Just {value: then(r)}}
}
def [a,b] `await a (a->await b)`(io: IO[a], then: a -> IO[b]) -> IO[b] {
'Map {of: io, map: r -> then(r)}
}
# in stdio:
extend IO.Plan[Uint8] with 'stdio.ReadByte {stream: Int32}
def stdio.getchar : IO[Uint8] = 'More {plan: 'stdio.ReadByte {stream: 0}, then: by -> 'Just by[0]}
def main() -> IO[Unit]
```
the runtime does something like this:
```
def [a] RUNTIME_EVAL(io: IO[a]) -> a {
match io {
'Just {value} -> value
'Map {of, map} -> RUNTIME_EVAL(map(RUNTIME_EVAL(of)))
'More {plan, then, finally} -> RUNTIME_EVAL(then(match plan {
'ReadStream {stream} -> 'ReadStreamIOResult {data: impure perform the io here lol}
_ -> impure error here "this runtime doesn't support this kind of IO" or sth
}))
}
}
def RUNTIME_ENTRY() {
RUNTIME_EVAL ( main() )
}
```