Specification

NOTE: This is still very much a work in progress and does not necessarily describe the language as implemented in the GitHub repo.

Lexical structure

Valid Yo source code is written in ASCII. Some UTF-8 codepoints will probably work in identifiers and string literals, but there’s no proper handling for characters outside the ASCII character set.

Comments

There are two kinds of comments:

Tokens

The Yo lexer differentiates between the following kinds of tokens: keywords, identifiers, punctuation and literals.

Keywords

Yo reserves the following keywords:

defer else fn for if impl in let mut match return struct use var while

Identifiers

An identifier is a sequence of one or more letters or digits, where the underscore counts as a letter. The first character must not be a digit.

digit  = [0-9]
letter = [a-zA-Z_]
ident  = <letter> (<letter>|<digit>)*

A sequence of characters that satisfies the ident pattern above and is not a reserved keyword is assumed to be an identifier.
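The keyword and identifier rules above can be modelled in a few lines. This is a minimal Python sketch, not the Yo lexer's actual code; the function name `classify_word` is hypothetical, and only the keyword list and the `ident` pattern are taken from this document.

```python
import re

# Keywords exactly as listed in the "Keywords" section.
KEYWORDS = frozenset(
    "defer else fn for if impl in let mut match return struct use var while".split()
)

# ident = <letter> (<letter>|<digit>)*, with letter = [a-zA-Z_] and digit = [0-9].
IDENT_RE = re.compile(r"[a-zA-Z_][a-zA-Z_0-9]*")


def classify_word(word):
    """Return 'keyword', 'identifier', or 'invalid' for a whole word."""
    if IDENT_RE.fullmatch(word) is None:
        return "invalid"
    # A word that matches the ident pattern is an identifier unless reserved.
    return "keyword" if word in KEYWORDS else "identifier"
```

For example, `classify_word("while")` yields `"keyword"`, `classify_word("_x1")` yields `"identifier"`, and `classify_word("1abc")` yields `"invalid"` because the first character is a digit.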

Operators and punctuation

The following character sequences represent operators and punctuation:

+    &    &&    ==    |>    (    )
-    |    ||    !=    =     {    }
*    ^          <     !     [    ]
/    <<         <=          .    ;
%    >>         >           ,    :
                >=

Literals

Integer literals

An integer literal is a sequence of digits. Depending on the prefix, the literal is interpreted as base 2, 8, 10 or 16.

Prefix   Base
0b       binary
0o       octal
0x       hexadecimal
none     decimal

binary_literal   =  0b[01]+       // base 2
octal_literal    =  0o[0-7]+      // base 8
decimal_literal  =  [0-9]+        // base 10
hex_literal      =  0x[0-9a-f]+   // base 16
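As an illustration of how the four patterns map to values, here is a small Python sketch. The helper `parse_int_literal` is hypothetical and follows the grammar literally, so uppercase hex digits such as `0xFF` are rejected because the pattern above only lists `a-f`; whether the real lexer accepts them is not stated here.

```python
import re

# One (pattern, base) pair per row of the grammar above.
_INT_PATTERNS = (
    (re.compile(r"0b([01]+)"), 2),
    (re.compile(r"0o([0-7]+)"), 8),
    (re.compile(r"0x([0-9a-f]+)"), 16),
    (re.compile(r"([0-9]+)"), 10),
)


def parse_int_literal(text):
    """Return the value of an integer literal, or raise ValueError."""
    for pattern, base in _INT_PATTERNS:
        match = pattern.fullmatch(text)
        if match is not None:
            # group(1) holds the digits without the prefix.
            return int(match.group(1), base)
    raise ValueError(f"not an integer literal: {text!r}")
```

With this sketch, `parse_int_literal("0b101")` gives 5, `parse_int_literal("0o17")` gives 15, and `parse_int_literal("0xff")` gives 255.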

Floating-point literals

TODO

Character literal

A character literal is a valid ASCII codepoint enclosed by single quotes. TODO

String literals

A string literal is a sequence of valid ASCII codepoints enclosed by double quotes.
There are multiple kinds of string literals:

The b and r prefixes can be combined to create a raw bytestring.

Literal      Characters     Type
"a\nb"       a, \n, b       *String
r"a\nb"      a, \, n, b     *String
b"a\nb"      a, \n, b       *i8
br"a\nb"     a, \, n, b     *i8
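The effect of the two prefixes can be reproduced with a short Python sketch. The function `decode_string_literal` is hypothetical, and the escape table contains only `\n` because that is the only escape shown in this document; the full set of escapes Yo supports is an open assumption.

```python
import re

# Only "\n" appears in the table above; the full escape set is not specified,
# so this mapping is an assumption and deliberately minimal.
ESCAPES = {"n": "\n"}

# Optional b, then optional r, then the quoted body.
_LITERAL_RE = re.compile(r'(b?r?)"(.*)"', re.DOTALL)


def decode_string_literal(src):
    """Return (list of characters, type name) for a complete string literal."""
    match = _LITERAL_RE.fullmatch(src)
    if match is None:
        raise ValueError(f"not a string literal: {src!r}")
    prefix, body = match.groups()
    type_name = "*i8" if "b" in prefix else "*String"
    if "r" in prefix:
        # Raw literal: every character, including the backslash, is kept as-is.
        return list(body), type_name
    chars = []
    i = 0
    while i < len(body):
        if body[i] == "\\":
            if i + 1 >= len(body) or body[i + 1] not in ESCAPES:
                raise ValueError(f"unsupported escape in {src!r}")
            chars.append(ESCAPES[body[i + 1]])
            i += 2
        else:
            chars.append(body[i])
            i += 1
    return chars, type_name
```

Applied to the four literals in the table, the sketch yields three characters for the non-raw forms and four for the raw forms, with the `b` prefix alone deciding between `*String` and `*i8`.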

Types

Primitive types

Yo defines the following primitive types:

Typename   Size (bytes)   Description             Values
void       0              the void type           n/a
u{N}       N/8            unsigned integer type   0 ... 2^N-1
i{N}       N/8            signed integer type     -2^(N-1) ... 2^(N-1)-1
bool       1              the boolean type        true, false
f64        8              IEEE-754 binary64       todo
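To make the `u{N}` and `i{N}` rows concrete, the following Python sketch computes the size and value range from a type name. The helper `int_type_info` is hypothetical. It assumes N is a positive multiple of 8 so that N/8 is a whole number of bytes; which widths Yo actually accepts is not stated in this document.

```python
import re

_INT_TYPE_RE = re.compile(r"([ui])([0-9]+)")


def int_type_info(typename):
    """Return (size_in_bytes, min_value, max_value) for a u{N} or i{N} type."""
    match = _INT_TYPE_RE.fullmatch(typename)
    if match is None:
        raise ValueError(f"not an integer type name: {typename!r}")
    kind, bits = match.group(1), int(match.group(2))
    # Assumption: only positive multiples of 8 are meaningful widths.
    if bits == 0 or bits % 8 != 0:
        raise ValueError(f"unsupported width: {bits}")
    if kind == "u":
        low, high = 0, 2**bits - 1
    else:
        low, high = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return bits // 8, low, high
```

For instance, `int_type_info("u8")` returns `(1, 0, 255)` and `int_type_info("i8")` returns `(1, -128, 127)`.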

Functions

fn add(x: i64, y: i64) -> i64 {
    return x + y;
}

External function declarations

The extern attribute can be used to forward-declare a C function’s signature. A forward declaration must omit the parameter names and must not have a body.

#[extern]
fn malloc(i64) -> *i8;

Extern functions can be declared variadic by inserting ... after the last fixed parameter:

#[extern]
fn printf(*i8, ...) -> i64;

Function pointer types

The following syntax denotes a function pointer type:

(A1, A2, ...) -> R

Example: A struct storing a function pointer

struct Foo {
    add: (i64, i64) -> i64
}

Structs

Custom types can be defined using the struct keyword. All struct types are uniquely identified by their name. A struct type can have properties and a set of member functions (methods) associated with it. Member functions must be declared in a separate impl block.

Example: Declaring a struct with properties and member functions

struct Person {
    name: String,
    age: i8
}

impl Person {
    // no `self` parameter -> static method
    fn me() -> *Person {
        return Person::init("Lukas", 20);
    }

    // `self` parameter -> instance method
    fn increaseAge(self: *Person) {
        self.age += 1;
    }
}

Expressions

Every expression evaluates to a value of a specific type, which must be known at compile time.

Type conversions

Operators

Operator overloading

Operators can be overloaded:

fn operator + (lhs: bool, rhs: bool) -> void;

Lambdas

A lambda expression constructs an anonymous function.

Attributes

Attributes can be used to provide the compiler with additional knowledge about a declaration.

Function Attributes

Name                Description
extern              C linkage
intrinsic           (internal) declares a compile-time intrinsic
no_mangle           Don’t mangle the function’s name
mangle={string}     Override a function’s mangled name
side_effects(...)   Specify a function’s side effects

Struct Attributes

Name      Description
no_init   The compiler should not generate a default initializer for the type

Note:

Example: forward-declaring a variadic C function

#[extern]
fn printf(*i8, ...) -> i64;

// All of the following calls are valid:
printf(b"\n");
printf(b"a: %i\n", 2);
printf(b"other string: %s\n", b"text");

Templates

Templates provide a way to declare a generic implementation of a struct or function.

Templates don’t exist “on their own”: no code is generated for a template that is declared but never used.
When the compiler encounters an instantiation of a struct template or a call to a template function, it generates a specialized version for the supplied generic arguments.

// A function template
fn add<T>(x: T, y: T) -> T {
    return x + y;
}

Function specializations can be declared simply by overloading the function for a specific signature.
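The on-demand behaviour described in this section can be pictured with a toy model in Python. This is only a sketch and not how the Yo compiler is implemented: the class `TemplateRegistry` and its methods are hypothetical, generated "code" is represented as a plain string, and two details are assumptions rather than statements from this document, namely that each instantiation is cached and that an explicit overload wins over the template.

```python
class TemplateRegistry:
    """Toy model of lazy template instantiation (not the Yo compiler's code)."""

    def __init__(self):
        self._templates = {}  # name -> function that builds a specialization
        self._overloads = {}  # (name, type args) -> hand-written specialization
        self._generated = {}  # (name, type args) -> generated specialization

    def declare(self, name, build):
        # Declaring a template only records it; nothing is generated yet.
        self._templates[name] = build

    def overload(self, name, type_args, code):
        # A hand-written function for one concrete signature.
        self._overloads[(name, tuple(type_args))] = code

    def instantiate(self, name, type_args):
        key = (name, tuple(type_args))
        # Assumption: an explicit overload takes precedence over the template.
        if key in self._overloads:
            return self._overloads[key]
        # Assumption: each specialization is generated once and then reused.
        if key not in self._generated:
            self._generated[key] = self._templates[name](*type_args)
        return self._generated[key]

    def generated(self):
        """Return the sorted keys of all specializations generated so far."""
        return sorted(self._generated)
```

Declaring `add` with a builder such as `lambda t: f"fn add(x: {t}, y: {t}) -> {t}"` leaves `generated()` empty; only a later `instantiate("add", ["i64"])` produces an entry for `i64`.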

Memory Management

Yo currently doesn’t have garbage collection / automatic reference counting.

There are two functions of note here: