Reference

Note: This is very much still a work in progress and does not necessarily describe the language as implemented in the GitHub repo.

Lexical structure

Valid Yo source code is written in ASCII. Some UTF-8 codepoints will probably work in identifiers and string literals, but there’s no proper handling for characters outside the ASCII character set.

Comments

There are two kinds of comments:

Tokens

The Yo lexer differentiates between the following kinds of tokens: keywords, identifiers, punctuation and literals.

Keywords

Yo reserves the following keywords:

decltype defer else fn for if impl in let mut match operator return struct switch unless use var while

Identifiers

An identifier is a sequence of one or more letters or digits. The first element must not be a digit.

digit  = [0-9]
letter = [a-zA-Z_]
ident  = <letter>(<letter>|<digit>)*

A sequence of characters that satisfies the ident pattern above and is not a reserved keyword is assumed to be an identifier.
All identifiers with two leading underscores are reserved and should be considered internal.
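The identifier rules above can be sketched outside of Yo. The following Python snippet is an illustration only, not part of the Yo lexer: the `classify` helper and its return strings are hypothetical names, while the keyword list and the pattern are copied from this reference.

```python
import re

# Keywords copied from the reserved-keyword list in this reference.
KEYWORDS = frozenset(
    "decltype defer else fn for if impl in let mut match operator "
    "return struct switch unless use var while".split()
)

# ident = <letter>(<letter>|<digit>)*, where letter = [a-zA-Z_]
IDENT_RE = re.compile(r"[a-zA-Z_][a-zA-Z_0-9]*")


def classify(word):
    """Classify a single word according to the identifier rules (hypothetical helper)."""
    if not IDENT_RE.fullmatch(word):
        return "invalid"
    if word in KEYWORDS:
        return "keyword"
    if word.startswith("__"):
        # Two leading underscores: reserved, considered internal.
        return "reserved identifier"
    return "identifier"
```

For instance, `classify("unless")` reports a keyword, while `classify("1abc")` is rejected because the first element is a digit.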

Operators and punctuation

The following character sequences represent operators and punctuation:

+    &    &&    ==    |>    (    )
-    |    ||    !=    =     {    }
*    ^          <     !     [    ]
/    <<         <=          .    ;
%    >>         >           ,    :
                >=

Literals

Integer literals

An integer literal is a sequence of digits. Depending on the prefix, the literal is interpreted as base 2, 8, 10 or 16.

Prefix    Base
0b        binary
0o        octal
0x        hexadecimal
none      decimal

binary_literal   =  0b[01]+       // base 2
octal_literal    =  0o[0-7]+      // base 8
decimal_literal  =  [0-9]+        // base 10
hex_literal      =  0x[0-9a-f]+   // base 16
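As a rough illustration of the grammar above, the following Python sketch converts an integer literal to its value. `parse_int_literal` is a hypothetical helper, not a function of the Yo compiler. It follows the patterns exactly as written, so uppercase hexadecimal digits are rejected.

```python
import re

# One pattern per literal kind, mirroring the grammar above.
# Note that hex_literal only lists the lowercase digits a-f.
LITERALS = [
    (re.compile(r"0b([01]+)"), 2),
    (re.compile(r"0o([0-7]+)"), 8),
    (re.compile(r"0x([0-9a-f]+)"), 16),
    (re.compile(r"([0-9]+)"), 10),
]


def parse_int_literal(text):
    """Return the value of an integer literal, or raise ValueError."""
    for pattern, base in LITERALS:
        match = pattern.fullmatch(text)
        if match:
            return int(match.group(1), base)
    raise ValueError(f"not an integer literal: {text!r}")
```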

Floating-point literals

TODO

Character literal

A character literal is a valid ASCII codepoint, enclosed by single quotes. TODO

String literals

A string literal is a sequence of valid ASCII codepoints enclosed by double quotes.
There are multiple kinds of string literals:

The b and r prefixes can be combined to create a raw bytestring.

Literal      Characters     Type
"a\nb"       a, \n, b       *String
r"a\nb"      a, \, n, b     *String
b"a\nb"      a, \n, b       *i8
br"a\nb"     a, \, n, b     *i8
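The effect of the two prefixes can be sketched in Python. This is an illustration under stated assumptions, not the lexer's actual behaviour: `describe_string_literal` is a hypothetical helper, only the four prefix spellings from the table are accepted, and `\n` is the only escape sequence handled because it is the only one this reference shows.

```python
# Prefix -> (is raw, is bytestring), taken from the table above.
# Other orderings (e.g. "rb") are not covered by this reference.
PREFIXES = {
    "": (False, False),
    "r": (True, False),
    "b": (False, True),
    "br": (True, True),
}

# Only \n appears in this reference; a wider escape set would be an assumption.
ESCAPES = {"n": "\n"}


def describe_string_literal(source):
    """Return (characters, type name) for one Yo string literal."""
    quote = source.find('"')
    if quote < 0 or len(source) < quote + 2 or not source.endswith('"'):
        raise ValueError("not a string literal")
    prefix, body = source[:quote], source[quote + 1:-1]
    if prefix not in PREFIXES:
        raise ValueError(f"unknown prefix: {prefix!r}")
    is_raw, is_bytes = PREFIXES[prefix]

    chars = []
    i = 0
    while i < len(body):
        if body[i] == "\\" and not is_raw:
            # Non-raw literal: a backslash starts an escape sequence.
            if i + 1 >= len(body) or body[i + 1] not in ESCAPES:
                raise ValueError("unsupported escape sequence")
            chars.append(ESCAPES[body[i + 1]])
            i += 2
        else:
            # Raw literal, or an ordinary character: keep it as-is.
            chars.append(body[i])
            i += 1
    return chars, "*i8" if is_bytes else "*String"
```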

Types

Primitive types

Yo defines the following primitive types:

Typename    Size (bytes)    Description              Values
void        0               the void type            n/a
u{N}        N/8             unsigned integer type    0 ... 2^N-1
i{N}        N/8             signed integer type      -2^(N-1) ... 2^(N-1)-1
bool        1               the boolean type         true, false
f64         8               IEEE-754 binary64        see Wikipedia

Integer types

For integer types u{N} and i{N}, valid sizes are: N = 8, 16, 32, 64.

An integer type’s signedness is indicated by its prefix: i8 is a signed integer, u8 an unsigned integer.
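The size and value range of each integer type follow directly from the formulas in the primitive types table. The Python sketch below works them out. `int_type_info` is a hypothetical helper name used only for this illustration.

```python
import re

# Valid sizes for u{N} and i{N}, as stated above.
VALID_WIDTHS = (8, 16, 32, 64)


def int_type_info(name):
    """Return (size in bytes, smallest value, largest value) for u{N} or i{N}."""
    match = re.fullmatch(r"([ui])([0-9]+)", name)
    if not match or int(match.group(2)) not in VALID_WIDTHS:
        raise ValueError(f"not a valid integer type: {name!r}")
    signed = match.group(1) == "i"
    n = int(match.group(2))
    if signed:
        # i{N}: -2^(N-1) ... 2^(N-1)-1
        return n // 8, -(2 ** (n - 1)), 2 ** (n - 1) - 1
    # u{N}: 0 ... 2^N-1
    return n // 8, 0, 2 ** n - 1
```

For example, `int_type_info("i8")` yields `(1, -128, 127)` and `int_type_info("u8")` yields `(1, 0, 255)`.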

Pointer types

Prefixing a base type T with a star yields a pointer type *T.

A pointer type’s base type must be of size > 0. Yo’s equivalent of a C void * is *i8.

Function types

A function type represents all functions with the same parameter and result types:

        () -> void  // a function that has no parameters and returns nothing
(i32, i32) -> i64   // a function that takes two `i32` values and returns an `i64` value

A function type (i.e., a function’s signature) contains only the parameter types and the return type; it does not contain the names of the individual parameters or any attributes the actual function declaration might have.
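To make the "names are not part of the type" rule concrete, here is a small Python model. It is a sketch only: `FunctionType` and `signature_of` are hypothetical names, and types are represented as plain strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionType:
    """A function type: parameter types and return type, nothing else."""
    parameter_types: tuple
    return_type: str


def signature_of(params, return_type="void"):
    """Build a FunctionType from (name, type) pairs; the names are dropped."""
    return FunctionType(tuple(type_name for _, type_name in params), return_type)


# Two declarations that differ only in their parameter names share one type.
f = signature_of([("x", "i32"), ("y", "i32")], "i64")
g = signature_of([("a", "i32"), ("b", "i32")], "i64")
same_type = f == g
```

Here `same_type` is `True`, since both declarations correspond to `(i32, i32) -> i64`.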

decltype

decltype(<expr>)

The decltype construct can be used whenever the compiler would expect a type. It takes a single argument - an expression - and yields the type that expression would evaluate to. The expression is not evaluated.

decltype is useful in situations where it would otherwise be difficult or impossible to declare a type, for example when dealing with types that depend on template parameters.

Example

fn add<T, U>(x: T, y: U) -> decltype(x + y) {
    return x + y;
}

Functions

Function declaration

A function is declared using the fn keyword. A function declaration consists of:

A function’s return type may be omitted, in which case it defaults to void.

Example

// A simple function declaration
fn add(x: i64, y: i64) -> i64 {
    return x + y;
}

Function template

In the case of a function template declaration, the template parameter names are listed in angle brackets, immediately prior to the function’s parameter list.

Example

// The identity function
fn id<T>(arg: T) -> T {
    return arg;
}

// The add function from above, as a function template
fn add<T>(x: T, y: T) -> T {
    return x + y;
}

Operator declaration

Some binary operators can be overloaded for a specific signature. Operator overloads are declared as functions with the name operator, followed by the operator being overloaded.

fn operator + (x: Foo, y: Foo) -> Foo {
    // some custom addition logic
}

The following operators can be overloaded:

+    &     &&    ==
-    |     ||    !=
*    ^           <
/    <<          >
%    >>          <=
                 >=

Note: When overloading equality operators, overloading just the == operator is sufficient in most cases, since the default implementation of the != operator essentially just forwards to ==.

Overload resolution

When generating code for a function call, the compiler will collect a set of potential targets for the call. From that set, the overload most closely matching the supplied arguments will be selected, based on a scoring system. A tie (i.e., two or more equally likely targets) will result in a compile-time error.
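This reference does not specify how candidates are scored, so the following Python sketch uses an invented scoring rule purely to illustrate the overall shape: collect viable candidates, pick the best score, and reject ties. The weights, the `TEMPLATE_PARAM` marker, and all function names are assumptions and do not describe the actual compiler.

```python
class AmbiguousCallError(Exception):
    """Raised when two or more candidates score equally well."""


# Hypothetical marker for a template parameter that accepts any argument type.
TEMPLATE_PARAM = "T"


def score(params, args):
    """Return a match score, or None if the candidate is not viable."""
    if len(params) != len(args):
        return None
    total = 0
    for param, arg in zip(params, args):
        if param == arg:
            total += 2  # exact match (assumed weight)
        elif param == TEMPLATE_PARAM:
            total += 1  # generic match (assumed weight)
        else:
            return None
    return total


def resolve(candidates, args):
    """Pick the best-scoring candidate name from {name: parameter types}."""
    scored = []
    for name, params in candidates.items():
        s = score(params, args)
        if s is not None:
            scored.append((s, name))
    if not scored:
        raise LookupError("no matching overload")
    best = max(s for s, _ in scored)
    winners = [name for s, name in scored if s == best]
    if len(winners) > 1:
        # A tie is an error rather than an arbitrary pick.
        raise AmbiguousCallError(f"ambiguous call: {sorted(winners)}")
    return winners[0]
```

With the candidates `{"add_i64": ["i64", "i64"], "add_T": ["T", "T"]}`, a call with two `i64` arguments selects `add_i64` under this rule, while a call with two `f64` arguments falls back to `add_T`.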

Structs

Struct declaration

Custom types can be defined using the struct keyword. All struct types are uniquely identified by their name. A struct type can have properties and a set of member functions (methods) associated with it. Member functions are declared in one or more impl blocks.

Note: For the time being, structs are always allocated on the heap.

Example: Declaring a struct with properties and member functions

struct Person {
    name: String,
    age: i8
}

impl Person {
    // no `self` parameter -> static method
    fn me() -> *Person {
        return Person::init("Lukas", 20);
    }

    // `self` parameter -> instance method
    fn increaseAge(self: *Person) {
        self.age += 1;
    }
}

Expressions

Every expression evaluates to a value of a specific type, which must be known at compile time.

Literals

Literal                        Type       Example
Integer literal                i64        12
Floating point literal         f64        12.0
Character literal              i8         'a'
String literal                 *String    "text"
String literal (bytestring)    *i8        b"text"

Operators

Note: Since most of the binary operators above are implemented as functions, they can be overloaded (see Operator declaration).

Type conversions

All type conversions are required to be explicit: Attempting to pass an i64 to a function that expects a u64 will result in a compilation error.
The only exception to this rule is numeric literals: Even though numeric literals by default evaluate to values of type i64, you may use a literal in an expression that is expected to be of a different numeric type, and the compiler will implicitly cast the literal.

static_cast

#[intrinsic]
fn static_cast<R, T>(arg: T) -> R;

The static_cast intrinsic converts an expression of type T to a related type R, if there is a known conversion from T to R. It will fail at compile-time if there is no known conversion.

reinterpret_cast

#[intrinsic]
fn reinterpret_cast<R, T>(arg: T) -> R;

The reinterpret_cast intrinsic converts between any two types T and R, by reinterpreting the value’s bit pattern. T and R are required to have the exact same bit width, otherwise it will fail at compile-time.
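The idea of reinterpreting a bit pattern can be modelled in Python with the standard `struct` module. This is only an analogy: `reinterpret` is a hypothetical helper, it covers just the numeric primitive types, and the width check happens at run time here, whereas Yo performs it at compile time.

```python
import struct

# struct format codes for the numeric primitive types listed in this reference.
FORMATS = {
    "i8": "b", "u8": "B",
    "i16": "h", "u16": "H",
    "i32": "i", "u32": "I",
    "i64": "q", "u64": "Q",
    "f64": "d",
}


def reinterpret(value, src, dst):
    """Reinterpret the bits of `value` (of type `src`) as a value of type `dst`."""
    # "<" selects standard sizes and little-endian byte order.
    src_fmt, dst_fmt = "<" + FORMATS[src], "<" + FORMATS[dst]
    if struct.calcsize(src_fmt) != struct.calcsize(dst_fmt):
        raise TypeError(f"{src} and {dst} do not have the same bit width")
    return struct.unpack(dst_fmt, struct.pack(src_fmt, value))[0]
```

For example, `reinterpret(-1, "i8", "u8")` gives `255`, since both values share the bit pattern `11111111`, while `reinterpret(1, "i32", "i64")` is rejected because the widths differ.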

Example

fn foo() -> i32 {
    let x = 0; // x has the deduced type i64
    return x;  // this will fail since x is an i64, but the function is expected to return an i32
}

fn bar() -> i32 {
    return 0; // this will work fine since the compiler is allowed to insert an implicit static_cast<i32>
}

Attributes

Attributes can be used to provide the compiler with additional knowledge about a declaration.

An attribute list is declared using the #[<attr>, <attr>, ...] syntax. A declaration that can have attributes can be preceded by one or more attribute lists. Splitting multiple attributes up into multiple separate attribute lists is semantically equivalent to putting them all in a single list.

Note: Specifying the same attribute multiple times with different values is considered undefined behaviour.
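The equivalence between one combined list and several separate lists can be illustrated with a small Python sketch. `parse_attributes` is a hypothetical helper that handles only the simple forms shown in this reference (`name` and `name="value"`, with no commas or brackets inside values). Rejecting conflicting duplicates is a choice made for this sketch; the reference itself only calls that case undefined behaviour.

```python
import re

# Matches one #[...] attribute list and captures its contents.
ATTR_LIST_RE = re.compile(r"#\[(.*?)\]")


def parse_attributes(source):
    """Collect the attributes of one or more #[...] lists into a dict."""
    attrs = {}
    for body in ATTR_LIST_RE.findall(source):
        for item in body.split(","):
            item = item.strip()
            if not item:
                continue
            name, sep, value = item.partition("=")
            name = name.strip()
            # A bare name is treated as a bool attribute set to True.
            parsed = value.strip().strip('"') if sep else True
            if name in attrs and attrs[name] != parsed:
                raise ValueError(f"conflicting values for attribute {name!r}")
            attrs[name] = parsed
    return attrs
```

Under this model, `#[extern, inline]` and `#[extern] #[inline]` produce the same dictionary.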

Attribute Types

Function Attributes

Name             Type      Description
extern           bool      C linkage
inline           bool      Function may be inlined
always_inline    bool      Function should always be inlined
intrinsic        bool      (internal) declares a compile-time intrinsic
no_mangle        bool      Don’t mangle the function’s name
mangle           string    Override a function’s mangled name
startup          bool      Causes the function to be called before execution enters main
shutdown         bool      Causes the function to be called after main returns

Example

// Forward-declaring a function with external C linkage.
#[extern]
fn strcmp(*i8, *i8) -> i32;

// A function with an explicitly set mangled name
#[mangle="bar"]
fn foo() -> void { ... }

Struct Attributes

Name       Type    Description
no_init    bool    The compiler should not generate a default initializer for the type

Intrinsics

A function declared with the intrinsic attribute is considered a compile-time intrinsic. Calls to intrinsic functions will receive special handling by the compiler. All intrinsic functions are declared in the :runtime/intrinsics module.

An intrinsic function may be overloaded with a custom implementation; in this case, the overload must not declare the intrinsic attribute.

Templates

Templates provide a way to declare a generic implementation of a struct or function.

Templates don’t exist “on their own”: No code is generated when you declare a template but never use it.
When the compiler encounters an instantiation of a struct template or a call to a function template, it generates a specialized version for the supplied generic arguments.

// A function template
fn add<T>(x: T, y: T) -> T {
    return x + y;
}
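The "nothing is generated until the template is used" behaviour can be modelled with a small Python sketch. This is a conceptual illustration, not the compiler's mechanism: `FunctionTemplate` is a hypothetical class, it supports a single template parameter named `T`, and "generating code" is reduced to producing a specialized signature string.

```python
import re


class FunctionTemplate:
    """A template that produces one specialization per distinct type argument."""

    def __init__(self, signature):
        self.signature = signature  # e.g. "fn add(x: T, y: T) -> T"
        self.instances = {}         # type argument -> specialized signature

    def instantiate(self, type_arg):
        """Return the specialization for `type_arg`, creating it on first use."""
        if type_arg not in self.instances:
            # Naive textual substitution of the parameter T (illustration only).
            self.instances[type_arg] = re.sub(
                r"\bT\b", lambda _match: type_arg, self.signature
            )
        return self.instances[type_arg]


add = FunctionTemplate("fn add(x: T, y: T) -> T")
generated_before_use = len(add.instances)  # nothing exists yet
first = add.instantiate("i64")             # first use creates the specialization
second = add.instantiate("i64")            # second use reuses it
```

Declaring `add` alone leaves `instances` empty. The `i64` specialization appears only once `instantiate("i64")` is called, and repeated calls reuse it.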

Function specializations can be declared simply by overloading the function for a specific signature.

Memory Management

Yo currently doesn’t have garbage collection / automatic reference counting.

The :runtime/memory module declares some functions and intrinsics related to memory management: