
This module contains the bulk of the interesting code performing the translation between WebAssembly bytecode and Cranelift IR.

The translation is done in one pass, opcode by opcode. Two main data structures are used during translation: the value stack and the control stack. The value stack mimics the execution of the WebAssembly stack machine: each instruction result is pushed onto the stack and instruction arguments are popped off the stack. Similarly, when a control flow block is encountered, it is pushed onto the control stack, and it is popped off when the corresponding End is encountered.
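The interplay of the two stacks can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy `Op` enum, the `Frame` struct, the `translate` function and the textual pseudo-IR are hypothetical and are not the types used by this module. Blocks are assumed to have no results, so End simply restores the value stack to its height on entry.

```rust
/// Toy operator set standing in for WebAssembly opcodes (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Op {
    I32Const(i32),
    I32Add,
    Block,
    End,
}

/// A control-stack frame remembers how tall the value stack was on entry.
#[derive(Debug)]
struct Frame {
    stack_height: usize,
}

/// Translates `ops` in one pass and returns the emitted pseudo-IR lines.
pub fn translate(ops: &[Op]) -> Result<Vec<String>, String> {
    let mut ir = Vec::new();
    let mut values: Vec<String> = Vec::new(); // value stack of SSA names
    let mut control: Vec<Frame> = Vec::new(); // control stack
    let mut next = 0usize; // counter for fresh SSA names

    for op in ops {
        match *op {
            Op::I32Const(c) => {
                // The result of the instruction is pushed onto the value stack.
                let v = format!("v{next}");
                next += 1;
                ir.push(format!("{v} = iconst.i32 {c}"));
                values.push(v);
            }
            Op::I32Add => {
                // Arguments are popped off the stack, right operand first.
                let rhs = values.pop().ok_or("value stack underflow")?;
                let lhs = values.pop().ok_or("value stack underflow")?;
                let v = format!("v{next}");
                next += 1;
                ir.push(format!("{v} = iadd {lhs}, {rhs}"));
                values.push(v);
            }
            Op::Block => control.push(Frame {
                stack_height: values.len(),
            }),
            Op::End => {
                let frame = control.pop().ok_or("unbalanced End")?;
                // Simplification: a result-less block leaves the stack as it found it.
                values.truncate(frame.stack_height);
            }
        }
    }
    Ok(ir)
}
```

Under these assumptions, translating `I32Const(1), I32Const(2), I32Add` yields three pseudo-IR lines, the last of which adds the two constants, while a stray `End` or an `I32Add` on an empty stack is reported as an error.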

Another data structure, the translation state, records whether the code currently being translated is unreachable and whether a return must be inserted at the end of the function.
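How unreachable-code tracking might work can be sketched as follows. This is an illustrative model only, under stated assumptions: the `Instr` enum, the `Frame` struct and `translated_indices` are hypothetical names, and the real translation state tracks considerably more. The sketch drops operators while code is unreachable, counts blocks opened inside the dead region, and lets an End restore reachability only if some earlier branch targeted the exit of the block being closed.

```rust
/// Toy instruction set (hypothetical); `Br(n)` branches to the exit of the
/// n-th enclosing block, counted from the innermost one.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Instr {
    Work,
    Unreachable,
    Br(usize),
    Block,
    End,
}

/// Control-stack frame for a block that was opened in reachable code.
struct Frame {
    exit_is_branch_target: bool,
}

/// Returns the indices of the instructions that would actually be translated.
/// An End met in unreachable code is recorded only if reachable code resumes.
pub fn translated_indices(code: &[Instr]) -> Result<Vec<usize>, String> {
    let mut kept = Vec::new();
    let mut control: Vec<Frame> = Vec::new();
    let mut reachable = true;
    let mut dead_depth = 0usize; // blocks opened while unreachable

    for (i, instr) in code.iter().enumerate() {
        if reachable {
            kept.push(i);
            match *instr {
                Instr::Work => {}
                Instr::Unreachable => reachable = false,
                Instr::Br(depth) => {
                    let idx = control
                        .len()
                        .checked_sub(depth + 1)
                        .ok_or("bad branch depth")?;
                    control[idx].exit_is_branch_target = true;
                    reachable = false;
                }
                Instr::Block => control.push(Frame {
                    exit_is_branch_target: false,
                }),
                Instr::End => {
                    control.pop().ok_or("unbalanced End")?;
                }
            }
        } else {
            match *instr {
                // Nested blocks in dead code are only counted, never translated.
                Instr::Block => dead_depth += 1,
                Instr::End if dead_depth > 0 => dead_depth -= 1,
                // This End closes a block opened in reachable code.
                Instr::End => {
                    let frame = control.pop().ok_or("unbalanced End")?;
                    reachable = frame.exit_is_branch_target;
                    if reachable {
                        kept.push(i);
                    }
                }
                _ => {} // every other operator is dropped
            }
        }
    }
    Ok(kept)
}
```

In this model, the code after `Block, Br(0), Work, End` is reachable again because the branch targets the block exit, whereas the code after `Block, Unreachable, End` stays unreachable.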

Some of the WebAssembly instructions need information about the environment for which they are being translated:

  • the loads and stores need the memory base address;
  • the global.get and global.set instructions (formerly get_global and set_global) depend on how the globals are implemented;
  • memory.size and memory.grow are runtime functions;
  • call_indirect has to translate the function index into the address of the function to call.

That is why translate_function_body takes an object implementing the WasmRuntime trait as an argument.
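The idea of delegating environment-specific decisions to a trait object can be sketched as below. The `Environment` trait, its two methods, `FixedLayout` and both helper functions are hypothetical names invented for this illustration; they are not the trait that translate_function_body accepts. The sketch also computes plain numbers, whereas a real translator would emit IR that performs these computations at run time.

```rust
/// Hypothetical environment interface; not the trait used by this module.
pub trait Environment {
    /// Base address of the linear memory.
    fn memory_base(&self) -> u64;
    /// Address of the function stored in table slot `index`, if any.
    fn table_entry(&self, index: u32) -> Option<u64>;
}

/// One possible environment: a fixed memory base and a flat function table.
pub struct FixedLayout {
    pub base: u64,
    pub table: Vec<u64>,
}

impl Environment for FixedLayout {
    fn memory_base(&self) -> u64 {
        self.base
    }

    fn table_entry(&self, index: u32) -> Option<u64> {
        self.table.get(index as usize).copied()
    }
}

/// What a load or store needs from the environment: the memory base address.
pub fn effective_address(env: &dyn Environment, addr: u32, offset: u32) -> u64 {
    env.memory_base() + u64::from(addr) + u64::from(offset)
}

/// What call_indirect needs: a mapping from function index to code address.
pub fn resolve_indirect(env: &dyn Environment, index: u32) -> Result<u64, String> {
    env.table_entry(index)
        .ok_or_else(|| format!("table index {index} out of bounds"))
}
```

Because the helpers only see `&dyn Environment`, the same translation logic works unchanged for any embedding that lays out memory and tables differently.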

There is extra complexity associated with the translation of 128-bit SIMD instructions. Wasm considers there to be only a single 128-bit vector type, but CLIF’s type system distinguishes different lane configurations, so it considers 8X16, 16X8, 32X4 and 64X2 to be different types. The result is that, in wasm, it’s perfectly OK to take the output of (e.g.) an add.16x8 and use it as an operand of a sub.32x4 without any cast. But when translated into CLIF, that will cause a verifier error due to the apparent type mismatch.

This file works around that problem by liberally inserting bitcast instructions in many places – mostly, before the use of vector values, either as arguments to CLIF instructions or as block actual parameters. These are no-op casts which nevertheless have different input and output types, and are used (mostly) to “convert” 16X8, 32X4 and 64X2-typed vectors to the “canonical” type, 8X16. Hence the functions optionally_bitcast_vector, bitcast_arguments, pop*_with_bitcast, canonicalise_then_jump, canonicalise_then_br{z,nz}, is_non_canonical_v128 and canonicalise_v128_values. Note that the bitcast* functions are occasionally used to convert to some type other than 8X16, but the canonicalise* functions always convert to type 8X16.
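The canonicalisation scheme described above can be modelled in a few lines. The names is_non_canonical_v128 and canonicalise_v128_values are taken from the text, but the signatures shown here, the `Ty`, `Value` and `Builder` types and the `bitcast_to_i8x16` method are assumptions made for this sketch, and a plain `Vec` stands in for the caller-supplied scratch buffer. The pre-scan lets the common case, where nothing needs casting, return the original slice without touching the buffer.

```rust
/// Toy model of CLIF types (hypothetical; only what the sketch needs).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ty {
    I32,
    I8X16,
    I16X8,
    I32X4,
    I64X2,
}

/// Toy SSA value: an id plus its type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Value {
    pub id: u32,
    pub ty: Ty,
}

/// A 128-bit vector type is non-canonical if it is anything other than I8X16.
pub fn is_non_canonical_v128(ty: Ty) -> bool {
    matches!(ty, Ty::I16X8 | Ty::I32X4 | Ty::I64X2)
}

/// Hypothetical stand-in for a function builder; it only counts bitcasts.
#[derive(Default)]
pub struct Builder {
    pub next_id: u32,
    pub bitcasts: usize,
}

impl Builder {
    /// Pretends to emit a bitcast and returns the fresh I8X16-typed value.
    fn bitcast_to_i8x16(&mut self, _v: Value) -> Value {
        let id = self.next_id;
        self.next_id += 1;
        self.bitcasts += 1;
        Value { id, ty: Ty::I8X16 }
    }
}

/// Returns `values` unchanged if no cast is needed; otherwise fills the
/// caller-supplied `tmp` with canonicalised values and returns that slice.
pub fn canonicalise_v128_values<'a>(
    tmp: &'a mut Vec<Value>,
    builder: &mut Builder,
    values: &'a [Value],
) -> &'a [Value] {
    debug_assert!(tmp.is_empty());
    // Pre-scan: avoid any copying when every value is already canonical.
    if !values.iter().any(|v| is_non_canonical_v128(v.ty)) {
        return values;
    }
    for &v in values {
        let out = if is_non_canonical_v128(v.ty) {
            builder.bitcast_to_i8x16(v)
        } else {
            v
        };
        tmp.push(out);
    }
    tmp.as_slice()
}
```

A jump helper built on top of this would canonicalise its block arguments first and only then emit the jump, which is why every inter-block jump should go through such a helper rather than being generated directly.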

Be careful when adding support for new vector instructions, and when adding new jumps, even if they apparently don’t have any connection to vectors. Never generate any kind of (inter-block) jump directly. Instead use canonicalise_then_jump and canonicalise_then_br{z,nz}.

The use of bitcasts is ugly and inefficient, but currently unavoidable:

  • they make the logic in this file fragile: miss out a bitcast for any reason, and there is the risk of the system failing in the verifier, at least for debug builds.

  • in the new backends, they potentially interfere with pattern matching on CLIF – the patterns need to take into account the presence of bitcast nodes.

  • in the new backends, they get translated into machine-level vector-register-copy instructions, none of which are actually necessary. We then depend on the register allocator to coalesce them all out.

  • they increase the total number of CLIF nodes that have to be processed, hence slowing down the compilation pipeline. Also, the extra coalescing work generates a slowdown.

A better solution which would avoid all four problems would be to remove the 8X16, 16X8, 32X4 and 64X2 types from CLIF and instead have a single V128 type.

For further background see also:

  • https://github.com/bytecodealliance/wasmtime/issues/1147 (“Too many raw_bitcasts in SIMD code”)
  • https://github.com/bytecodealliance/cranelift/pull/1251 (“Add X128 type to represent WebAssembly’s V128 type”)
  • https://github.com/bytecodealliance/cranelift/pull/1236 (“Relax verification to allow I8X16 to act as a default vector type”)

Modules§

  • Implementation of Wasm to CLIF memory access translation.

Macros§

Enums§

  • Like Option<T> but specifically for passing information about transitions from reachable to unreachable state and the like from callees to callers.

Functions§

  • Like bitcast_wasm_returns, but for the parameters being passed to a specified callee.
  • A helper for bitcasting a sequence of return values for the function currently being built. If a value is a vector type that does not match its expected type, this will modify the value in place to point to the result of a bitcast. This conversion is necessary to translate Wasm code that uses V128 as function parameters (or implicitly in block parameters) and still use specific CLIF types (e.g. I32X4) in the function body.
  • The same as the jump-generating helper (cast all 128-bit vector values to I8X16 first), but for a brif instruction.
  • Generate a jump instruction, but first cast all 128-bit vector values to I8X16 if they don’t have that type. This is done in a somewhat roundabout way so as to ensure that we almost never have to do any heap allocation.
  • Cast any vector values in values that are of “non-canonical” type (meaning, not I8X16) to I8X16, and return them in a slice. A pre-scan is made to determine whether any casts are actually necessary, and if not, the original slice is returned. Otherwise the cast values are returned in a slice that belongs to the caller-supplied SmallVec.
  • Some SIMD operations only operate on I8X16 in CLIF; this will convert them to that type by adding a raw_bitcast if necessary.
  • A helper for popping and bitcasting a single value; since SIMD values can lose their type by using v128 (i.e. CLIF’s I8X16) we must re-type the values using a bitcast to avoid CLIF typing issues.
  • A helper for popping and bitcasting two values; since SIMD values can lose their type by using v128 (i.e. CLIF’s I8X16) we must re-type the values using a bitcast to avoid CLIF typing issues.
  • This function is a generalized helper for validating that a wasm-supplied heap address is in-bounds.
  • Like prepare_addr but for atomic accesses.
  • Translate a load instruction.
  • Translates wasm operators into Cranelift IR instructions. Returns true if it inserted a return.
  • Translate a store instruction.
  • Deals with a Wasm instruction located in an unreachable portion of the code. Most of them are dropped but special ones like End or Else signal the potential end of the unreachable portion so the translation state must be updated accordingly.
  • type_of 🔒
    Determine the returned value type of a WebAssembly operator