This module contains the bulk of the interesting code performing the translation between WebAssembly bytecode and Cranelift IR.
The translation is done in one pass, opcode by opcode. Two main data structures are used during
code translations: the value stack and the control stack. The value stack mimics the execution
of the WebAssembly stack machine: each instruction result is pushed onto the stack and
instruction arguments are popped off the stack. Similarly, when encountering a control flow
block, it is pushed onto the control stack and popped off when encountering the corresponding
End.
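The interplay of the two stacks can be illustrated with a minimal sketch. Everything in it is hypothetical and far simpler than the real translator: the Op enum, the ToyTranslator type and the textual "IR" are stand-ins invented for illustration, not the types this module uses.

```rust
/// Hypothetical miniature operator set (an assumption, not the real wasm operator type).
#[derive(Debug, Clone, Copy)]
pub enum Op {
    I32Const(i32),
    I32Add,
    Block,
    End,
}

/// A toy IR value is just a numbered SSA name.
pub type Value = u32;

/// Hypothetical translation state holding both stacks.
#[derive(Default)]
pub struct ToyTranslator {
    /// Value stack: mirrors the wasm operand stack.
    pub stack: Vec<Value>,
    /// Control stack: one entry per open block, recording the
    /// value-stack height at block entry.
    pub control_stack: Vec<usize>,
    /// Textual stand-in for the emitted IR instructions.
    pub emitted: Vec<String>,
    next_value: Value,
}

impl ToyTranslator {
    fn fresh(&mut self) -> Value {
        let v = self.next_value;
        self.next_value += 1;
        v
    }

    /// Translate a single opcode: pop arguments, emit, push the result.
    pub fn translate(&mut self, op: Op) {
        match op {
            Op::I32Const(c) => {
                let v = self.fresh();
                self.emitted.push(format!("v{} = iconst {}", v, c));
                self.stack.push(v);
            }
            Op::I32Add => {
                let rhs = self.stack.pop().expect("value stack underflow");
                let lhs = self.stack.pop().expect("value stack underflow");
                let v = self.fresh();
                self.emitted.push(format!("v{} = iadd v{}, v{}", v, lhs, rhs));
                self.stack.push(v);
            }
            // Entering a block pushes a control frame...
            Op::Block => self.control_stack.push(self.stack.len()),
            // ...and the matching End pops it.
            Op::End => {
                self.control_stack.pop().expect("End without a matching block");
            }
        }
    }
}
```

Feeding the sequence Block, I32Const(1), I32Const(2), I32Add, End through translate leaves one value on the value stack and an empty control stack.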
Another data structure, the translation state, records whether the code currently being translated is unreachable and whether a return must be inserted at the end of the function.
Some of the WebAssembly instructions need information about the environment for which they are being translated:
- the loads and stores need the memory base address;
- the get_global and set_global instructions depend on how the globals are implemented;
- memory.size and memory.grow are runtime functions;
- call_indirect has to translate the function index into the address of the function being called.
That is why translate_function_body takes an object having the WasmRuntime trait as argument.
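The design can be sketched as a translator that is generic over an environment trait. The trait, its method names and the FlatLayout implementation below are all assumptions made up for illustration. They do not reproduce the WasmRuntime trait, and the "translation" here only computes addresses instead of emitting IR.

```rust
/// Hypothetical stand-in for the environment trait (method names are illustrative).
pub trait ToyEnvironment {
    /// Base address of linear memory, needed by loads and stores.
    fn memory_base(&self) -> u64;
    /// Where a given global lives; this depends on how the embedder lays out globals.
    fn global_address(&self, index: u32) -> u64;
}

/// One possible embedder: globals stored as consecutive 8-byte slots.
pub struct FlatLayout {
    pub memory_base: u64,
    pub globals_base: u64,
}

impl ToyEnvironment for FlatLayout {
    fn memory_base(&self) -> u64 {
        self.memory_base
    }

    fn global_address(&self, index: u32) -> u64 {
        self.globals_base + 8 * u64::from(index)
    }
}

/// The translator asks the environment instead of hard-coding a layout.
pub fn toy_load_address<E: ToyEnvironment>(env: &E, offset: u32) -> u64 {
    env.memory_base() + u64::from(offset)
}

/// Same idea for a global access.
pub fn toy_global_address<E: ToyEnvironment>(env: &E, index: u32) -> u64 {
    env.global_address(index)
}
```

Because the layout decisions live behind the trait, the same translation code can serve embedders that place memories and globals differently.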
There is extra complexity associated with translation of 128-bit SIMD instructions.
Wasm only considers there to be a single 128-bit vector type. But CLIF’s type system
distinguishes different lane configurations, so considers 8X16, 16X8, 32X4 and 64X2 to be
different types. The result is that, in wasm, it’s perfectly OK to take the output of (e.g.)
an add.16x8 and use that as an operand of a sub.32x4, without using any cast. But when
translated into CLIF, that will cause a verifier error due to the apparent type mismatch.
This file works around that problem by liberally inserting bitcast instructions in many
places, mostly before the use of vector values, either as arguments to CLIF instructions
or as block actual parameters. These are no-op casts which nevertheless have different
input and output types, and are used (mostly) to “convert” 16X8, 32X4 and 64X2-typed vectors
to the “canonical” type, 8X16. Hence the functions optionally_bitcast_vector,
bitcast_arguments, pop*_with_bitcast, canonicalise_then_jump,
canonicalise_then_br{z,nz}, is_non_canonical_v128 and canonicalise_v128_values.
Note that the bitcast* functions are occasionally used to convert to some type other than
8X16, but the canonicalise* functions always convert to type 8X16.
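As a rough illustration of the canonicalisation idea, the sketch below models values as (id, type) pairs and replaces every non-canonical vector with a freshly numbered I8X16 value, standing in for the result of a bitcast. The Ty and Val types, the id counter and the use of a plain Vec as scratch storage are assumptions for this example. Only the function names echo the ones listed above, and the bodies are not the real implementations.

```rust
/// Hypothetical miniature type lattice; the real one is CLIF's.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ty {
    I32,
    I8X16,
    I16X8,
    I32X4,
    I64X2,
}

/// A toy SSA value: an id plus its type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Val {
    pub id: u32,
    pub ty: Ty,
}

/// A 128-bit vector type other than the canonical I8X16.
pub fn is_non_canonical_v128(ty: Ty) -> bool {
    matches!(ty, Ty::I16X8 | Ty::I32X4 | Ty::I64X2)
}

/// Return `values` with every non-canonical vector replaced by an I8X16 value.
/// If nothing needs casting, the original slice is returned and `tmp` stays empty.
pub fn canonicalise_v128_values<'a>(
    tmp: &'a mut Vec<Val>,
    next_id: &mut u32,
    values: &'a [Val],
) -> &'a [Val] {
    debug_assert!(tmp.is_empty());
    // Pre-scan: the common case is that no cast is required.
    if !values.iter().any(|v| is_non_canonical_v128(v.ty)) {
        return values;
    }
    for v in values {
        if is_non_canonical_v128(v.ty) {
            // Stand-in for emitting a bitcast: allocate a new I8X16 value.
            tmp.push(Val { id: *next_id, ty: Ty::I8X16 });
            *next_id += 1;
        } else {
            tmp.push(*v);
        }
    }
    tmp.as_slice()
}
```

The pre-scan keeps the no-cast path free of copying, which is the motivation for passing caller-owned scratch storage rather than returning a new collection.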
Be careful when adding support for new vector instructions, and when adding new jumps, even
if they apparently don’t have any connection to vectors. Never generate any kind of
(inter-block) jump directly. Instead use canonicalise_then_jump and
canonicalise_then_br{z,nz}.
The use of bitcasts is ugly and inefficient, but currently unavoidable:
- they make the logic in this file fragile: miss out a bitcast for any reason, and there is the risk of the system failing in the verifier, at least in debug builds.
- in the new backends, they potentially interfere with pattern matching on CLIF: the patterns need to take into account the presence of bitcast nodes.
- in the new backends, they get translated into machine-level vector-register-copy instructions, none of which are actually necessary. We then depend on the register allocator to coalesce them all out.
- they increase the total number of CLIF nodes that have to be processed, hence slowing down the compilation pipeline. The extra coalescing work also costs compile time.
A better solution which would avoid all four problems would be to remove the 8X16, 16X8, 32X4 and 64X2 types from CLIF and instead have a single V128 type.
For further background see also:
- https://github.com/bytecodealliance/wasmtime/issues/1147 (“Too many raw_bitcasts in SIMD code”)
- https://github.com/bytecodealliance/cranelift/pull/1251 (“Add X128 type to represent WebAssembly’s V128 type”)
- https://github.com/bytecodealliance/cranelift/pull/1236 (“Relax verification to allow I8X16 to act as a default vector type”)
Modules
- Implementation of Wasm to CLIF memory access translation.
Macros
- Given a Reachability<T>, unwrap the inner T or, when unreachable, set state.reachable = false and return.
Enums
- Like Option<T> but specifically for passing information about transitions from reachable to unreachable state and the like from callees to callers.
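The enum and the macro described above work together roughly as in the following sketch. The MiniState type, the macro name unwrap_or_mark_unreachable, the variant names and the checked_div helper are hypothetical names chosen for this example. Only the Reachability<T> name and the state.reachable field come from the descriptions above.

```rust
/// Like Option<T>, but the "empty" case means "the code after this is unreachable".
/// The variant names are an assumption for this sketch.
pub enum Reachability<T> {
    Reachable(T),
    Unreachable,
}

/// Hypothetical translation state with just the reachability flag.
pub struct MiniState {
    pub reachable: bool,
}

/// Unwrap the inner value, or mark the state unreachable and return from
/// the enclosing function. (Hypothetical name.)
macro_rules! unwrap_or_mark_unreachable {
    ($state:ident, $value:expr) => {
        match $value {
            Reachability::Reachable(x) => x,
            Reachability::Unreachable => {
                $state.reachable = false;
                return;
            }
        }
    };
}

/// A toy callee: a statically known trap makes the following code unreachable.
pub fn checked_div(a: i32, b: i32) -> Reachability<i32> {
    if b == 0 {
        Reachability::Unreachable
    } else {
        Reachability::Reachable(a / b)
    }
}

/// A toy caller that propagates unreachability through the macro.
pub fn translate_div(state: &mut MiniState, a: i32, b: i32, out: &mut Vec<i32>) {
    let q = unwrap_or_mark_unreachable!(state, checked_div(a, b));
    out.push(q);
}
```

The macro keeps the early-return boilerplate out of each call site while making the state update impossible to forget.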
Functions
- Like bitcast_wasm_returns, but for the parameters being passed to a specified callee.
- A helper for bitcasting a sequence of return values for the function currently being built. If a value is a vector type that does not match its expected type, this will modify the value in place to point to the result of a bitcast. This conversion is necessary to translate Wasm code that uses V128 as function parameters (or implicitly in block parameters) and still use specific CLIF types (e.g. I32X4) in the function body.
- The same as the jump helper in the next entry, but for a brif instruction.
- Generate a jump instruction, but first cast all 128-bit vector values to I8X16 if they don’t have that type. This is done in a somewhat roundabout way so as to ensure that we almost never have to do any heap allocation.
- Cast to I8X16 any vector values in values that are of “non-canonical” type (meaning, not I8X16), and return them in a slice. A pre-scan is made to determine whether any casts are actually necessary, and if not, the original slice is returned. Otherwise the cast values are returned in a slice that belongs to the caller-supplied SmallVec.
- Some SIMD operations only operate on I8X16 in CLIF; this will convert them to that type by adding a raw_bitcast if necessary.
- A helper for popping and bitcasting a single value; since SIMD values can lose their type by using v128 (i.e. CLIF’s I8X16) we must re-type the values using a bitcast to avoid CLIF typing issues.
- A helper for popping and bitcasting two values; since SIMD values can lose their type by using v128 (i.e. CLIF’s I8X16) we must re-type the values using a bitcast to avoid CLIF typing issues.
- This function is a generalized helper for validating that a wasm-supplied heap address is in-bounds.
- Like prepare_addr but for atomic accesses.
- Translate a load instruction.
- Translates wasm operators into Cranelift IR instructions. Returns true if it inserted a return.
- Translate a store instruction.
- Deals with a Wasm instruction located in an unreachable portion of the code. Most of them are dropped but special ones like End or Else signal the potential end of the unreachable portion so the translation state must be updated accordingly.
- type_of 🔒: Determine the returned value type of a WebAssembly operator.