This module contains the bulk of the interesting code performing the translation between WebAssembly bytecode and Cranelift IR.
The translation is done in one pass, opcode by opcode. Two main data structures are used during
code translations: the value stack and the control stack. The value stack mimics the execution
of the WebAssembly stack machine: each instruction result is pushed onto the stack and
instruction arguments are popped off the stack. Similarly, when encountering a control flow
block, it is pushed onto the control stack and popped off when encountering the corresponding
`End`.
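The interplay of the two stacks can be illustrated with a minimal sketch. Everything here (`Op`, `Value`, `ControlFrame`, `Translator`, and the textual IR it emits) is a hypothetical toy invented for illustration; it is not the data model this module actually uses, and it ignores block parameters and results.

```rust
/// Toy operator set: a stand-in for real WebAssembly operators (hypothetical).
#[derive(Debug, Clone, Copy)]
pub enum Op {
    I32Const(i32),
    I32Add,
    Block,
    End,
}

/// An SSA value name in the toy IR (hypothetical; not CLIF's own value type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Value(pub u32);

/// One control stack entry: remembers the value stack height at block entry.
#[derive(Debug)]
struct ControlFrame {
    stack_height: usize,
}

#[derive(Default)]
pub struct Translator {
    value_stack: Vec<Value>,
    control_stack: Vec<ControlFrame>,
    next_value: u32,
    /// Emitted toy IR, one instruction per entry.
    pub ir: Vec<String>,
}

impl Translator {
    fn fresh(&mut self) -> Value {
        let v = Value(self.next_value);
        self.next_value += 1;
        v
    }

    /// Translate one operator in a single pass over the input.
    pub fn translate(&mut self, op: Op) -> Result<(), String> {
        match op {
            Op::I32Const(n) => {
                // The result is pushed onto the value stack.
                let v = self.fresh();
                self.ir.push(format!("v{} = iconst {}", v.0, n));
                self.value_stack.push(v);
            }
            Op::I32Add => {
                // Arguments are popped (right operand first), the result is pushed.
                let rhs = self.value_stack.pop().ok_or("value stack underflow")?;
                let lhs = self.value_stack.pop().ok_or("value stack underflow")?;
                let v = self.fresh();
                self.ir.push(format!("v{} = iadd v{}, v{}", v.0, lhs.0, rhs.0));
                self.value_stack.push(v);
            }
            Op::Block => {
                // Entering a block pushes a frame onto the control stack.
                let stack_height = self.value_stack.len();
                self.control_stack.push(ControlFrame { stack_height });
            }
            Op::End => {
                // The matching `End` pops the frame. The toy has no block
                // results, so values left over by the block are discarded.
                let frame = self.control_stack.pop().ok_or("control stack underflow")?;
                self.value_stack.truncate(frame.stack_height);
            }
        }
        Ok(())
    }
}
```

Feeding `I32Const(1)`, `I32Const(2)`, `I32Add` to a fresh `Translator` leaves three instructions in `ir`, with the `iadd` consuming the two constants.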
Another data structure, the translation state, records whether the code currently being translated is unreachable and whether a return must be inserted at the end of the function.
Some of the WebAssembly instructions need information about the environment for which they are being translated:
- the loads and stores need the memory base address;
- the `get_global` and `set_global` instructions depend on how the globals are implemented;
- `memory.size` and `memory.grow` are runtime functions;
- `call_indirect` has to translate the function index into the address of the function to call.
That is why `translate_function_body` takes an object having the `WasmRuntime` trait as
argument.
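The idea of passing the environment in as a trait object can be sketched as follows. The trait `Environment`, its two methods, the `ToyEnv` implementation and the textual IR are all hypothetical names made up for this illustration; they are not the signature of the `WasmRuntime` trait mentioned above.

```rust
/// Hypothetical environment interface: the translator asks it for anything
/// that depends on how the embedder lays out memories and globals.
pub trait Environment {
    /// Symbolic base address of the linear memory.
    fn memory_base(&self) -> String;
    /// Symbolic location of the global with the given index.
    fn global_location(&self, index: u32) -> String;
}

/// Translate a load at a static offset. The translator itself knows nothing
/// about where memory lives; it only asks the environment.
pub fn translate_load(env: &impl Environment, offset: u32) -> String {
    format!("load.i32 {}+{}", env.memory_base(), offset)
}

/// Translate a global read, again deferring the layout to the environment.
pub fn translate_global_get(env: &impl Environment, index: u32) -> String {
    format!("load.i32 {}", env.global_location(index))
}

/// One possible (made-up) environment: globals live in 8-byte slots.
pub struct ToyEnv;

impl Environment for ToyEnv {
    fn memory_base(&self) -> String {
        "heap_base".to_string()
    }

    fn global_location(&self, index: u32) -> String {
        format!("globals+{}", index * 8)
    }
}
```

A different embedder could supply another `Environment` implementation without any change to the two translation functions, which is the design point the paragraph above makes.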
There is extra complexity associated with translation of 128-bit SIMD instructions.
Wasm only considers there to be a single 128-bit vector type. But CLIF’s type system
distinguishes different lane configurations, so considers 8X16, 16X8, 32X4 and 64X2 to be
different types. The result is that, in wasm, it’s perfectly OK to take the output of (e.g.)
an `add.16x8` and use that as an operand of a `sub.32x4`, without using any cast. But when
translated into CLIF, that will cause a verifier error due to the apparent type mismatch.
This file works around that problem by liberally inserting `bitcast` instructions in many
places – mostly, before the use of vector values, either as arguments to CLIF instructions
or as block actual parameters. These are no-op casts which nevertheless have different
input and output types, and are used (mostly) to “convert” 16X8, 32X4 and 64X2-typed vectors
to the “canonical” type, 8X16. Hence the functions `optionally_bitcast_vector`,
`bitcast_arguments`, `pop*_with_bitcast`, `canonicalise_then_jump`,
`canonicalise_then_br{z,nz}`, `is_non_canonical_v128` and `canonicalise_v128_values`.
Note that the `bitcast*` functions are occasionally used to convert to some type other than
8X16, but the `canonicalise*` functions always convert to type 8X16.
Be careful when adding support for new vector instructions, and when adding new jumps, even
if they apparently don’t have any connection to vectors. Never generate any kind of
(inter-block) jump directly. Instead use `canonicalise_then_jump` and
`canonicalise_then_br{z,nz}`.
The use of bitcasts is ugly and inefficient, but currently unavoidable:
- they make the logic in this file fragile: miss out a bitcast for any reason, and there is the risk of the system failing in the verifier, at least for debug builds.
- in the new backends, they potentially interfere with pattern matching on CLIF – the patterns need to take into account the presence of bitcast nodes.
- in the new backends, they get translated into machine-level vector-register-copy instructions, none of which are actually necessary. We then depend on the register allocator to coalesce them all out.
- they increase the total number of CLIF nodes that have to be processed, hence slowing down the compilation pipeline. Also, the extra coalescing work generates a slowdown.
A better solution which would avoid all four problems would be to remove the 8X16, 16X8, 32X4 and 64X2 types from CLIF and instead have a single V128 type.
For further background see also:
- https://github.com/bytecodealliance/wasmtime/issues/1147 (“Too many raw_bitcasts in SIMD code”)
- https://github.com/bytecodealliance/cranelift/pull/1251 (“Add X128 type to represent WebAssembly’s V128 type”)
- https://github.com/bytecodealliance/cranelift/pull/1236 (“Relax verification to allow I8X16 to act as a default vector type”)
Modules
- `bounds_checks` 🔒: Implementation of Wasm to CLIF memory access translation.
Macros
- `unwrap_or_return_unreachable_state` 🔒: Given a `Reachability<T>`, unwrap the inner `T` or, when unreachable, set `state.reachable = false` and return.
Enums
- `Reachability`: Like `Option<T>` but specifically for passing information about transitions from reachable to unreachable state and the like from callees to callers.
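How an `Option`-like `Reachability<T>` and the unwrapping macro fit together can be sketched as below. The variant names, the `State` struct, the bare `return`, and the helper functions `prepare` and `translate_access` are assumptions made for this illustration; only the described behaviour (unwrap the inner value, or set `state.reachable = false` and return) is taken from the descriptions above.

```rust
/// Assumed shape of the enum: like `Option<T>`, but the "none" case means
/// "the callee determined that the code following it is unreachable".
pub enum Reachability<T> {
    Reachable(T),
    Unreachable,
}

/// Minimal stand-in for the translation state (hypothetical).
pub struct State {
    pub reachable: bool,
}

/// Sketch of the macro: unwrap the inner value, or mark the state
/// unreachable and return early from the enclosing function.
macro_rules! unwrap_or_return_unreachable_state {
    ($state:ident, $value:expr) => {
        match $value {
            Reachability::Reachable(x) => x,
            Reachability::Unreachable => {
                $state.reachable = false;
                return;
            }
        }
    };
}

/// Hypothetical callee: an address is usable only below a made-up bound.
fn prepare(addr: u32) -> Reachability<u32> {
    if addr < 100 {
        Reachability::Reachable(addr)
    } else {
        Reachability::Unreachable
    }
}

/// Hypothetical caller: records the address only while still reachable.
pub fn translate_access(state: &mut State, addr: u32, out: &mut Vec<u32>) {
    let checked = unwrap_or_return_unreachable_state!(state, prepare(addr));
    out.push(checked);
}
```

Using a dedicated enum rather than `Option<T>` makes the intent explicit at call sites: the empty case is not "no value" but a signal that the caller should stop emitting code for the current reachable region.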
Functions§
- align_
atomic_ 🔒addr - bitcast_
arguments - bitcast_
wasm_ params - Like
bitcast_wasm_returns
, but for the parameters being passed to a specified callee. - bitcast_
wasm_ returns - A helper for bitcasting a sequence of return values for the function currently being built. If
a value is a vector type that does not match its expected type, this will modify the value in
place to point to the result of a
bitcast
. This conversion is necessary to translate Wasm code that usesV128
as function parameters (or implicitly in block parameters) and still use specific CLIF types (e.g.I32X4
) in the function body. - canonicalise_
brif 🔒 - The same but for a
brif
instruction. - canonicalise_
then_ 🔒jump - Generate a
jump
instruction, but first cast all 128-bit vector values to I8X16 if they don’t have that type. This is done in somewhat roundabout way so as to ensure that we almost never have to do any heap allocation. - canonicalise_
v128_ 🔒values - Cast to I8X16, any vector values in
values
that are of “non-canonical” type (meaning, not I8X16), and return them in a slice. A pre-scan is made to determine whether any casts are actually necessary, and if not, the original slice is returned. Otherwise the cast values are returned in a slice that belongs to the caller-suppliedSmallVec
. - fold_
atomic_ 🔒mem_ addr - is_
non_ 🔒canonical_ v128 - mem_
op_ 🔒size - optionally_
bitcast_ 🔒vector - Some SIMD operations only operate on I8X16 in CLIF; this will convert them to that type by adding a raw_bitcast if necessary.
- `pop1_with_bitcast` 🔒: A helper for popping and bitcasting a single value; since SIMD values can lose their type by using v128 (i.e. CLIF’s I8x16) we must re-type the values using a bitcast to avoid CLIF typing issues.
- `pop2_with_bitcast` 🔒: A helper for popping and bitcasting two values; since SIMD values can lose their type by using v128 (i.e. CLIF’s I8x16) we must re-type the values using a bitcast to avoid CLIF typing issues.
- `prepare_addr` 🔒: This function is a generalized helper for validating that a wasm-supplied heap address is in-bounds.
- `prepare_atomic_addr` 🔒: Like `prepare_addr` but for atomic accesses.
- `translate_atomic_cas` 🔒
- `translate_atomic_load` 🔒
- `translate_atomic_rmw` 🔒
- `translate_atomic_store` 🔒
- `translate_br_if` 🔒
- `translate_br_if_args` 🔒
- `translate_fcmp` 🔒
- `translate_icmp` 🔒
- `translate_load` 🔒: Translate a load instruction.
- `translate_operator`: Translates wasm operators into Cranelift IR instructions. Returns `true` if it inserted a return.
- `translate_store` 🔒: Translate a store instruction.
- `translate_unreachable_operator` 🔒: Deals with a Wasm instruction located in an unreachable portion of the code. Most of them are dropped but special ones like `End` or `Else` signal the potential end of the unreachable portion so the translation state must be updated accordingly.
- `translate_vector_fcmp` 🔒
- `translate_vector_icmp` 🔒
- `type_of` 🔒: Determine the returned value type of a WebAssembly operator.
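The pre-scan strategy described for `canonicalise_v128_values` in the list above (return the original slice when no cast is needed, otherwise fill a caller-supplied buffer) can be sketched in isolation. This is a simplified model under stated assumptions: values are represented only by their toy type `Ty`, a plain `Vec` stands in for the `SmallVec`, and the function names are illustrative rather than the real signatures.

```rust
/// Toy type system: one scalar type plus four 128-bit lane configurations.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ty {
    I32,
    I8X16,
    I16X8,
    I32X4,
    I64X2,
}

/// A 128-bit vector type other than the canonical I8X16.
pub fn is_non_canonical(ty: Ty) -> bool {
    matches!(ty, Ty::I16X8 | Ty::I32X4 | Ty::I64X2)
}

/// Return `values` with every non-canonical vector replaced by I8X16.
///
/// `tmp` is a caller-supplied scratch buffer (a `Vec` here, standing in for
/// a `SmallVec`). It is touched only when at least one cast is required, so
/// the common case does no copying at all.
pub fn canonicalise<'a>(tmp: &'a mut Vec<Ty>, values: &'a [Ty]) -> &'a [Ty] {
    debug_assert!(tmp.is_empty());
    // Pre-scan: if nothing needs casting, hand back the original slice.
    if !values.iter().any(|&ty| is_non_canonical(ty)) {
        return values;
    }
    // Otherwise build the canonicalised sequence in the scratch buffer.
    for &ty in values {
        tmp.push(if is_non_canonical(ty) { Ty::I8X16 } else { ty });
    }
    &tmp[..]
}
```

Returning a slice in both cases lets the caller treat the two paths uniformly; tying both inputs to the same lifetime is what allows the function to return either of them.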