wasmer_wasix::state

Module linker

Linker for loading and linking dynamic modules at runtime. The linker is designed to work with output from clang (version 19 was used at the time of creating this code). Note that dynamic linking of WASM modules is considered unstable in clang/LLVM, so this code may need to be updated for future versions of clang.

The linker doesn’t care about where code exists and how modules call each other, but the way we have found to be most effective is:

  • The main module carries with it all of wasix-libc, and exports everything
  • Side modules don’t link wasix-libc in, instead importing it from the main module

This way, we only need one instance of wasix-libc, and one instance of all the static data that it requires to function. Indeed, if there were multiple instances of its static data, it would more than likely just break completely; one needs only imagine what would happen if there were multiple memory allocators (malloc) running at the same time. Emscripten (the only WASM runtime that supports dynamic linking, at the time of this writing) takes the same approach.

While locating modules by relative or absolute paths is possible, it is recommended to put every side module into /lib, where they can be located by name as well as by path.

The linker starts from a dynamically-linked main module. It scans the dylink.0 section for memory and table-related information and the list of needed modules. The module tree requires a memory, an indirect function table, and stack-related parameters (including the __stack_pointer global), which are created. Since dynamically-linked modules use PIC (position-independent code), the stack is not fixed and can be resized at runtime.

After the memory, function table and stack are created, the linker proceeds to load in needed modules. Needed modules are always loaded in and initialized before modules that asked for them, since it is expected that the needed module needs to be usable before the module that needs it can be initialized.
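The ordering rule above (needed modules first, then the module that asked for them) amounts to a post-order walk of the dependency tree. The following is a minimal sketch of that idea only; `init_order` and the map of needed modules are hypothetical names for illustration, not part of the linker's API.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical helper: returns module names in the order they would be
/// initialized, so that every needed module precedes the module needing it.
fn init_order(root: &str, needed: &HashMap<&str, Vec<&str>>) -> Vec<String> {
    fn visit(
        name: &str,
        needed: &HashMap<&str, Vec<&str>>,
        seen: &mut HashSet<String>,
        order: &mut Vec<String>,
    ) {
        // Skip modules that are already loaded or currently being visited;
        // the latter is what keeps circular dependencies from looping forever.
        if !seen.insert(name.to_string()) {
            return;
        }
        for dep in needed.get(name).map(Vec::as_slice).unwrap_or(&[]) {
            visit(dep, needed, seen, order);
        }
        // A module is only appended after all of its needed modules.
        order.push(name.to_string());
    }

    let mut seen = HashSet::new();
    let mut order = Vec::new();
    visit(root, needed, &mut seen, &mut order);
    order
}
```

With a main module that needs two side modules, one of which needs the other, the shared dependency comes out first and the main module last.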

However, we also need to support circular dependencies between the modules; the most common case is when the main needs a side module and imports functions from it, and the side imports wasix-libc functions from the main. To support this, the linker generates stub functions for all the imports that cannot be resolved when a module is being loaded in. The stub functions will then resolve the function once (and only once) at runtime when they’re first called. This does, however, mean that link errors can happen at runtime, after the linker has reported successful linking of the modules. Such errors are turned into a WasiError::DlSymbolResolutionFailed error and will terminate execution completely.
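The resolve-once behaviour of a stub can be illustrated with a small stand-alone sketch. Everything here (`Stub`, `Func`, the `exports` map, the `double` export) is hypothetical and only models the idea with `std::sync::OnceLock`; the real stubs operate on WebAssembly functions and report failures as WasiError::DlSymbolResolutionFailed.

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

/// Hypothetical stand-in for a guest function.
type Func = fn(i32) -> i32;

/// Hypothetical export used to exercise the stub.
fn double(x: i32) -> i32 {
    x * 2
}

/// Hypothetical model of an import stub: it knows which symbol it stands
/// for and caches the target after the first successful resolution.
struct Stub {
    symbol: &'static str,
    resolved: OnceLock<Func>,
}

impl Stub {
    fn new(symbol: &'static str) -> Self {
        Self { symbol, resolved: OnceLock::new() }
    }

    /// Calls the target, resolving it on first use. A failed lookup is a
    /// link error that only surfaces at call time.
    fn call(&self, exports: &HashMap<&str, Func>, arg: i32) -> Result<i32, String> {
        if let Some(f) = self.resolved.get() {
            return Ok(f(arg));
        }
        let found = *exports
            .get(self.symbol)
            .ok_or_else(|| format!("symbol resolution failed: {}", self.symbol))?;
        // get_or_init stores the target at most once.
        let f = *self.resolved.get_or_init(|| found);
        Ok(f(arg))
    }
}
```

Once a call has succeeded, later calls no longer consult the export map, which mirrors the "once and only once" resolution described above.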

§Threading Support

The linker supports the concept of “Instance Groups”, which are multiple instances of the same module tree. This corresponds very closely to WASIX threads, but is named an instance group so as to keep the logic decoupled from the threading logic in WASIX.

Each instance group has its own store, indirect function table, and stack pointer, but shares its memory with every other instance group. Note that even though the underlying memory is the same, we need to create a new Memory instance for each group via Memory::share_in_store. Also, when placing a symbol in the function table, the linker always updates all function tables at the same time. This is because function “pointers” can be passed across instance groups (read: sent to other threads) by the guest code, so all function tables should have exactly the same content at all times.

One important aspect of instance groups is that they do not share the same store; this lets us put different instance groups on different OS threads. However, this also means that one call to Linker::load_module, etc. cannot update every instance group as each one has its own function table. To make the linker work across threads, we need a “stop-the-world” lock on every instance group. The group the load/resolve request originates from sets a flag, which other instance groups are required to check periodically by calling Linker::do_pending_link_operations. Once all instance groups are stopped in that function, the original can proceed to perform the operation, and report its results to all other instance groups so they can make the same changes to their function table as well.

In WASIX, the periodic check is performed at the start of most (but not all) syscalls. This means a thread that doesn’t make any syscalls can potentially block all other threads if a DL operation is performed. This also means that two instance groups cannot co-exist on the same OS thread, as the first one will block the OS thread and the second can’t enter the “lock” again to let the first continue its work.

To also get cooperation from threads that are waiting in a syscall, a Signal::Sigwakeup signal is sent to all threads when a DL operation needs to be synchronized.
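The stop-the-world handshake can be modelled with plain threads. This is a simplified sketch under several assumptions: `Shared`, `check_pending_operation` and `instigate` are hypothetical names, each "function table" is just a `Vec<u32>`, a `std::sync::Barrier` stands in for the lock, and only a single operation is performed. It is not how the linker is implemented, only an illustration of the protocol: set a flag, wait until every group has stopped, perform the operation, then let every group replicate the result.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Barrier, Mutex};
use std::thread;

/// Hypothetical shared state between all instance groups.
struct Shared {
    pending: AtomicBool,
    result: Mutex<Option<u32>>,
    barrier: Barrier,
}

/// Periodic check made by non-instigating groups. Returns true if an
/// operation was replicated into this group's table.
fn check_pending_operation(shared: &Shared, table: &mut Vec<u32>) -> bool {
    if !shared.pending.load(Ordering::Acquire) {
        return false;
    }
    shared.barrier.wait(); // every group is now stopped
    shared.barrier.wait(); // the instigator has published its result
    let published = *shared.result.lock().unwrap();
    table.push(published.expect("instigator publishes before the barrier"));
    shared.barrier.wait(); // every group has applied the change
    true
}

/// Performed by the group that starts the operation.
fn instigate(shared: &Shared, table: &mut Vec<u32>, symbol: u32) {
    shared.pending.store(true, Ordering::Release);
    shared.barrier.wait(); // wait for the world to stop
    *shared.result.lock().unwrap() = Some(symbol);
    table.push(symbol);
    shared.barrier.wait(); // let the others replicate
    shared.pending.store(false, Ordering::Release);
    shared.barrier.wait(); // resume everyone
}

/// Runs one operation across `groups` threads and returns every table.
fn run(groups: usize) -> Vec<Vec<u32>> {
    let shared = Arc::new(Shared {
        pending: AtomicBool::new(false),
        result: Mutex::new(None),
        barrier: Barrier::new(groups),
    });
    let workers: Vec<_> = (1..groups)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                let mut table = Vec::new();
                // yield_now stands in for guest code between syscalls.
                while !check_pending_operation(&shared, &mut table) {
                    thread::yield_now();
                }
                table
            })
        })
        .collect();
    let mut table = Vec::new();
    instigate(&shared, &mut table, 7);
    let mut tables = vec![table];
    tables.extend(workers.into_iter().map(|w| w.join().unwrap()));
    tables
}
```

A worker that never reached the check would keep the instigator waiting at the first barrier, which is the same blocking hazard the text describes for threads that make no syscalls.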

§About TLS

Each instance of each group gets its own TLS area, so there are 4 cases to consider:

  • Main instance of main module: TLS area will be allocated by the compiler, and be placed at the start of the memory region requested by the dylink.0 section.
  • Main instance of side modules: Almost same as main module, but tls_base will be non-zero because side modules get a non-zero memory_base. It is very important to note that the main instance of a side module lives in the instance group that initially loads it in. This does not have to be the main instance group.
  • Other instances of main module: Each worker thread gets its TLS area allocated by the code in pthread_create, and a pointer to the TLS area is passed through the thread start args. This pointer is read by the code in thread_spawn, and passed through to us as part of the environment’s memory layout.
  • Other instances of side modules: This is where the linker comes in. When the new instance is created, the linker will call its __wasix_init_tls function, which is responsible for setting up the TLS area for the thread.

Since we only want to call __wasix_init_tls for non-main instances of side modules, it is enough to call it only within InstanceGroupState::instantiate_side_module_from_linker.
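The four cases can be summarized as a small decision table. The types below (`ModuleKind`, `InstanceKind`, `TlsSource`) are hypothetical and exist only to restate the cases in code; they do not appear in the linker.

```rust
/// Hypothetical: whether a module is the main module or a side module.
#[derive(Clone, Copy)]
enum ModuleKind {
    Main,
    Side,
}

/// Hypothetical: whether an instance is the module's main instance.
#[derive(Clone, Copy)]
enum InstanceKind {
    Main,
    Other,
}

/// Hypothetical: where an instance's TLS area comes from.
#[derive(Debug, PartialEq)]
enum TlsSource {
    /// At the start of the memory region requested by dylink.0
    /// (tls_base is non-zero for side modules).
    DylinkMemoryRegion,
    /// Allocated in pthread_create and passed via the thread start args.
    ThreadStartArgs,
    /// Set up by the linker calling __wasix_init_tls.
    LinkerInitTls,
}

fn tls_source(module: ModuleKind, instance: InstanceKind) -> TlsSource {
    match (module, instance) {
        (_, InstanceKind::Main) => TlsSource::DylinkMemoryRegion,
        (ModuleKind::Main, InstanceKind::Other) => TlsSource::ThreadStartArgs,
        (ModuleKind::Side, InstanceKind::Other) => TlsSource::LinkerInitTls,
    }
}

/// Only non-main instances of side modules need __wasix_init_tls.
fn needs_wasix_init_tls(module: ModuleKind, instance: InstanceKind) -> bool {
    tls_source(module, instance) == TlsSource::LinkerInitTls
}
```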

§Module Loading

Module loading happens as an orchestrated effort between the shared linker state, the state of the instance group that started (or “instigated”) the operation, and other instance groups. Access to a set of instances is required for resolution of exports, which is why the linker state alone (which only stores modules) is not enough.

Even though most (if not all) operations require access to both the shared linker state and a/the instance group state, they’re separated into three sets:

  • Operations that deal with metadata exist as impls on LinkerState. These take a (read-only) instance group state for export resolution, as well as a StoreRef. They’re guaranteed not to alter the store or the instance group state.
  • Operations that deal with the actual instances (instantiating, putting symbols in the function table, etc.) and are started by the instigating group exist as impls on InstanceGroupState that also take a mutable reference to the shared linker state, and require it to be locked for writing. These operations can and will update the linker state, mainly to store symbol resolution records.
  • Operations that deal with replicating changes to instances from another thread also exist as impls on InstanceGroupState, but take a read-only reference to the shared linker state. This is important because all the information needed for replicating the change to the instigating group’s instances should already be in the linker state. See InstanceGroupState::populate_imports_from_linker and InstanceGroupState::instantiate_side_module_from_linker for the two most important ones.

Module loading generally works by going through these steps:

  • LinkerState::load_module_tree loads modules (and their needed modules) and assigns module handles
  • Then, for each new module:
      • Memory and table space is allocated
      • Imports are resolved (see next section)
      • The module is instantiated
  • After all modules have been instantiated, pending imports (resulting from circular dependencies) are resolved
  • Finally, module initializers are called
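To make the "memory and table space is allocated" step more concrete, here is a minimal sketch that assumes a simple bump allocator handing out a memory_base and table_base per module. The `MemInfo` fields follow the memory and table information carried by the dylink.0 section (sizes plus alignments encoded as powers of two); `BumpAllocator`, `Bases` and the allocation strategy itself are assumptions for illustration, and the linker may well allocate space differently.

```rust
/// Memory and table requirements of one module, as carried by dylink.0.
/// Alignments are encoded as powers of two.
struct MemInfo {
    memory_size: u32,
    memory_alignment: u32,
    table_size: u32,
    table_alignment: u32,
}

/// Hypothetical result of the allocation step for one module.
#[derive(Debug, PartialEq)]
struct Bases {
    memory_base: u32,
    table_base: u32,
}

/// Rounds `value` up to a multiple of `2^log2_align`.
fn align_up(value: u32, log2_align: u32) -> u32 {
    let align = 1u32 << log2_align;
    (value + align - 1) & !(align - 1)
}

/// Hypothetical bump allocator; the starting offsets are up to the caller.
struct BumpAllocator {
    next_memory: u32,
    next_table: u32,
}

impl BumpAllocator {
    fn allocate(&mut self, info: &MemInfo) -> Bases {
        let memory_base = align_up(self.next_memory, info.memory_alignment);
        let table_base = align_up(self.next_table, info.table_alignment);
        // Reserve the requested space so the next module lands after it.
        self.next_memory = memory_base + info.memory_size;
        self.next_table = table_base + info.table_size;
        Bases { memory_base, table_base }
    }
}
```

Allocating for two modules in a row shows the second module receiving a non-zero, aligned memory_base, consistent with side modules getting a non-zero memory_base.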

§Symbol resolution

To support replicating operations from the instigating group to other groups, symbol resolution happens in 3 steps:

  • LinkerState::resolve_symbols goes through the imports of a soon-to-be-loaded module, recording the imports as NeededSymbolResolutionKeys and creating InProgressSymbolResolutions in response to each one.
  • InstanceGroupState::populate_imports_from_link_state then goes through the results and resolves each import to its final value, while also recording enough information (in the shape of SymbolResolutionResults) for other groups to resolve the symbol from their own instances.
  • Finally, instances are created and finalized, and initializers are called.

§Stub functions

As noted above, stub functions are generated in response to circular dependencies. The stub functions do take previous symbol resolution records into account, so that the stub corresponding to a single import cannot resolve to different exports in different groups. If no such record is found, then a new record is created by the stub function. However, there’s a catch.

It must be noted that, during initialization, the shared linker state has to remain write-locked so as to prevent other threads from starting another operation (the replication logic only works with one active operation at a time). Stub functions need a write lock on the shared linker state to store new resolution records, and as such, they can’t store resolution records if they’re called in response to a module’s initialization routines. This can happen easily if:

  • A side module is needed by the main
  • That side module accesses any libc functions, such as printing something to stdout.

To work around this, stub functions only try to lock the shared linker state, and if they can’t, they won’t store anything. A follow-up call to the stub function can resolve the symbol again, store it for use by further calls to the function, and also create a resolution record. This does create a few hard-to-reach edge cases:

  • If the symbol happens to resolve differently between the two calls to the stub, unpredictable behavior can happen; however, this is impossible in the current implementation.
  • If the shared state is locked by a different instance group, then the stub won’t store its lookup results anyway, even though it could have if it had waited.
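The try-lock workaround can be sketched with `std::sync::RwLock`. The names here (`Records`, `resolve_in_stub`, the `lookup` callback) and the use of a plain `u32` as the resolved value are assumptions for illustration; the sketch only shows the locking pattern, where a stub resolves the symbol regardless and records the result only if the lock happens to be free.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical resolution records: symbol name -> resolved value.
type Records = RwLock<HashMap<String, u32>>;

/// Resolves `symbol` from within a stub. Returns the resolved value and
/// whether a resolution record exists for it after the call.
fn resolve_in_stub(
    records: &Records,
    symbol: &str,
    lookup: impl Fn(&str) -> Option<u32>,
) -> Option<(u32, bool)> {
    // Prefer an existing record so all groups agree on the resolution.
    if let Ok(guard) = records.try_read() {
        if let Some(&value) = guard.get(symbol) {
            return Some((value, true));
        }
    }
    let value = lookup(symbol)?;
    match records.try_write() {
        // Lock was free: store a record for later calls and other groups.
        Ok(mut guard) => {
            let stored = *guard.entry(symbol.to_string()).or_insert(value);
            Some((stored, true))
        }
        // Lock is held (for example during module initialization): use the
        // lookup result for this call only and store nothing.
        Err(_) => Some((value, false)),
    }
}
```

Calling the function while the write lock is held yields an unrecorded result; a follow-up call after the lock is released stores the record, matching the two-call behaviour described above.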

§Locating side modules

Side modules are located according to these steps:

  • If the name contains a slash (/), it is treated as a relative or absolute path.
  • Otherwise, the name is searched for in /lib, /usr/lib and /usr/local/lib. LD_LIBRARY_PATH is not supported yet.
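The lookup rules above can be expressed as a function that lists the paths to try. `candidate_paths` and `SEARCH_DIRS` are hypothetical names; the sketch only computes candidates and does not touch the file system or say how relative paths are resolved.

```rust
use std::path::PathBuf;

/// Directories searched for names without a slash, in order.
const SEARCH_DIRS: [&str; 3] = ["/lib", "/usr/lib", "/usr/local/lib"];

/// Hypothetical helper: returns the paths that would be tried for `name`.
fn candidate_paths(name: &str) -> Vec<PathBuf> {
    if name.contains('/') {
        // A slash means the name is a relative or absolute path, used as-is.
        vec![PathBuf::from(name)]
    } else {
        SEARCH_DIRS
            .iter()
            .map(|dir| PathBuf::from(dir).join(name))
            .collect()
    }
}
```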

§Building dynamically-linked modules

Note that building modules that conform to the specific requirements of this linker requires careful configuration of clang. A PIC sysroot is required. The steps to build a main module are:

# -fPIC: PIC is required for all modules, main and side.
# -c: we need to compile to an object file we can manually link in the next step.
clang-19 \
  --target=wasm32-wasi --sysroot=/path/to/sysroot32-pic \
  -matomics -mbulk-memory -mmutable-globals -pthread \
  -mthread-model posix -ftls-model=local-exec \
  -fno-trapping-math -D_WASI_EMULATED_MMAN -D_WASI_EMULATED_SIGNAL \
  -D_WASI_EMULATED_PROCESS_CLOCKS \
  -fPIC \
  -c main.c -o main.o

# -L. -lsidewasm: link needed side modules, assuming `libsidewasm.so` exists
#   in the current directory.
# --whole-archive --export-all: make wasm-ld search everywhere and export
#   everything, needed for wasix-libc functions to be exported correctly from
#   the main module.
# main.o: the object file from the last step.
# crt1.o: contains the _start and _main_void functions.
# -lc ... -lwasi-emulated-mman: statically link the sysroot's libraries.
# --import-memory ... --export-if-defined=...: the usual linker config for
#   wasix modules.
# --experimental-pic -pie: again, PIC is very important, as well as producing
#   a location-independent executable with -pie.
wasm-ld-19 \
  -L. -lsidewasm \
  -L/path/to/sysroot32-pic/lib \
  -L/path/to/sysroot32-pic/lib/wasm32-wasi \
  --whole-archive --export-all \
  main.o \
  /path/to/sysroot32-pic/lib/wasm32-wasi/crt1.o \
  -lc -lresolv -lrt -lm -lpthread -lwasi-emulated-mman \
  --import-memory --shared-memory --extra-features=atomics,bulk-memory,mutable-globals \
  --export=__wasm_signal --export=__tls_size --export=__tls_align \
  --export=__tls_base --export=__wasm_call_ctors --export-if-defined=__wasm_apply_data_relocs \
  --experimental-pic -pie \
  -o main.wasm

And the steps to build a side module are:

# -fPIC: we need PIC.
# -fvisibility=default: make it export everything that's not hidden explicitly.
clang-19 \
  --target=wasm32-wasi --sysroot=/path/to/sysroot32-pic \
  -matomics -mbulk-memory -mmutable-globals -pthread \
  -mthread-model posix -ftls-model=local-exec \
  -fno-trapping-math -D_WASI_EMULATED_MMAN -D_WASI_EMULATED_SIGNAL \
  -D_WASI_EMULATED_PROCESS_CLOCKS \
  -fPIC \
  -fvisibility=default \
  -c side.c -o side.o

# Note: we don't link against wasix-libc, so no -lc etc., because we want
# those symbols to be imported.
# --experimental-pic: need PIC.
# --unresolved-symbols=import-dynamic: import everything that's undefined,
#   including wasix-libc functions.
# -shared: build a shared library.
# --shared-memory: import a shared memory.
# -o libsidewasm.so: conform to the libxxx.so naming so clang can find it
#   via -lxxx.
wasm-ld-19 \
  --extra-features=atomics,bulk-memory,mutable-globals \
  --export=__wasm_call_ctors --export-if-defined=__wasm_apply_data_relocs \
  --experimental-pic \
  --unresolved-symbols=import-dynamic \
  -shared \
  --shared-memory \
  -o libsidewasm.so side.o
