Method
GumStalker.prefetch
Declaration
void
gum_stalker_prefetch (
GumStalker* self,
gconstpointer address,
gint recycle_count
)
Description
This API is intended for use during fuzzing scenarios such as AFL forkserver. It allows for the child to feed back the addresses of instrumented blocks to the parent so that the next time a child is forked from the parent, it will already inherit the instrumented block rather than having to re-instrument every basic block again from scratch.
This API has the following caveats:
-
This API MUST be called from the thread which will be executed in the child. Since blocks are cached in the GumExecCtx which is stored on a per-thread basis and accessed through Thread Local Storage, it is not possible to prefetch blocks into the cache of another thread.
-
This API should be called after gum_stalker_follow_me(). It is likely that the parent will wish to call gum_stalker_deactivate() immediately after following. Subsequently, gum_stalker_activate() can be called within the child after it is forked to start stalking the thread once more. The child can then communicate newly discovered basic blocks back to the parent via inter-process communication. The parent can then call gum_stalker_prefetch() to instrument those blocks before forking the next child. As a result of the fork, the child inherits a deactivated Stalker instance, thus both parent and child should release their Stalker instances upon completion if required.
-
Note that gum_stalker_activate() takes a target pointer which is used to allow Stalker to be reactivated whilst executing in an excluded range and guarantee that the thread is followed until the “activation target” address is reached. Typically for e.g. a fuzzer the target would be the function you’re about to hit with inputs. When this target isn’t known, the simplest solution is to define an empty function (marked as non-inlineable) and then call it immediately after activation to return Stalker to its normal behavior. It is important that target is at the start of a basic block, otherwise Stalker will not detect it. Failure to do so may mean that Stalker continues to follow the thread into code which it should not, including any calls to Stalker itself. Thus care should be taken to ensure that the function is not inlined, or optimized away by the compiler.
__attribute__ ((noinline))
static void
activation_target (void)
{
  // Avoid calls being optimized out
  asm ("");
}
-
Note that since both parent and child have an identical Stalker instance, they each have the exact same Transformer. Since this Transformer will be used both to generate blocks to execute in the child and to prefetch blocks in the parent, care should be taken to identify in which scenario the transformer is operating. The parent will likely also transform and execute a few blocks even if it is deactivated immediately afterwards. Thus care should also be taken when any callouts are executed to determine whether they are running in the parent or child context.
-
For optimal performance, the recycle_count should be set to the same value as gum_stalker_get_trust_threshold(), unless the trust threshold is set to -1 or 0. When adding instrumented blocks into the cache, Stalker also retains a copy of the original bytes of the code which was instrumented. When recalling blocks from the cache, this copy is compared against the current code in order to detect self-modifying code. If the block is the same, then the recycle_count is incremented. The trust threshold sets the limit of how many times a block should be identical (i.e. the code has not been modified) before this comparison can be omitted. Thus when prefetching, we can also set the recycle_count to control whether this comparison takes place. When the trust threshold is less than 1, the recycle_count has no effect.
-
This API does not change the trust threshold as it is a global setting which affects all Stalker sessions running on all threads.
-
It is inadvisable to prefetch self-modifying code blocks, since it will mean a single static instrumented block will always be used when it is executed. The detection of self-modifying code in the child is left to the user, just as the user is free to choose which blocks to prefetch by calling the API. It may also be helpful to avoid sending the same block address to be prefetched to the parent multiple times to reduce I/O required via IPC, particularly if the same block is executed multiple times. If you are fuzzing self-modifying code, then your day is probably already going badly.
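The interaction between recycle_count and the trust threshold described in the caveats above can be illustrated with a toy model. This is a minimal sketch and not Stalker's implementation: ToyBlock, toy_prefetch() and toy_lookup() are hypothetical names invented for illustration, and the model covers only the logic the text describes (a snapshot of the original bytes, a counter incremented on each successful comparison, and a threshold beyond which the comparison is skipped).

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_BLOCK 16

/* Hypothetical stand-in for a cached instrumented block. */
typedef struct {
  const unsigned char *code;              /* live code the block was built from */
  unsigned char snapshot[TOY_MAX_BLOCK];  /* copy of the original bytes */
  size_t size;
  int recycle_count;
} ToyBlock;

typedef enum {
  TOY_REUSED_TRUSTED,   /* reused without comparing bytes */
  TOY_REUSED_COMPARED,  /* reused after the bytes compared equal */
  TOY_RECOMPILED        /* bytes differed, block rebuilt */
} ToyLookup;

/* Models prefetching: snapshot the code and seed the recycle count. */
static void
toy_prefetch (ToyBlock *block, const unsigned char *code, size_t size,
    int recycle_count)
{
  block->code = code;
  block->size = (size < TOY_MAX_BLOCK) ? size : TOY_MAX_BLOCK;
  memcpy (block->snapshot, code, block->size);
  block->recycle_count = recycle_count;
}

/* Models recalling a block from the cache under a given trust threshold. */
static ToyLookup
toy_lookup (ToyBlock *block, int trust_threshold)
{
  /* -1 never trusts; 0 trusts immediately; n trusts after n matches. */
  if (trust_threshold >= 0 && block->recycle_count >= trust_threshold)
    return TOY_REUSED_TRUSTED;

  if (memcmp (block->code, block->snapshot, block->size) == 0)
  {
    block->recycle_count++;
    return TOY_REUSED_COMPARED;
  }

  /* Self-modifying code detected: rebuild and start counting again. */
  memcpy (block->snapshot, block->code, block->size);
  block->recycle_count = 0;
  return TOY_RECOMPILED;
}
```

In this model, a block prefetched with a recycle_count equal to a threshold of 2 is trusted on its first lookup, whereas one prefetched with 0 is compared twice first. Once a block is trusted, later changes to the code go unnoticed, which is the reason prefetching self-modifying code is inadvisable.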
The following is provided as an example workflow for initializing a fork server based fuzzer:
p -> setup IPC mechanism with child (e.g. pipe)
p -> create custom Transformer to send address of instrumented block to
     parent via IPC. Transformer should be inert until latched. Callouts
     should still be generated as required when not latched, but should
     themselves be inert until latched.
p -> gum_stalker_follow_me ()
p -> gum_stalker_deactivate ()

BEGIN LOOP:

p -> fork ()
p -> waitpid ()

c -> set latch to trigger Transformer (note that this affects only the
     child process).
c -> gum_stalker_activate (activation_target)
c -> activation_target ()
c ->

p -> gum_stalker_set_trust_threshold (0)
p -> gum_stalker_prefetch (x) (n times for each)
p -> gum_stalker_set_trust_threshold (n)

END LOOP:
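One iteration of the workflow above can be sketched with plain POSIX calls. This is a hedged illustration rather than a Frida harness: the Gum calls appear only as comments at the points where the workflow names them, and report_block_once(), demo_child() and run_child_and_collect() are hypothetical helpers invented here. Addresses travel over a pipe as raw uintptr_t values, and the child skips addresses it has already reported, as the caveats suggest. One deliberate deviation: the parent drains the pipe until end-of-file before calling waitpid(), since a child that fills the pipe buffer would otherwise block while the parent waits.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_SEEN 1024

/* Child-side table of addresses already reported (hypothetical helper). */
static uintptr_t seen[MAX_SEEN];
static size_t seen_count = 0;

/* Sends an address to the parent unless it was already sent.
 * Returns 1 if sent, 0 if skipped as a duplicate, -1 on write failure. */
static int
report_block_once (int write_fd, uintptr_t address)
{
  for (size_t i = 0; i < seen_count; i++)
    if (seen[i] == address)
      return 0;
  if (seen_count < MAX_SEEN)
    seen[seen_count++] = address;
  if (write (write_fd, &address, sizeof address) != (ssize_t) sizeof address)
    return -1;
  return 1;
}

/* Stand-in for the code under test: pretends the latched Transformer
 * saw three blocks, one of them twice. The addresses are made up. */
static void
demo_child (int write_fd)
{
  report_block_once (write_fd, 0x1000);
  report_block_once (write_fd, 0x2000);
  report_block_once (write_fd, 0x1000);
}

/* Forks one child, runs child_body in it, and collects up to `capacity`
 * reported addresses into `out`. Returns the count, or -1 on failure. */
static long
run_child_and_collect (void (*child_body) (int write_fd), uintptr_t *out,
    size_t capacity)
{
  int fds[2];
  if (pipe (fds) != 0)
    return -1;

  pid_t pid = fork ();
  if (pid < 0)
  {
    close (fds[0]);
    close (fds[1]);
    return -1;
  }

  if (pid == 0)
  {
    close (fds[0]);
    /* Real harness: set the latch, gum_stalker_activate (activation_target),
     * then activation_target (), before running the target. */
    child_body (fds[1]);
    _exit (0);
  }

  close (fds[1]);
  long count = 0;
  uintptr_t address;
  /* Drain until EOF; each write is smaller than PIPE_BUF and so atomic. */
  while (read (fds[0], &address, sizeof address) == (ssize_t) sizeof address)
    if ((size_t) count < capacity)
      out[count++] = address;
  close (fds[0]);

  int status;
  if (waitpid (pid, &status, 0) < 0)
    return -1;

  /* Real harness: the parent would now call gum_stalker_prefetch ()
   * for each collected address before forking the next child. */
  return count;
}
```

A parent loop would call run_child_and_collect() once per iteration. Because the seen table is only modified in the child, each freshly forked child starts from the parent's copy of it.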