pw_fuzzer: Adding Fuzzers Using LibFuzzer

Note

libFuzzer is currently supported only on Linux and macOS, and only when building with Clang.

Step 0: Set up libFuzzer for your project

Note

This workflow only needs to be done once for a project.

libFuzzer is an LLVM compiler runtime and should be included with your Clang installation. To use it, you only need to define a suitable toolchain.

For GN, use pw_toolchain_host_clang, or derive a new toolchain from it. For example:

import("$dir_pw_toolchain/host/target_toolchains.gni")

my_toolchains = {
  ...
  clang_fuzz = {
    name = "my_clang_fuzz"
    forward_variables_from(pw_toolchain_host.clang_fuzz, "*", ["name"])
  }
  ...
}

LibFuzzer-style fuzzers are not currently supported by Pigweed when using CMake.

For Bazel, include rules_fuzzing and its Abseil C++ dependency in your WORKSPACE file. For example:

# Required by: rules_fuzzing.
http_archive(
    name = "com_google_absl",
    sha256 = "3ea49a7d97421b88a8c48a0de16c16048e17725c7ec0f1d3ea2683a2a75adc21",
    strip_prefix = "abseil-cpp-20230125.0",
    urls = ["https://github.com/abseil/abseil-cpp/archive/refs/tags/20230125.0.tar.gz"],
)

# Set up rules for fuzz testing.
http_archive(
    name = "rules_fuzzing",
    sha256 = "d9002dd3cd6437017f08593124fdd1b13b3473c7b929ceb0e60d317cb9346118",
    strip_prefix = "rules_fuzzing-0.3.2",
    urls = ["https://github.com/bazelbuild/rules_fuzzing/archive/v0.3.2.zip"],
)

load("@rules_fuzzing//fuzzing:repositories.bzl", "rules_fuzzing_dependencies")

rules_fuzzing_dependencies()

load("@rules_fuzzing//fuzzing:init.bzl", "rules_fuzzing_init")

rules_fuzzing_init()

Then, define the following build configuration in your .bazelrc file:

build:asan-libfuzzer \
    --@rules_fuzzing//fuzzing:cc_engine=@rules_fuzzing//fuzzing/engines:libfuzzer
build:asan-libfuzzer \
    --@rules_fuzzing//fuzzing:cc_engine_instrumentation=libfuzzer
build:asan-libfuzzer --@rules_fuzzing//fuzzing:cc_engine_sanitizer=asan

Step 1: Write a fuzz target function

To write a fuzzer, write a fuzz target function that follows the guidelines given by libFuzzer:

#include <cstddef>
#include <cstdint>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  // Placeholder for a call into the code under test.
  DoSomethingInterestingWithMyAPI(data, size);
  return 0;  // Non-zero return values are reserved for future use.
}

When writing your fuzz target function, you may want to consider:

  • It is acceptable to return early if the input doesn’t meet some constraints, e.g. it is too short.

  • If your fuzzer accepts data with a well-defined format, you can bootstrap coverage by crafting examples and adding them to a corpus.

  • There are tools to split a fuzzing input into multiple fields if needed; the FuzzedDataProvider is particularly easy to use.

  • If your code acts on “transformed” inputs, such as encoded or compressed inputs, you may want to try structure aware fuzzing.

  • You can do startup initialization if you need to.

  • If your code is non-deterministic or uses checksums, you may want to disable those behaviors only when fuzzing by checking LLVM’s FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION macro.
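Several of these considerations can be combined in one fuzz target. The following is a minimal sketch, not code from Pigweed: the packet layout, HandlePacket, XorChecksum, kMinPacketSize, and g_initialized are all hypothetical names invented for illustration. It assumes the build defines FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION for fuzzing builds, and it uses libFuzzer's optional LLVMFuzzerInitialize hook for startup initialization.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical packet layout: one checksum byte (the XOR of all payload
// bytes) followed by at least one payload byte.
constexpr size_t kMinPacketSize = 2;

// Hypothetical state that must be set up once before any input is processed.
bool g_initialized = false;

uint8_t XorChecksum(const uint8_t* payload, size_t size) {
  uint8_t sum = 0;
  for (size_t i = 0; i < size; ++i) {
    sum ^= payload[i];
  }
  return sum;
}

// Hypothetical code under test. Returns true if the packet was accepted.
bool HandlePacket(const uint8_t* data, size_t size) {
  if (size < kMinPacketSize) {
    return false;
  }
#ifndef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
  // Enforce the checksum only outside fuzzing builds; random mutations
  // rarely produce a matching checksum, which would hide the code behind it.
  if (data[0] != XorChecksum(data + 1, size - 1)) {
    return false;
  }
#endif
  // A real implementation would process the payload here.
  return true;
}

// Optional libFuzzer hook, called once before any inputs are run.
extern "C" int LLVMFuzzerInitialize(int* /*argc*/, char*** /*argv*/) {
  g_initialized = true;  // Stand-in for real one-time setup.
  return 0;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  if (size < kMinPacketSize) {
    return 0;  // Early return: the input is too short to be a packet.
  }
  HandlePacket(data, size);
  return 0;
}
```

In this sketch the early return keeps the fuzzer from spending time on inputs the code under test would reject immediately, while the preprocessor guard lets mutated inputs reach the logic that sits behind the checksum.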

Step 2: Add the fuzzer to your build

To build a fuzzer, do the following:

For GN, add a target to the module using the pw_fuzzer GN template. To limit when the generated unit test is run, set enable_test_if in the same manner as enable_if for pw_test:

# In $dir_my_module/BUILD.gn
import("$dir_pw_fuzzer/fuzzer.gni")

pw_fuzzer("my_fuzzer") {
  sources = [ "my_fuzzer.cc" ]
  deps = [ ":my_lib" ]
  enable_test_if = device_has_1m_flash
}

Add the fuzzer GN target to the module’s group of fuzzers. Create this group if it does not exist.

# In $dir_my_module/BUILD.gn
group("fuzzers") {
  deps = [
    ...
    ":my_fuzzer",
  ]
}

Make sure this group is referenced from a top-level fuzzers target in your project, with the appropriate fuzzing toolchain. For example:

# In //BUILD.gn
group("fuzzers") {
  deps = [
    ...
    "$dir_my_module:fuzzers(//my_toolchains:host_clang_fuzz)",
  ]
}

LibFuzzer-style fuzzers are not currently supported by Pigweed when using CMake.

Add a Bazel target to the module using the pw_cc_fuzz_test rule. For example:

# In $dir_my_module/BUILD.bazel
pw_cc_fuzz_test(
    name = "my_fuzzer",
    srcs = ["my_fuzzer.cc"],
    deps = [":my_lib"],
)

Step 3: Add the fuzzer unit test to your build

Pigweed automatically generates unit tests for libFuzzer-based fuzzers in some build systems.

For GN, the generated unit test is suffixed with _test and must be added to the module’s test group. This test verifies that the fuzzer can build and run, even when it is not built with a fuzzing toolchain. For example, for a fuzzer called my_fuzzer, add the following:

# In $dir_my_module/BUILD.gn
pw_test_group("tests") {
  tests = [
    ...
    ":my_fuzzer_test",
  ]
}

LibFuzzer-style fuzzers are not currently supported by Pigweed when using CMake.

Fuzzer unit tests are not generated for Pigweed’s Bazel build.
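To illustrate what running a fuzzer outside a fuzzing toolchain can mean, the following is a minimal sketch of a driver that calls the fuzz target on a few fixed inputs without libFuzzer. It is not the test that Pigweed generates; RunSmokeTest and the stand-in fuzz target body are hypothetical and exist only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Stand-in fuzz target so the sketch is self-contained. A real fuzzer would
// call the code under test here.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  // Read every byte so that each input is actually consumed.
  uint8_t sum = 0;
  for (size_t i = 0; i < size; ++i) {
    sum ^= data[i];
  }
  (void)sum;
  return 0;
}

// Hypothetical driver: feeds each fixed input to the fuzz target once and
// returns how many calls returned a non-zero value.
int RunSmokeTest(const std::vector<std::string>& inputs) {
  int failures = 0;
  for (const std::string& input : inputs) {
    const auto* bytes = reinterpret_cast<const uint8_t*>(input.data());
    if (LLVMFuzzerTestOneInput(bytes, input.size()) != 0) {
      ++failures;
    }
  }
  return failures;
}
```

A check of this kind only confirms that the fuzz target links and survives a handful of inputs. It does not replace running the fuzzer with coverage feedback.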

Step 4: Build the fuzzer

LibFuzzer-style fuzzers require the compiler to add instrumentation and runtimes when building.

For GN, select a sanitizer runtime. See the LLVM documentation for valid options.

$ gn gen out --args='pw_toolchain_SANITIZERS=["address"]'

Some toolchains may set a default for fuzzers if none is specified. For example, //targets/host:host_clang_fuzz defaults to “address”.

Build the fuzzers using ninja directly.

$ ninja -C out fuzzers

LibFuzzer-style fuzzers are not currently supported by Pigweed when using CMake.

For Bazel, specify the AddressSanitizer fuzzing toolchain with a --config flag when building fuzzers.

$ bazel build //my_module:my_fuzzer --config=asan-libfuzzer

Step 5: Running the fuzzer locally

For GN, the fuzzer binary is placed in a subdirectory of the output directory named for the toolchain. Additional libFuzzer options and corpus arguments can be passed on the command line. For example:

$ out/host_clang_fuzz/obj/my_module/bin/my_fuzzer -seed=1 path/to/corpus

Additional sanitizer flags may be passed using environment variables.

LibFuzzer-style fuzzers are not currently supported by Pigweed when using CMake.

For Bazel, specify the AddressSanitizer fuzzing toolchain with a --config flag when building and running fuzzers. Additional libFuzzer options and corpus arguments can be passed on the command line. For example:

$ bazel run //my_module:my_fuzzer --config=asan-libfuzzer -- \
  -seed=1 path/to/corpus

Running the fuzzer should produce output similar to the following:

INFO: Seed: 305325345
INFO: Loaded 1 modules   (46 inline 8-bit counters): 46 [0x38dfc0, 0x38dfee),
INFO: Loaded 1 PC tables (46 PCs): 46 [0x23aaf0,0x23add0),
INFO:        0 files found in corpus
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
INFO: A corpus is not provided, starting from an empty corpus
#2      INITED cov: 2 ft: 3 corp: 1/1b exec/s: 0 rss: 27Mb
#4      NEW    cov: 3 ft: 4 corp: 2/3b lim: 4 exec/s: 0 rss: 27Mb L: 2/2 MS: 2 ShuffleBytes-InsertByte-
#11     NEW    cov: 7 ft: 8 corp: 3/7b lim: 4 exec/s: 0 rss: 27Mb L: 4/4 MS: 2 EraseBytes-CrossOver-
#27     REDUCE cov: 7 ft: 8 corp: 3/6b lim: 4 exec/s: 0 rss: 27Mb L: 3/3 MS: 1 EraseBytes-
#29     REDUCE cov: 7 ft: 8 corp: 3/5b lim: 4 exec/s: 0 rss: 27Mb L: 2/2 MS: 2 ChangeBit-EraseBytes-
#445    REDUCE cov: 9 ft: 10 corp: 4/13b lim: 8 exec/s: 0 rss: 27Mb L: 8/8 MS: 1 InsertRepeatedBytes-
...
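The run above reports "0 files found in corpus" and starts from an empty corpus. A corpus directory holds one input per file, so seed inputs can be generated by a small helper program. The following is a minimal sketch under stated assumptions: Seed, WriteSeedCorpus, and the file names are hypothetical, and the contents of useful seeds depend entirely on the format your fuzz target expects.

```cpp
#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Hypothetical description of one seed input: a file name and its raw bytes.
struct Seed {
  std::string name;
  std::vector<uint8_t> bytes;
};

// Writes each seed to its own file under corpus_dir, creating the directory
// if needed. Returns the number of files written successfully.
size_t WriteSeedCorpus(const std::filesystem::path& corpus_dir,
                       const std::vector<Seed>& seeds) {
  std::filesystem::create_directories(corpus_dir);
  size_t written = 0;
  for (const Seed& seed : seeds) {
    std::ofstream out(corpus_dir / seed.name,
                      std::ios::binary | std::ios::trunc);
    out.write(reinterpret_cast<const char*>(seed.bytes.data()),
              static_cast<std::streamsize>(seed.bytes.size()));
    out.close();  // Flush so that write errors are reflected in the state.
    if (!out.fail()) {
      ++written;
    }
  }
  return written;
}
```

The resulting directory can then be passed in place of path/to/corpus in the commands shown earlier in this step.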