pw_perf_test#
Pigweed’s perf test module provides an easy way to measure performance on any test setup. By using an API similar to GoogleTest, this module aims to bring a comprehensive and intuitive testing framework to our users, much like pw_unit_test.
Warning
The PW_PERF_TEST macro is still under construction and should not be relied upon yet.
Perf Test Interface#
The user experience of writing a performance test is intended to be as frictionless as possible. The module is aimed at micro-benchmarking code, so writing a performance test is as easy as:
void TestFunction(::pw::perf_test::State& state) {
  // Space to create any needed variables.
  while (state.KeepRunning()) {
    // Code to measure here.
  }
}
PW_PERF_TEST(PerformanceTestName, TestFunction);
However, it is recommended to read this guide so that you can write tests suited to your platform and to the type of code you are trying to benchmark.
State#
Within the testing framework, the state object is responsible for calling the timing interface and keeping track of testing iterations. It contains only one publicly accessible function, since the object is intended for internal use only. The KeepRunning() function collects timestamps to measure the code and ensures that only a certain number of iterations are run. To use the state object properly, pass it as an argument of the test function and use the KeepRunning() function as the condition of a while() loop. The code to be measured should therefore be in the body of the while() loop, like so:
// The State object is injected into a performance test by including it as an
// argument to the function.
void TestFunction(::pw::perf_test::State& state_obj) {
  while (state_obj.KeepRunning()) {
    /*
      Code to be measured here
    */
  }
}
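To make the loop contract concrete, here is a minimal, self-contained sketch of how a KeepRunning()-style function could count iterations and accumulate timestamps. It is not Pigweed's implementation: SketchState, CountLoopBodies, the fixed iteration limit, and the use of std::chrono::steady_clock are all assumptions made for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for pw::perf_test::State; not Pigweed's implementation.
class SketchState {
 public:
  explicit SketchState(int max_iterations) : max_iterations_(max_iterations) {
    assert(max_iterations > 0);
  }

  // Returns true while more iterations should run. Every call after the first
  // closes out the previous iteration by reading the clock.
  bool KeepRunning() {
    const auto now = std::chrono::steady_clock::now();
    if (started_) {
      total_ns_ += std::chrono::duration_cast<std::chrono::nanoseconds>(
                       now - iteration_start_)
                       .count();
      ++completed_;
    }
    started_ = true;
    if (completed_ >= max_iterations_) {
      return false;
    }
    iteration_start_ = std::chrono::steady_clock::now();
    return true;
  }

  int completed() const { return completed_; }
  std::int64_t total_ns() const { return total_ns_; }

 private:
  int max_iterations_;
  int completed_ = 0;
  bool started_ = false;
  std::int64_t total_ns_ = 0;
  std::chrono::steady_clock::time_point iteration_start_{};
};

// Runs an empty workload and returns how many times the loop body executed.
int CountLoopBodies(int max_iterations) {
  SketchState state(max_iterations);
  int bodies = 0;
  while (state.KeepRunning()) {
    ++bodies;  // Code to be measured would go here.
  }
  return bodies;
}
```

Because the timestamps are taken inside KeepRunning(), anything placed before the while() loop is setup and is not measured in this sketch.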
Macro Interface#
The test collection and registration process is done by a macro, much like pw_unit_test.
-
PW_PERF_TEST(test_name, test_function, ...)#
Registers a performance test. Any additional arguments are passed to the test function.
-
PW_PERF_TEST_SIMPLE(test_name, test_function, ...)#
Like PW_PERF_TEST, this macro registers a performance test. However, the test function does not need to take a state object. Internally, this macro runs the whole input function inside its own state loop. Any additional arguments are passed to the function being tested.
// Declare performance test functions.
// The first argument is the state, which is passed in by the test framework.
void TestFunction(pw::perf_test::State& state) {
  // Test setup code.
  Items a[] = {1, 2, 3};
  // Loop on KeepRunning(), similar to Fuchsia's Perftest.
  while (state.KeepRunning()) {
    // Code under test, run for multiple iterations.
    DoStuffToItems(a);
  }
}
void TestFunctionWithArgs(pw::perf_test::State& state, int arg1, bool arg2) {
  // Test setup code.
  Thing object_created_outside(arg1);
  while (state.KeepRunning()) {
    // Code under test, run for multiple iterations.
    object_created_outside.Do(arg2);
  }
}
// Tests are declared with any callable object. This is similar to Benchmark's
// BENCHMARK_CAPTURE() macro.
PW_PERF_TEST(Name1, [](pw::perf_test::State& state) {
  // Forward the state so that the wrapped function can run its own loop.
  TestFunctionWithArgs(state, 1, false);
});
PW_PERF_TEST(Name2, TestFunctionWithArgs, 1, true);
PW_PERF_TEST(Name3, TestFunctionWithArgs, 2, false);
// A plain function with no state argument; it must return int to return a + b.
int Sum(int a, int b) {
  return a + b;
}
PW_PERF_TEST_SIMPLE(SimpleExample, Sum, 4, 2);
PW_PERF_TEST_SIMPLE(Name4, MyExistingFunction, "input");
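The description of PW_PERF_TEST_SIMPLE says the input function is run inside a state loop owned by the macro. The sketch below shows one way such a wrapper could look. It is an illustration only: FakeState, RunInStateLoop, and CallsAfterRunning are hypothetical names, and the real macro may be implemented differently.

```cpp
#include <cassert>

// Hypothetical stand-in for the framework's state object.
class FakeState {
 public:
  explicit FakeState(int iterations) : remaining_(iterations) {
    assert(iterations >= 0);
  }
  bool KeepRunning() { return remaining_-- > 0; }

 private:
  int remaining_;
};

// Hypothetical helper: calls function(args...) once per iteration, so the
// function under test never has to see the state object.
template <typename Function, typename... Args>
void RunInStateLoop(FakeState& state, Function function, const Args&... args) {
  while (state.KeepRunning()) {
    function(args...);
  }
}

int g_calls = 0;  // Counts invocations so the sketch can be checked.

int Sum(int a, int b) {
  ++g_calls;
  return a + b;
}

// Runs Sum(4, 2) for the given number of iterations and reports the call count.
int CallsAfterRunning(int iterations) {
  g_calls = 0;
  FakeState state(iterations);
  RunInStateLoop(state, Sum, 4, 2);
  return g_calls;
}
```

Since the whole function call sits inside the loop in this sketch, any setup performed inside the function is measured too, which is the trade-off compared with writing the loop yourself.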
Warning
Internally, the testing framework stores the testing function as a function pointer. Therefore, the test function argument must be convertible to a function pointer.
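In standard C++, a lambda converts to a function pointer only if it captures nothing. The following sketch illustrates the consequence of the warning above using a hypothetical State struct, a hypothetical TestFunctionPointer alias, and a hypothetical Register() hook; none of these are Pigweed's actual types.

```cpp
#include <cassert>
#include <type_traits>

struct State {};  // Hypothetical stand-in for pw::perf_test::State.
using TestFunctionPointer = void (*)(State&);

// Hypothetical registration hook that stores only a plain function pointer.
TestFunctionPointer g_registered = nullptr;

void Register(TestFunctionPointer function) {
  assert(function != nullptr);
  g_registered = function;
}

// A captureless lambda converts implicitly, so this registration compiles.
bool RegisterCapturelessLambda() {
  Register([](State&) {});
  return g_registered != nullptr;
}

template <typename T>
constexpr bool kConvertsToTestFunction =
    std::is_convertible_v<T, TestFunctionPointer>;

// A capturing lambda carries state and has no conversion to a function
// pointer, so passing it to Register() would fail to compile.
bool CapturingLambdaConverts() {
  int offset = 1;
  auto capturing = [offset](State&) { (void)offset; };
  (void)capturing;
  return kConvertsToTestFunction<decltype(capturing)>;
}
```

Values that a capturing lambda would have carried can instead be passed as additional macro arguments, which the framework forwards to the test function.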
Event Handler#
The performance testing framework relies heavily on the member functions of EventHandler to report iterations, the beginning of tests, and other useful information. The EventHandler class is a virtual interface meant to be overridden, in order to provide flexibility in how data gets transferred.
-
class pw::perf_test::EventHandler#
Handles events from a performance test.
-
virtual void RunAllTestsStart(const TestRunInfo &summary)#
Called before all tests are run
-
virtual void RunAllTestsEnd()#
Called after all tests are run
-
virtual void TestCaseStart(const TestCase &info)#
Called when a new performance test is started
-
virtual void TestCaseIteration(const IterationResult &result)#
Called to output the results of an iteration
-
virtual void TestCaseEnd(const TestCase &info, const Results &end_result)#
Called after a performance test ends
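As an illustration of overriding this interface, the sketch below defines a handler that records events as text lines instead of logging them. It is self-contained and does not use Pigweed headers: the struct fields (total_tests, name, number, result, mean), the pure-virtual declarations, and the RecordingEventHandler class are all assumptions; only the method names and parameter types mirror the list above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins; the real pw::perf_test types may have other fields.
struct TestRunInfo { int total_tests; };
struct TestCase { const char* name; };
struct IterationResult { int number; long long result; };
struct Results { long long mean; };

// Mirrors the method names listed above; assumed pure virtual for the sketch.
class EventHandler {
 public:
  virtual ~EventHandler() = default;
  virtual void RunAllTestsStart(const TestRunInfo& summary) = 0;
  virtual void RunAllTestsEnd() = 0;
  virtual void TestCaseStart(const TestCase& info) = 0;
  virtual void TestCaseIteration(const IterationResult& result) = 0;
  virtual void TestCaseEnd(const TestCase& info, const Results& end_result) = 0;
};

// Stores one line of text per event, e.g. for later transfer over a link.
class RecordingEventHandler : public EventHandler {
 public:
  void RunAllTestsStart(const TestRunInfo& summary) override {
    lines_.push_back("run start: " + std::to_string(summary.total_tests));
  }
  void RunAllTestsEnd() override { lines_.push_back("run end"); }
  void TestCaseStart(const TestCase& info) override {
    assert(info.name != nullptr);
    lines_.push_back(std::string("case start: ") + info.name);
  }
  void TestCaseIteration(const IterationResult& result) override {
    lines_.push_back("iteration " + std::to_string(result.number) + ": " +
                     std::to_string(result.result));
  }
  void TestCaseEnd(const TestCase& info, const Results& end_result) override {
    lines_.push_back(std::string("case end: ") + info.name +
                     " mean=" + std::to_string(end_result.mean));
  }
  const std::vector<std::string>& lines() const { return lines_; }

 private:
  std::vector<std::string> lines_;
};

// Drives the handler through the event order described above for one test.
std::vector<std::string> RecordOneTest() {
  RecordingEventHandler handler;
  handler.RunAllTestsStart({1});
  const TestCase test{"Sum"};
  handler.TestCaseStart(test);
  handler.TestCaseIteration({1, 120});
  handler.TestCaseEnd(test, {120});
  handler.RunAllTestsEnd();
  return handler.lines();
}
```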
Logging Event Handler#
The default method of running performance tests is the LoggingEventHandler. This event handler only logs the test results to the console and nothing more. It was chosen as the default method due to its portability and to cut down on the time it would take to implement other printing log handlers. Make sure to set a pw_log backend.
Timing API#
To provide meaningful performance timings for functions, events, and so on, the module needs a timing interface built for its testing needs. The timing API meets these needs by recording either clock cycles or elapsed time.
Time-Based Measurement#
For most host applications, pw_perf_test depends on pw_chrono for its timing needs. At the moment, the interface will only measure performance in terms of nanoseconds. To see more information about how pw_chrono works, see the module documentation.
Cycle Count Measurement#
When running tests on an embedded system, clock cycles may give more insight into the actual performance of the system. The timing API gives you this option by providing time measurements through a facade. In this case, by setting the CYCCNT (cycle counter) timer as the backend, perf tests can be measured in clock cycles on ARM Cortex devices.
This implementation directly accesses the registers of the Cortex core, and therefore needs no operating system to function. This is achieved by enabling the DWT unit through the DEMCR register. While this provides cycle counts directly from the CPU, it is vulnerable to rollover when the duration of a test exceeds 2^32 clock cycles. This works out to a limit of roughly 43 seconds per iteration at 100 MHz.
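The 43-second figure follows from dividing 2^32 cycles by the clock frequency. The sketch below works through that arithmetic and also shows that unsigned 32-bit subtraction still gives the correct elapsed count across a single wrap. The function names are hypothetical and this is not the backend's code.

```cpp
#include <cassert>
#include <cstdint>

// Seconds until a 32-bit cycle counter wraps at the given clock frequency.
double SecondsUntilRollover(double clock_hz) {
  assert(clock_hz > 0.0);
  return 4294967296.0 / clock_hz;  // 2^32 cycles divided by cycles per second.
}

// Elapsed cycles between two reads of a 32-bit counter. Unsigned subtraction
// is modulo 2^32, so the result is correct if the counter wrapped at most once
// and fewer than 2^32 cycles elapsed between the reads.
constexpr std::uint32_t ElapsedCycles(std::uint32_t start, std::uint32_t end) {
  return end - start;
}

// 16 cycles before the wrap plus 16 cycles after it is 32 cycles in total.
static_assert(ElapsedCycles(0xFFFFFFF0u, 0x00000010u) == 32u,
              "a single wrap must not corrupt the difference");
```

At 100 MHz, SecondsUntilRollover(100e6) evaluates to about 42.9 seconds. An iteration longer than that wraps the counter more than once, and the difference can no longer be recovered from two reads alone.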
Warning
The interface only measures raw clock cycles and does not take into account other possible sources of pollution, such as load-store unit (LSU) cycles, sleep cycles, and other counter registers. Read more on the DWT methods of counting instructions.
Build System Integration#
As of this moment, pw_perf_test provides build integration with Bazel and GN. Performance tests can be built in CMake, but must be built as regular executables.
While each build system has its own names for its variables, each test must configure an EventHandler by choosing an associated main() function, and it must configure a timing interface. At the moment, only a logging-based event handler exists, timing is only supported where pw_chrono is supported, and cycle counts are only supported on ARM Cortex-M series microcontrollers with a Data Watchpoint and Trace (DWT) unit.
GN#
To get tests building in GN, set the pw_perf_test_TIMER_INTERFACE_BACKEND variable to whichever implementation is necessary for timings. Next, set the pw_perf_test_MAIN_FUNCTION variable to the preferred event handler. Finally, use the pw_perf_test template to register your code.
import("$dir_pw_perf_test/perf_test.gni")
pw_perf_test("foo_perf_test") {
  sources = [ "foo_perf_test.cc" ]
}
Note
If you use pw_watch, the template is configured to build automatically with pw_watch. However, you will still need to add your test group to the pw_perf_tests group in the top-level BUILD.gn.
pw_perf_test template#
pw_perf_test defines a single perf test suite. It creates two sub-targets:
<target_name>: The test suite within a single binary. The test code is linked against the target set in the build arg pw_unit_test_MAIN.
<target_name>.lib: The test sources without pw_unit_test_MAIN.
Arguments
All GN executable arguments are accepted and forwarded to the underlying pw_executable.
enable_if: Boolean indicating whether the test should be built. If false, replaces the test with an empty target. Default true.
Example
import("$dir_pw_perf_test/perf_test.gni")
pw_perf_test("large_test") {
  sources = [ "large_test.cc" ]
  enable_if = device_has_1m_flash
}
Grouping#
For grouping tests, no special template is required. Simply create a basic GN group() and add each perf test as a dependency.
Example
import("$dir_pw_perf_test/perf_test.gni")
pw_perf_test("foo_test") {
  sources = [ "foo.cc" ]
}
pw_perf_test("bar_test") {
  sources = [ "bar.cc" ]
}
group("my_perf_tests_collection") {
  deps = [
    ":foo_test",
    ":bar_test",
  ]
}
Running#
To run perf tests built with GN, locate the associated binaries in the out directory and run or flash them manually.
Bazel#
Bazel is a very efficient build system for running tests on host, needing minimal setup to get tests running. To configure the timing interface, set the pw_perf_test_timer_backend variable to the preferred method of timekeeping. Right now, only the logging event handler is supported for Bazel.
Template#
To use the pw_cc_perf_test() template, load it from //pw_build:pigweed.bzl.
Arguments
All Bazel executable arguments are accepted and forwarded to the underlying native.cc_binary.
Example
load(
    "//pw_build:pigweed.bzl",
    "pw_cc_perf_test",
)
pw_cc_perf_test(
    name = "foo_test",
    srcs = ["foo_perf_test.cc"],
)
Running#
Running tests in Bazel is like running any other program. Use the default bazel run command: bazel run //path/to:target.