Property-Based Testing: Power and Limits
Property-based testing (PBT), popularized by QuickCheck in Haskell in the late 1990s, reverses the traditional approach to unit testing.
The Random Testing Revolution
Rather than writing specific examples with expected inputs and outputs, PBT invites us to express universal properties that the code must satisfy for any valid input. Then, we let the PBT library generate hundreds or thousands of random cases.
This approach often discovers bugs in edge cases a developer would never have thought of:
- Extreme values
- Empty strings
- Negative numbers
- Unusual combinations
The generator explores the input space far more thoroughly than our imagination would. An edge case caught in CI costs almost nothing; caught in production, it costs incident response, customer compensation, reputation damage, and post-mortem meetings.
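To make the idea concrete, here is a minimal sketch of the generate-and-check loop, written with only the Python standard library rather than a real PBT library. The names `check_property`, `gen_int_list`, and `buggy_max` are hypothetical and exist only for this illustration.

```python
import random

def check_property(prop, gen, runs=1000, seed=0):
    """Run `prop` on `runs` random inputs; return the first counterexample, or None."""
    rng = random.Random(seed)  # fixed seed so that a failure can be replayed
    for _ in range(runs):
        value = gen(rng)
        if not prop(value):
            return value
    return None

def gen_int_list(rng):
    """Hypothetical generator: lists of 0 to 20 integers, negatives included."""
    return [rng.randint(-100, 100) for _ in range(rng.randint(0, 20))]

def buggy_max(xs):
    """Deliberately buggy: starts from 0, so it is wrong for all-negative lists."""
    best = 0
    for x in xs:
        if x > best:
            best = x
    return best

def prop_max_is_an_element(xs):
    # Property: for any non-empty list, the maximum is one of its elements.
    return not xs or buggy_max(xs) in xs

counterexample = check_property(prop_max_is_an_element, gen_int_list)
print("counterexample:", counterexample)
```

An example-based test such as `buggy_max([1, 5, 3]) == 5` passes against this implementation; the random search is what stumbles on a list containing only negative numbers.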
Shrinking: The Key to Debuggability
The true power of PBT lies in shrinking: when a failing case is discovered, the framework automatically attempts to reduce it to the smallest reproducible example.
A 500-element array that fails a sort might become a 3-element array with a precise configuration. This minimization turns an obscure failure into a readable, debuggable test case, which can save hours of debugging per bug.
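The sketch below illustrates one simple way shrinking can work: a greedy search that keeps trying simpler inputs as long as they still fail. It is an assumption-laden toy, not how any particular framework implements it; `shrink_candidates`, `shrink`, and `buggy_sort` are hypothetical names.

```python
import random

def shrink_candidates(xs):
    """Yield inputs simpler than xs: shorter lists first, then smaller values."""
    for i in range(len(xs)):
        yield xs[:i] + xs[i + 1:]                     # drop one element
    for i, x in enumerate(xs):
        if x != 0:
            yield xs[:i] + [int(x / 2)] + xs[i + 1:]  # move one value toward 0

def shrink(failing_input, prop):
    """Greedily replace the input with the first simpler candidate that still fails."""
    current = failing_input
    while True:
        for candidate in shrink_candidates(current):
            if not prop(candidate):
                current = candidate
                break
        else:
            return current  # no simpler failing candidate: a local minimum

def buggy_sort(xs):
    """Deliberately buggy: going through a set silently drops duplicates."""
    return sorted(set(xs))

def prop_preserves_length(xs):
    return len(buggy_sort(xs)) == len(xs)

rng = random.Random(0)
big_failure = [rng.randint(-50, 50) for _ in range(200)]
assert not prop_preserves_length(big_failure)  # 200 draws from 101 values must repeat
minimal = shrink(big_failure, prop_preserves_length)
print(len(big_failure), "elements ->", minimal)
```

Here the 200-element failure shrinks to a two-element list holding the same value twice, which points straight at the duplicate-handling bug. The greedy search stops at a local minimum: the repeated value itself is not reduced further, because changing one copy at a time makes the failure disappear.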
PBT is also a formidable specification tool: formulating a property forces us to think about the code's fundamental invariants. "Sorting a list preserves its length", "encoding then decoding returns the original", "the operation is commutative": these assertions reveal the essence of expected behavior better than a collection of disparate examples.
The Challenge of Properties
Identifying good properties is often the main challenge. For a sorting function, the classic properties are well known:
- Idempotence: sorting an already sorted list leaves it unchanged
- Preservation of elements
- Ascending order of result
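The three properties above can be written down directly. The following sketch uses only the standard library; `my_sort` is a hypothetical stand-in for the function under test (here it simply delegates to the built-in `sorted`).

```python
import random
from collections import Counter

def my_sort(xs):
    """Stand-in for the function under test."""
    return sorted(xs)

def prop_idempotent(xs):
    # Sorting twice gives the same result as sorting once.
    return my_sort(my_sort(xs)) == my_sort(xs)

def prop_preserves_elements(xs):
    # Same elements with the same multiplicities, regardless of order.
    return Counter(my_sort(xs)) == Counter(xs)

def prop_ascending(xs):
    ys = my_sort(xs)
    return all(a <= b for a, b in zip(ys, ys[1:]))

rng = random.Random(42)
for _ in range(500):
    xs = [rng.randint(-1000, 1000) for _ in range(rng.randint(0, 30))]
    assert prop_idempotent(xs)
    assert prop_preserves_elements(xs)
    assert prop_ascending(xs)
```

Note that no single property is sufficient: a function that always returns an empty list satisfies idempotence and ascending order, and only the preservation property rejects it.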
But for complex business logic, what properties should we express? We sometimes fall into the trap of rewriting the implementation in the test, which verifies nothing at all.
The most useful properties are often:
- Round-trips: encode/decode, serialize/deserialize
- Structural invariants: "the tree remains balanced"
- Comparisons with an oracle: "my optimized version produces the same result as the naive version"
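As an illustration of the first and last patterns, here is a small sketch with a round-trip property and an oracle comparison. The run-length codec and the `contains_fast` / `contains_naive` pair are hypothetical examples chosen for brevity, not taken from any specific codebase.

```python
import bisect
import random

def rle_encode(s):
    """Run-length encode a string into (character, count) pairs."""
    out = []
    for ch in s:
        if out and out[-1][0] == ch:
            out[-1] = (ch, out[-1][1] + 1)
        else:
            out.append((ch, 1))
    return out

def rle_decode(pairs):
    return "".join(ch * n for ch, n in pairs)

def contains_fast(sorted_xs, x):
    """Optimized version: binary search in a sorted list."""
    i = bisect.bisect_left(sorted_xs, x)
    return i < len(sorted_xs) and sorted_xs[i] == x

def contains_naive(xs, x):
    """Oracle: obviously correct, linear scan."""
    return x in xs

rng = random.Random(7)
for _ in range(500):
    # Round-trip: decode(encode(s)) == s, on strings with long runs
    s = "".join(rng.choice("ab") for _ in range(rng.randint(0, 40)))
    assert rle_decode(rle_encode(s)) == s

    # Oracle: the fast version agrees with the naive one
    xs = sorted(rng.randint(0, 50) for _ in range(rng.randint(0, 30)))
    x = rng.randint(0, 50)
    assert contains_fast(xs, x) == contains_naive(xs, x)
```

Neither property restates the implementation: the round-trip only relates two functions to each other, and the oracle delegates correctness to code that is simple enough to trust.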
But not all functions lend themselves naturally to this style.
Generator Quality
Generator quality constitutes another significant limitation.
A uniform generator over integers will spend most of its time on "uninteresting" values and might miss the precise boundary value that triggers a bug. Good frameworks allow biasing generation toward edge cases, but this requires expertise and configuration effort.
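The following sketch compares a uniform integer generator with one biased toward boundary values. The choice of edge cases (0, 1, -1 and the 32-bit limits), the 20% bias, and the `abs32` function are all assumptions made for this illustration; `abs32` mimics two's-complement wrap-around, the behavior of Java's `Math.abs` on `Integer.MIN_VALUE`.

```python
import random

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
EDGE_CASES = [0, 1, -1, INT32_MIN, INT32_MAX]  # assumed "interesting" values

def gen_int_uniform(rng):
    return rng.randint(INT32_MIN, INT32_MAX)

def gen_int_biased(rng, edge_probability=0.2):
    """Return a boundary value some of the time, a uniform value otherwise."""
    if rng.random() < edge_probability:
        return rng.choice(EDGE_CASES)
    return rng.randint(INT32_MIN, INT32_MAX)

def abs32(x):
    """abs() on a wrapping 32-bit signed integer: abs32(INT32_MIN) is negative."""
    result = abs(x)
    return result - 2**32 if result > INT32_MAX else result

def count_failures(gen, runs=1000, seed=0):
    """Count inputs that violate the property 'abs32(x) >= 0'."""
    rng = random.Random(seed)
    return sum(1 for _ in range(runs) if abs32(gen(rng)) < 0)

uniform_failures = count_failures(gen_int_uniform)
biased_failures = count_failures(gen_int_biased)
print("uniform:", uniform_failures, "biased:", biased_failures)
```

Only one input out of 2^32 violates the property, so a uniform generator has roughly a one-in-four-million chance of hitting it within 1000 runs, while the biased generator picks it about 4% of the time.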
Moreover, for complex data structures with internal invariants (e.g., a valid binary search tree, a connected graph), writing a generator that produces only valid instances can prove as difficult as writing the tested code itself.
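One common way around this difficulty is to generate instances that are valid by construction instead of generating arbitrary ones and filtering. The sketch below does this for a binary search tree by narrowing the allowed key range at each level; `Node`, `gen_bst`, and `is_valid_bst` are hypothetical names, and the 30% stopping probability is an arbitrary choice.

```python
import random

class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def gen_bst(rng, lo=0, hi=100):
    """Generate a BST with distinct integer keys in [lo, hi], valid by construction."""
    if lo > hi or rng.random() < 0.3:
        return None  # empty subtree
    key = rng.randint(lo, hi)
    # Left keys are drawn below `key`, right keys above it.
    return Node(key, gen_bst(rng, lo, key - 1), gen_bst(rng, key + 1, hi))

def is_valid_bst(node, lo=float("-inf"), hi=float("inf")):
    """Independent check of the search-tree invariant."""
    if node is None:
        return True
    return (lo < node.key < hi
            and is_valid_bst(node.left, lo, node.key)
            and is_valid_bst(node.right, node.key, hi))

rng = random.Random(3)
trees = [gen_bst(rng) for _ in range(200)]
assert all(is_valid_bst(t) for t in trees)
```

The trade-off is that such a generator has its own distribution of shapes and sizes, which may under-represent some trees (deep or degenerate ones, for instance), and it needs the same care as the code it helps to test.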
Complementarity with Example-Based Tests
PBT doesn't replace example-based tests; it complements them.
Example-based tests remain valuable for:
- Documenting expected behavior in specific business scenarios
- Serving as a safety net during refactorings
- Cases where the property would be too complex to express
The pragmatic approach is to use PBT where it excels (pure functions, data transformations, algorithms with clear invariants) and keep classical tests for business logic with fuzzy boundaries.
Together, the two approaches offer far better coverage than either could achieve alone.
Learn More: Reference Libraries
- QuickCheck - Haskell, the pioneer
- fast-check - TypeScript/JavaScript
- Hypothesis - Python
- jqwik - Java
- ScalaCheck - Scala
- FsCheck - .NET (F#/C#)
- proptest - Rust
- rapid - Go