Lin
is a base module for specifying sequential consistency tests.Lin_domain
exposes a functor that allows to test a library under parallel usage (with domains).Lin_thread
exposes a functor that allows to test a library under concurrent usage (with threads).Lin_effect
exposes a functor that allows to test a library under concurrent usage (with effects).qcheck-lin
?qcheck-lin
is a testing library based on QCheck
for sequential consistency.
A parallel or concurrent program is said to be sequentially consistent if "the result of any execution is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program."1
According to a library description, qcheck-lin
generates random programs using the functionalities of this library and runs them, records the results and checks whether the observed results are linearizable by reconciling them with a sequential execution.
qcheck-lin
offers an embedded domain specific language to easily describe signatures succinctly. It provides three types of tests:
Suppose we want to implement a small mutable set library, our main focus being to have a constant time cardinal
operation. We will be using Stdlib
.Set for the content, keeping track of the cardinality when adding and removing elements.
Of course, we will be using qcheck-lin
for testing!
Our library reads like that:
module type S = sig
type elt
type t
val empty : unit -> t
val mem : elt -> t -> bool
val add : elt -> t -> unit
val remove : elt -> t -> unit
val cardinal : t -> int
end
module Lib : sig
module Make : functor (Ord : Set.OrderedType) -> S with type elt = Ord.t
end = struct
module Make (Ord : Set.OrderedType) = struct
module S = Set.Make (Ord)
type elt = Ord.t
type t = { mutable content : S.t; mutable cardinal : int }
let empty () = { content = S.empty; cardinal = 0 }
let mem a t = S.mem a t.content
let add a t =
if not (mem a t) then begin
t.content <- S.add a t.content;
t.cardinal <- t.cardinal + 1
end
let remove a t =
if mem a t then (
t.content <- S.remove a t.content;
t.cardinal <- t.cardinal - 1)
let cardinal t = t.cardinal
end
end
In order to test it for sequential consistency, qcheck-lin
needs a lightweight specification of our library's interface. This specification takes the form of a module matching the Spec
signature. Then qcheck-lin
does all the heavy lifting for us!
This specification exposes one type and three values:
t
which is the main type of our library, here the mutable set.init
that tells qcheck-lin
how to create an initial value of type t
.cleanup
that tells qcheck-lin
how to clean up after the tests (which is necessary when t
uses resources that must be released, such as opened files, network connections, etc.).api
which is a list of the library's functions we want to include in the tests. These functions are encoded using a custom embedded domain specific language.open Lin
module LibInt = Lib.Make (Int)
module Spec : Spec = struct
type t = LibInt.t
let init = LibInt.empty
let cleanup _ = ()
let api =
let int = nat_small in
[
val_ "mem" LibInt.mem (int @-> t @-> returning bool);
val_ "add" LibInt.add (int @-> t @-> returning unit);
val_ "remove" LibInt.remove (int @-> t @-> returning unit);
val_ "cardinal" LibInt.cardinal (t @-> returning int);
]
end
Let's have a closer look at the api
value. This is where qcheck-lin
gets the information about the functions we want to include in our tests. api
is a list of values. These values should be constructed using either the val_
function or the val_freq
one.
In the example, we use val_
, but let's first describe val_freq
.
val_freq
takes four arguments:
int
that is a weight used by QCheck
for generation. It allows to skew the distribution of the function in the generated programs.string
so that it can be printed in the output.val_
is a specialization of val_freq
, giving the same weight to all the elements.
Note that the domain specific language brings some static guarantees about the type encoding. If we make a mistake, we will know at compile time.
Now we are set to run our first qcheck-lin
tests!
We will be testing our library for parallel usage. The functor Lin_domain.Make
takes the Spec
as argument and exposes two functions:
lin_test
to build a positive QCheck
test,neg_lin_test
to build a negative QCheck
test.Here, we expect the test to succeed, so we will use the first one.
module LibDomain = Lin_domain.Make (Spec)
let _ =
QCheck_base_runner.run_tests ~verbose:true
[ LibDomain.lin_test ~count:1000 ~name:"Lin parallel tests" ]
And the test fails...
$ dune exec ./mutable_set.exe
random seed: 429006728
generated error fail pass / total time test name
[✗] 1 0 1 0 / 1000 1.9s Lin parallel tests
--- Failure --------------------------------------------------------------------
Test Lin parallel tests failed (41 shrink steps):
|
|
.---------------------.
| |
add 0 t add 0 t
cardinal t
+++ Messages ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Messages for test Lin parallel tests:
Results incompatible with sequential execution
|
|
.------------------------------------.
| |
add 0 t : () add 0 t : ()
cardinal t : 2
================================================================================
failure (1 tests failed, 0 tests errored, ran 1 tests)
In the case of a failing test, qcheck-lin
prints the counterexample it has found, after some shrinking steps, as is customary in property-based testing à la QuickCheck. The counterexample is a program with a sequential prefix and two parallel suffixes. The counterexample is given twice: the first time with just the function calls; and the second time, each call is paired with its result.
Here, the counterexample is composed of an empty sequential prefix and two spawned processes. The first one adds 0 to the set, while the second one also adds 0 to the set and then asks for the cardinality of the said set.
In terms of sequential interleaving, there are only three possibilities. If we prefix the function call by Left
and Right
depending on the process in which they are, those sequential interleavings are:
1. Left.add 0 t; Right.add 0 t; Right.cardinal t
2. Right.add 0 t; Left.add 0 t; Right.cardinal t
3. Right.add 0 t; Right.cardinal t; Left.add 0 t
None of them can explain the value 2 returned by cardinal t
.
What happened is that both calls to add
have checked whether 0
was already an element of the set at the same time (or at least before the other one had a chance to add it). It was not. So, according to our implementation, both calls added the element and incremented the cardinal
field.
One way to make our library safe for parallel usage is to protect the set with a mutex.
module type S = sig
type elt
type t
val empty : unit -> t
val mem : elt -> t -> bool
val add : elt -> t -> unit
val remove : elt -> t -> unit
val cardinal : t -> int
end
module Lib : sig
module Make : functor (Ord : Set.OrderedType) -> S with type elt = Ord.t
end = struct
module Make (Ord : Set.OrderedType) = struct
module S = Set.Make (Ord)
type elt = Ord.t
type t = { mutable content : S.t; mutable cardinal : int; mutex : Mutex.t }
let empty () = { content = S.empty; cardinal = 0; mutex = Mutex.create () }
let mem_non_lock a t = S.mem a t.content
let mem a t =
Mutex.lock t.mutex;
let b = S.mem a t.content in
Mutex.unlock t.mutex;
b
let add a t =
Mutex.lock t.mutex;
if not (mem_non_lock a t) then being
t.content <- S.add a t.content;
t.cardinal <- t.cardinal + 1
end;
Mutex.unlock t.mutex
let remove a t =
Mutex.lock t.mutex;
if mem_non_lock a t then begin
t.content <- S.remove a t.content;
t.cardinal <- t.cardinal - 1
end;
Mutex.unlock t.mutex
let cardinal t =
Mutex.lock t.mutex;
let c = t.cardinal in
Mutex.unlock t.mutex;
c
end
end
Once this is done, we can use exactly the same specification and tests. The successful output looks like the following:
$ dune exec ./mutable_set_lock.exe
random seed: 162610433
generated error fail pass / total time test name
[✓] 1000 0 0 1000 / 1000 48.0s Lin parallel tests
================================================================================
success (ran 1 tests)
Lin
in a bit more detailUnderneath the hood Lin
uses QCheck and OCaml's pseudo-random number generator from the Random
module to generate arbitrary cmd
sequences and arbitrary input argument data to each call. To recreate a problematic test run, one therefore needs to generate the same pseudo-random test case input, by passing the same randomness seed. By running the Lin
tests using QCheck_base_runner.run_tests_main
from QCheck, it is possible to pass a seed as a command line argument as follows:
$ dune exec ./mutable_set.exe -- -s 429006728
Despite generating and thus running the same test case input, one may still experience different behaviours on subsequent reruns of the resulting test, because of CPU scheduling and other factors. This may materialize as different counterexamples being printed or as one run failing the test whereas another run passes it. Lin_domain
uses the Util.repeat
combinator to repeat each test case 50 times to address the issue and help increase reproducibility.
Lin
comes with a number of limitations which we plan to address in future releases. Currently Spec.api
descriptions:
Spec.t
- namely the one resulting from Spec.init
,Spec.t
with other type combinators such as Lin.list
and Lin.option
- this restriction is expressed with the Lin.noncombinable
type parameter in Lin.t
,Lin.int_bound
as both an argument type and as a result type - you can use Lin.int
for the latter,t1 @-> t2
,t1 * t2
.The later two can however be addressed by writing a custom argument generator using Lin.gen
.
1. Lamport, How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Program, 1979, DOI: 10.1109/TC.1979.1675439