Skip to content
tassi edited this page Apr 23, 2015 · 9 revisions

= Status quo = Systems currently running some sort of benchmarking, regression testing for Coq.

|| label || URL || access || tests what || comment || || lix-bench || http://www.lix.polytechnique.fr/coq/bench/ || lix || contribs (from git) || || || lix-newbench || http://www.lix.polytechnique.fr/coq/pylons || lix || - || 404 || || inria-ci || https://ci.inria.fr/coq/ || inria || coq (from git) || || || opam-bench || http://coq-bench.github.io/ || jobs on gihub, infrastructure ??? || opam packages || ||

= Problems = The problems we have w.r.t. regression testing, that apply to one or many

  1. 4 systems, overlap, which is the reference? Maybe 5, wasn't pyrolis intended to be the bench system?
  2. access: we need to be able to access the compilation logs
  3. access: we need to be able to fix the bench system, at least for trivial errors (like it picks the wrong branch, coq_makefile refresh)
  4. reproducibility: understand why it fails (access to logs, maybe more), replicate on more hardware
  5. too slow (wait 24h for a run)
  6. are the tests representative?
  7. only 1 big test (all contribs), all or nothing
  8. test a personal experiment without impacting the other developers
  9. single point of failure (only 1 gatekeeper)
Clone this wiki locally