
Category Archives: StarExec

Our shipment of 160 nodes from HP for StarExec arrived this week.  Pictures below courtesy of Aran Cox:


Everything needs to be installed and configured, so it will be some time yet (4-6 weeks) until this is incorporated into StarExec.  At that point, we will finally throw open the doors to the public…


Last Monday (June 10th), Geoff Sutcliffe, Cesare Tinelli, and I conducted a workshop on our StarExec project.  This is a large NSF-funded project to build a web service to support the infrastructure needs of logic-solving communities.  There are a lot of such communities out there: SAT, SMT, TPTP, QBF, Termination, Confluence (CoCo), ASP, HMC, Nonclassical, and quite a few more.  These different communities have developed around particular fragments of logic, with particular algorithmic problems to solve.  Because Computational Logic is rather a dynamic field these days, largely due to applications in software analysis and verification, there are new subcommunities springing up pretty frequently.  To get your community going, the current best practice says you should:

  1. agree on a format for problems (“benchmarks”)
  2. start to build a benchmark library which can be used for comparing solvers
  3. run some kind of competition or, less aggressively, an “evaluation”
  4. establish some kind of meeting — maybe a workshop at a bigger conference — to provide a locus for (1)-(3).
  5. possibly set up a solver execution service like SystemOnTPTP or termcomp (or the now-retired SMT-EXEC), to let users run their own workloads.

There is interesting community-specific work to be done for all of these, of course.  But some parts of this process are rather generic and could in principle be done once and for all, for all communities.  That is the goal of StarExec.  We have a small compute cluster set up right now, with 32 compute nodes, and a web interface (and a browser-free command interface) to interact with it.  We have substantial funds to buy around 150 more compute nodes.  This month we have our first public events: SMT-EVAL 2013, which is an evaluation of SMT solvers; and CoCo 2013, which is a competition among confluence-checking tools for term-rewriting systems.

We are gradually making the system more available to the public.  There are still some significant features to be added this summer, and there are bugs we have not tracked down yet.  But if you are interested in checking out the system, you can log in as a guest here (though you cannot run jobs that way), or email me.  By the end of the summer we will hopefully be ready to advertise more broadly and let more people on the system — although with only 32 nodes, it is not going to be too exciting.  We are hoping to have the new hardware purchased and installed by the end of the calendar year.

The purpose of this post is to summarize a bit of the discussion at the workshop, for the benefit of those interested in StarExec (including our advisory board).  We got a lot of great feedback and discussion from the invited participants of the StarExec 2013 workshop (listed on the workshop web page).  I will quickly summarize some points I noted from the participants:

  • Daniel Le Berre (SAT): perhaps benchmark preprocessors (which can be invoked whenever a benchmark is uploaded) should be allowed to rewrite the benchmark, for example to normalize it.  This could also be useful for lightly scrambling benchmarks as is sometimes done, but then one would probably want the ability to preprocess benchmarks again, after they have been uploaded.  This would be a useful feature anyway, in my opinion.  Geoff pointed out this would be good for postprocessing job results, too (a community-wide script can be run on the solver output, to generate attributes in the database for that job pair).
  • Harald Zankl (Termination, Confluence): the experience of the Termination (of rewriting systems) community was that buying exotic hardware (e.g., with large numbers of cores or huge amounts of memory) is bad, because solver implementors are incentivized to tune to that hardware.  Harald said that within a couple of years, no one could run a termination checker on ordinary hardware like a laptop.  This seems like a strong argument against buying a small amount of exotic hardware.
  • Stefan Schulz (TPTP): UPS units are more trouble than they are worth, because they fail often.  I find this somewhat controversial, because we have so far had pretty good luck, and it was my understanding that including such units in racks for clusters was a best practice (in environments where power is not reliable).
  • Thomas Krennwallner (ASP, FLoC Olympic Games): we need to use numactl to assign the “memory node” affinity of processes, not just the core affinity.  Apparently, cores can access memory banks associated with other cores, and if this is misconfigured, it can hurt performance.  This was news to almost all of us there, I think, and something we have to look into.  Thomas also told us about “Linux control groups”, which allow one to control the resource usage of groups of processes.  This is great, because we are currently using a tool called runsolver for this.  Runsolver seems to work quite well, but does not handle some resource issues, like this memory-node affinity business, and, I believe, max disk usage.  If there is an OS-level solution, that is really great, so we will explore this, too.
  • Stefan: we should reserve a node or two for running a single short test job whenever a solver is uploaded.  That way, you do not have to sit and wait your turn to run your solver on the compute nodes, only to find that there is a configuration problem or platform issue and your solver won’t run.  It would be very helpful for solver uploaders to get a quick confirmation that the solver is able to run on StarExec.
  • Daniel: community leaders should be given the option to copy default spaces into a new user’s space.  This is so that as a new user, when you go to your space within a community, you immediately could have at least a sample of benchmarks and a couple solvers there, so you could start to use the system quickly.  In our current setup, a new user will need to copy some benchmarks and solvers from somewhere (or upload some) into his/her own space, in order to run a job.  This seems like a good idea, and well in the spirit of community-level parameterization that we are trying to support in the system.
  • Morgan Deters (SMT): showing some kind of activity log per user or community would help import some ideas of social networking into the site.
  • Geoff: we should support downloading XML for jobs.  Right now, you can download XML for spaces, and that XML will describe the space hierarchy, with its benchmarks and solvers.  We should be able to download and upload XML for jobs, so that you can tweak jobs by removing job pairs, or make other modifications.  I am only a little convinced by this one, because I have been using the system successfully (as a user) by just creating a space hierarchy (possibly by uploading space XML) to represent a workload that needs to be run.  To run it, I just create a job in that space.  I can rerun easily by creating a new job there.  So I am not sure why we need a separate mechanism for using XML to create a job.
  • Stefan: we should support the ability to use all cores on the nodes, in case exact timing does not matter too much and we just want to ram a large workload through the system.  Indeed, Nikolaj Bjorner (SMT) was telling us that on their internal cluster at Microsoft Research, they can configure jobs to run with 1 job pair per core, per socket, or per node.  This is attractive, although I am not looking forward to fitting this into the dreadful Oracle GridEngine job management system that we are currently using (Daniel suggested Torque, and Thomas suggested Condor, as alternatives).
  • Christoph Benzmüller (HOL, Nonclassical): it would be nice if we could give at least a worst-case upper-bound estimate of how long it would take a job to make it through the cluster, given existing competing jobs.  This seems like a good idea, and certainly a current worst-case bound should be easy to compute — but it will change as new workloads come in or out of the system.  So it may not be too useful after all.
  • We also discussed the issue of querying, over both benchmarks and job results.  For example, you might want to filter job pairs based on questions like: did all solvers error out on a given benchmark (possibly indicating a problem with the benchmark itself), or did only one solver solve it?
  • We also had a stimulating discussion about what makes a solver community, initiated by Jens Otten (Nonclassical).  It is a largely social phenomenon, of course, and we have to set some kind of informal threshold, to prevent proliferation of 1-person communities on the system.  I think a rough rule of thumb is if there is some kind of organized meeting like a workshop that is proposing the community for StarExec, that is sufficient.
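The resource-control points above (runsolver, control groups, core affinity) can be sketched at the process level with nothing but the Python standard library.  The snippet below is a minimal, hypothetical illustration, not StarExec's actual mechanism: it caps a child's CPU time and address space with setrlimit and pins it to chosen cores, but it does not handle the NUMA memory-node binding Thomas raised, which is exactly the gap numactl or control groups would fill.  It assumes Linux.

```python
import os
import resource
import subprocess

def run_limited(cmd, cpu_seconds=60, mem_bytes=512 * 1024 * 1024, cores=None):
    """Run cmd with CPU-time and address-space limits, pinned to given cores.

    Note: this controls CPU affinity only; memory-node (NUMA) affinity
    still requires numactl or control groups, as discussed above.
    """
    def apply_limits():
        # These run in the child, just before exec.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))
        if cores is not None:
            os.sched_setaffinity(0, cores)  # Linux-only

    return subprocess.run(cmd, preexec_fn=apply_limits,
                          capture_output=True, text=True)

# Toy "solver" standing in for a real job pair:
result = run_limited(["echo", "solved"], cpu_seconds=5, cores={0})
print(result.stdout.strip())  # prints "solved"
```

A tool like runsolver does considerably more (wall-clock timeouts, process-tree accounting), but the sketch shows why an OS-level mechanism is attractive: the limits are enforced by the kernel, not by polling.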
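The querying discussion above can also be made concrete.  Below is a small, hypothetical sketch (plain Python over an in-memory list, not StarExec's database schema) of the two filters mentioned: benchmarks every solver errored on, and benchmarks only one solver solved.

```python
from collections import defaultdict

# Hypothetical job-pair records: (benchmark, solver, status).
pairs = [
    ("b1", "s1", "error"),  ("b1", "s2", "error"),
    ("b2", "s1", "solved"), ("b2", "s2", "error"),
    ("b3", "s1", "solved"), ("b3", "s2", "solved"),
]

# Group the statuses by benchmark.
by_bench = defaultdict(list)
for bench, solver, status in pairs:
    by_bench[bench].append(status)

# Benchmarks on which every solver errored (possibly broken benchmarks).
all_error = [b for b, sts in by_bench.items() if all(s == "error" for s in sts)]
# Benchmarks that exactly one solver solved.
unique_solve = [b for b, sts in by_bench.items() if sts.count("solved") == 1]

print(all_error)     # ['b1']
print(unique_solve)  # ['b2']
```

In a real system these would be queries against the job-pair table, but the grouping-then-filtering shape would be the same.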

There’s more I’m sure I missed noting here, but this captures some of our discussion.  I have quite a few emails from people after the workshop, so maybe I will update the post as I reply to those.

The semester has begun with great busyness here at U. Iowa, and it is only with the extra time to work that comes, ironically, from Labor Day that I have time to post a little about what is happening in our group.  Probably our biggest news is that the U.S. National Science Foundation has funded our large (just under $2 million) Computing Research Infrastructure (CRI) grant for StarExec.  The goal of this project is to design and implement a cluster-backed web service called StarExec, which will serve as shared infrastructure for the many different communities in Computer Science developing logic solvers.  Such programs are designed to be used as backends by applications in other domains, notably Artificial Intelligence (for example, planning problems), Hardware and Software Verification, Static Program Analysis, Formal Ontology, Combinatorial Design, and more.  The idea is that one translates problems from the application domain (a verification problem, say) into a logical formula whose validity (or dually, satisfiability) corresponds to solvability of the original problem.  Logic solvers have improved tremendously in the past 10-15 years, and can handle enormous problems.  SAT solvers (for pure propositional logic), for example, can handle formulas with hundreds of thousands of variables and millions of clauses.  The nominal search space for such problems dwarfs the number of atoms in the observable universe.  Of course, it is clever heuristics, demonstrated to work well in practice, that make searching and pruning this space possible.
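The claim about the search space is easy to check with a back-of-the-envelope computation (plain Python; n = 100,000 is just an illustrative choice for "hundreds of thousands of variables"):

```python
import math

# A SAT instance over n propositional variables has 2**n candidate
# truth assignments.  The number of decimal digits of 2**n is
# floor(n * log10(2)) + 1.
n = 100_000
digits = math.floor(n * math.log10(2)) + 1
print(digits)  # 30103

# The observable universe has roughly 10**80 atoms, i.e. 81 digits,
# so the nominal search space exceeds it by over 30,000 orders of magnitude.
```

Of course, solvers never enumerate this space; the point is only how far beyond brute force the heuristics reach.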

With so much interest in logic solvers as backends for such domains, there has been a lot of growth in logic solving as a field, and in the availability of high-performance solvers for particular kinds of logics.  Different communities have sprung up (over the past couple of decades) around one logic or another, because different logical features give rise to significantly different engineering and theoretical issues.  There have been signs of reconvergence of these different branches of automated reasoning.  StarExec aims to help fuel such reconvergence, in the first place by providing a common service for creating and running jobs, where a selection of solvers is executed on a selection of benchmark formulas.  StarExec will support uploading of both solvers and formulas.  The former feature is useful for solver implementors, who want to compare their solvers to existing ones; the latter is useful for application developers, who want to compare solvers on their particular class of formulas.  These upload features are inspired both by community demand and by SystemOnTPTP, a web service implemented by Geoff Sutcliffe at U. Miami for first-order automated theorem provers.  Indeed, the PIs for this StarExec grant are Geoff, Cesare Tinelli (my colleague here at U. Iowa), and myself.
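To make the translation idea concrete, here is a toy illustration (not a StarExec feature, and brute force rather than a real SAT solver): graph 2-coloring encoded as a propositional formula, where the graph is 2-colorable exactly when the formula is satisfiable.

```python
from itertools import product

def two_colorable(edges, vertices):
    """Brute-force satisfiability check of the 2-coloring encoding.

    Variable x_v is True iff vertex v gets color 1.  Each edge (u, v)
    contributes the clauses (x_u | x_v) & (~x_u | ~x_v), i.e. u != v.
    A real workflow would hand these clauses to a SAT solver instead.
    """
    for assignment in product([False, True], repeat=len(vertices)):
        x = dict(zip(vertices, assignment))
        if all(x[u] != x[v] for u, v in edges):
            return True   # satisfying assignment = valid 2-coloring
    return False          # formula unsatisfiable = graph not 2-colorable

# A path is 2-colorable; a triangle (odd cycle) is not.
print(two_colorable([("a", "b"), ("b", "c")], ["a", "b", "c"]))             # True
print(two_colorable([("a", "b"), ("b", "c"), ("a", "c")], ["a", "b", "c"]))  # False
```

The same shape (domain constraint in, clauses out, satisfiability back) is what verification and planning frontends do at vastly larger scale.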

The reason the system is called StarExec is that it will be a cross-community execution service.  Different logic-solving communities will be able to use it for hosting their libraries of benchmark formulas, running annual competitions or evaluations, maintaining real-time leaderboards (to define the state of the art for solvers in their community), and providing information about their benchmark formats and related standards.  An advisory board consisting of community leaders from the different solver communities, as well as prominent application developers using solvers as backends, will help us ensure we provide the services that all our constituents want.

We are currently working with JJ Urich and Hugh Brown of our Computer Support Group (serving Computer Science, Math, and Statistics), both on ordering equipment and on interfacing with our campus IT Services for physical hosting of the cluster.  As we have funds to buy at least 150 nodes, physical requirements like space, power, and cooling are significant.

We plan to be running solver competitions on StarExec — at least on a trial basis — in Summer 2012.  This is an ambitious goal, but we have great talent on the ground at the moment to develop the service.  Tyler Jensen and CJ Palmer are two of the top recent undergraduates (currently Master’s students) of our program, and have already got demoable features up and running (thanks to a planning grant for StarExec we had last year).  Unfortunately for us, but not surprisingly, they have exciting job opportunities in the works for after they graduate this December, so we will be recruiting understudies this semester, who we hope will carry the torch when they have moved on.