2020S342295: Assignment 3 (June 7)

JavaScript is disabled in your browser.
Many features of Moodle will be not usable or will appear to be broken.
Please enable JavaScript for the full Moodle experience.

Due: Sunday, 7 June 2020, 11:59 PM

README (the actual EXERCISE is below)

Parallel failed literal preprocessor for SAT formulas in DIMACS format.

This program takes a CNF in DIMACS format and simplifies it with failed
literal probing until completion.  This is an old well-known technique
from look-ahead solvers and works as follows.

Consider this formula in DIMACS format with 2 variables and 3 clauses:

  c this is a comment line
  p cnf 2 3
  1 2 0
  -1 2 0
  1 -2 0

The header 'p cnf 2 3' gives the number of variables ('2') and clauses ('3').
The rest of the file consists of zero terminated clauses.  Note literals are
encoded as integers.  As formula you could think of this CNF as follows:

  (x1 | x2) & (!x1 | x2) & (x1 | !x2)

On this formula failed literal probing works as follows:

We assume 'x2=true' then the third clauses forces 'x1=true' and the remaining
clauses are satisfied.  Thus we did not learn anything yet and reset both
variable to be unassigned.  The same happens if we assume 'x1=true'.
No assume 'x2=false'.  Then the first clause forces 'x1=true' and we
get that the second clause and thus the whole formula is 'false'.  There fore
'x2' can not be 'false' in any solution to this formula.  It has to be 'true'.
We permanently assign 'x2' to 'true' which in turn also forces 'x1' to be
permanently set to 'true'.  Now all variables are forced to a value and the
formula is satisfiable.  Thus this is only satisfying assignment.

The result of failed literal probing returns the forced assignment as clauses
and simplifies the remaining clauses (none in the example) accordingly.

  p cnf 2 2
  1 0
  2 0

The complete algorithm is not that complicated:

For all literals L in the formula, assume L, i.e., set L to 'true' and then
propagate it.  Propagation means to deduce all implied literals. More
specifically if there is a clause in the formula which has all but one of its
literals set to 'false', then the remaining one has to be set to 'true'.

If propagation leads to a conflict (all literals of a clause are assigned to
'false'), then the assumption that the formula can be satisfiable while L is
'true' is wrong.  Then  L is called a 'failed literal' and we can permanently
set L to 'false'.  If on the other hand propagation of the assumption that L
is 'true' does not lead to a conflict then we have to backtrack and unassign
the assumption and all literals implied by it.

If a failed literal is found and set to 'false', we need to propagate this
learned fact (the unit clause which consist of the negation of 'L'). This in
turn might yield additional implied literals (and even lead to a conflict in
which case the formula is proven to be unsatisfiable).  In any case, after a
failed literal has been found, the formula becomes simpler and in principle we
need to probe all literals again.

This procedure has cubic complexity since (A) we have to repeat until no more
failed literals are found, (B) try all literals and (C) propagate each.

There is one optimization implemented in 'parfail' which avoids redundant
probing of literals (assuming and propagating a literal).  If a literal
L was assigned to 'true' and propagated and that propagation did not lead
to a conflict, then this literal does not have to be assumed to be 'true'
before another failed literal is found.  This prevents one of the linear
factors in the complexity analysis above and renders the procedure quadratic.
In order to implement this optimization we use virtual time stamps of when a
literal was propagated without conflict, measured in the number of failed
literals found so far before this propagation.

Even though the linear factor (C) is in many formulas much smaller than
the worst cases (almost all variables become assigned), there are instances
where this algorithm even with the described optimization takes too much
time and therefore it is a prefect target for parallelization.


EXERCISE

Parallelize the sequential version using Pthreads and other methods discussed
in the last exercise.  The sequential version already gives some hints.

The actual algorithm uses only non-static data stored in a 'Worker' structure,
such as the current assignment and the trail of assigned literals for
backtracking.  There is a global result part, which contains a stack of
learned units (negated failed literals). I would suggest to use this stack to
communicate failed literals between different work threads and otherwise let
the worker threads own and probe different subsets of literals.  You have to
be careful not to add learned units to the global learned unit stack which
were already found by other threads.  A strategy could be to always import
learned units before trying to push.  The formula is shared and propagation
only changes the state of a worker.  After all workers do not find new failed
literals anymore, they should have all the same permanent assigned variables
(a super set of the learned units).  Thus all but one can be reset.  The
remaining one will be used to simplify the formula formula and export all the
'implied' literals for printing.  Note, that failed literal probing is a
confluent preprocessing technique Thus the resulting simplified formula should
always be the same.

With 'make test' you can run some simple examples.  The provided big
compressed example CNFs can be use for testing how your parallel solution
scales.  Here is the output I got of the optimized sequential version.  There
are many other interesting examples try from the last SAT competition.

$ ./parfail examples/bin1.cnf -
c [parfail] Parfail Failed Literal Preprocessor
c [parfail]
c [parfail] reading DIMACS file from 'examples/bin1.cnf'
c [parfail] found 'p cnf 2 3' header
c [parfail] found 0 original unit clauses
c [parfail] using 12 threads as determined by 'sysconf'
c [parfail] removed 3 clauses (0 remain 0%)
c [parfail] formula SATISFIABLE (all variables assigned)
c [parfail]
c [parfail] writing simplified formula to '<stdout>'
p cnf 2 2
1 0
2 0
c [parfail]
c [parfail] propagations 2 (0.01 millions per second)
c [parfail] decisions 1 (2.0 propagations per decisions)
c [parfail] conflicts 1 (1.0 decisions per conflict)
c [parfail] learned 1 (100% learned unit clauses per conflict)
c [parfail] implied 2 (2.0 per learned unit clause)
c [parfail]
c [parfail] process time 0.000 seconds
c [parfail] wall clock time 0.000 seconds
c [parfail] utilization 43% for 12 threads


$ ./parfail examples/slp-synthesis-aes-bottom18-sc2011.cnf.xz
c [parfail] Parfail Failed Literal Preprocessor
c [parfail]
c [parfail] reading DIMACS file from 'examples/slp-synthesis-aes-bottom18-sc2011.cnf.xz'
c [parfail] found 'p cnf 36410 121578' header
c [parfail] found 1 original unit clauses
c [parfail] using 12 threads as determined by 'sysconf'
c [parfail] removed 21789 clauses (99789 remain 82%)
c [parfail] formula not solved
c [parfail]
c [parfail] not writing simplified CNF (use '-' to write to '<stdout>')
c [parfail]
c [parfail] propagations 176059990 (11.38 millions per second)
c [parfail] decisions 1816700 (96.9 propagations per decisions)
c [parfail] conflicts 2026 (0.0 decisions per conflict)
c [parfail] learned 2026 (100% learned unit clauses per conflict)
c [parfail] implied 5096 (2.5 per learned unit clause)
c [parfail]
c [parfail] process time 15.465 seconds
c [parfail] wall clock time 15.465 seconds
c [parfail] utilization 8% for 12 threads


$ ./parfail examples/Mario-t-hard-2_c18.cnf.xz 
c [parfail] Parfail Failed Literal Preprocessor
c [parfail]
c [parfail] reading DIMACS file from 'examples/Mario-t-hard-2_c18.cnf.xz'
c [parfail] found 'p cnf 35945 1228400' header
c [parfail] found 27 original unit clauses
c [parfail] using 12 threads as determined by 'sysconf'
c [parfail] removed 832 clauses (1227568 remain 100%)
c [parfail] formula not solved
c [parfail]
c [parfail] not writing simplified CNF (use '-' to write to '<stdout>')
c [parfail]
c [parfail] propagations 14505730 (0.27 millions per second)
c [parfail] decisions 222217 (65.3 propagations per decisions)
c [parfail] conflicts 34 (0.0 decisions per conflict)
c [parfail] learned 34 (100% learned unit clauses per conflict)
c [parfail] implied 98 (2.9 per learned unit clause)
c [parfail]
c [parfail] process time 54.388 seconds
c [parfail] wall clock time 54.388 seconds
c [parfail] utilization 8% for 12 threads


Note that the thread part is only there for your convenience.  It does
not do anything yet, except that corresponding command line options
can be parsed (the number of threads can be set):

$ ./parfail -h
usage: parfail [-h] [<threads>] [<input> [<output>]]

assignment3.zip
14 May 2020, 6:22 PM