OBSOLETE, OUTOFDATE DESCRIPTION:
(Please use the link to the pdf above!)
Polynomial Exact3SAT Solving Algorithm Louis Coder December 2012 Abstract In this document I want to introduce and explain an algorithm that determines the solvability state (solvable or unsatisfiable) of any exact3SAT formula in polynomial time. It is for sure that the algorithm has polynomial runtime, even in the worst case, as the runtime is artificially limited. The question is only if the algorithm does always output the correct result. I suppose it does, due to a proof of correctness that will be shown in this document, and the evidence that an implementation of the algorithm solved millions of test formulae without any error. Furthermore this document provides a download link to a (Windows) demo solver program that you can try out. If the algorithm should really be correct, it would solve the PNP problem by proving that P = NP [1]. 1. Introduction The PNPproblem is a still unsolved problem well known by computer scientists. The problem asks if it is possible to solve those problems in polynomial time that are currently associated to the complexity class ‘NP‘. One of those problems that lies in NP is exact3SAT. Exact3SAT is a variant of the suchcalled satisfiability problem. The task is to determine if there’s a solution for a formula given as conjunctive normal form with exactly 3 different literals per clause [2]. Example: The ANDterms are called clauses, the ORterms within a clause are called literals [3]. A solution would be e.g. a table that assigns either true or false to every literal. One possible solution for the just shown CNF would be: …because when we insert this into the CNF we get: As in every clause there’s at least once true, the CNF itself becomes true. It might be that the CNF can never become true, no matter how you choose the values of the literals. My algorithm shall determine the solvability state of any given CNF, that means whether it is possible at all to get the whole CNF true. Note that in the following, ‘algorithm’ means the polynomial algorithm this document concerns about. Furthermore the expressions ‘clause’ and ‘ANDterm’ will be used synonymously within this document. 2. Algorithm The following algorithm to solve the satisfiability problem was developed by myself. I do not know if there are already similar algorithms existing. I did not (knowingly) extend or change any existing algorithm that is described in literature. I just wrote down unsatisfiable CNFs on a sheet of paper and tried to invent rules how you can determine the CNFs‘ solvability states. If an algorithm worked on paper, I implemented it as a program and did a lot of tests. After 2 years proceeding like this, I found the following algorithm, of which I think it is correct. The references to literature have mainly been sought at a time the algorithm was already finished. 2.1 Main Idea The most important idea of the algorithm is to do the ‘clause overlaying’, as I call it. You can ‘overlay’ two source clauses to the ‘overlaid clause’ if the source clauses contain exactly one literal once negated and once not negated . This literal does not appear in the overlaid clause any more, all other literals are just copied to the overlaid clause. Example: It might be that this ‘clause overlaying’ is already known as ‘resolution’ [4]. As I am not absolutely sure that these two concepts are really equivalent, I keep on speaking about ‘overlaying’ (it was my own invention to use this expression). To get the solvability state you just overlay all possible pairs of clauses of the given CNF. An overlaid clause can participate in an overlaying operation again. One important detail is the following: you try to get overlaid clauses with as few literals as possible. So, for instance, add an overlaid clause with 3 literal to the CNF only if there was none creatable with 2 or 1 literal(s). Furthermore, do not add an overlaid clause if it is a sub set of any existing ANDterm. For instance, do not add if there’s existing in the CNF. A clause S is a sub set of a clause L if S contains at least all literals of L (negation states in S, i.e. if literals are negated or not, must match the ones in L, of course). You do the overlaying until one of the following three cases happens: The empty clause with no literals appears => the CNF is unsatisfiable, there is no solution existing You cannot create new overlaid clauses any more => the CNF is solvable, it exists some solution You exceeded a special limit of clauses (will be explained later) => the CNF is solvable 2.2 Example Usage In the following example, I demonstrate how we find out that the given CNF is unsatisfiable. To keep the example short and easy to understand, I use an exact2SAT formula, with 2 instead of 3 literals in the initial CNF’s ANDterms. The proceeding with 3 literals would be equivalent, except that it would take more ANDterms to create. Step 0: the given CNF (‘initial CNF‘) Step 1: overlay first and second clause Step 2: overlay third and fourth clause Step 3: overlay fifth and sixth clause 3. Pseudo Code 3.1 The Algorithm in Pseudo Code The following is a Basiclike pseudo code listing of the demo solver you can download. It implements the polynomial exact3SAT solving algorithm.
3.2 Explanation of the Pseudo Code The following numbers refer to the numbers in the pseudo code. 1. we jump to this code location every time we added an overlaid clause to the CNF. 2. OverlaidClauseLiteralCountMax is the maximal literal count an overlaid clause can have to be added to the working CNF. IMPORTANT: we try to add overlaid clauses that have as few literals as possible to reduce the number of clauses in the CNF (the less literals a clause has, the less possible clauses you can create out of these literals). 3. The index of the first clause that will take part in an overlaying operation. 4. The index of the second clause that will take part in an overlaying operation. As it does not matter in which order we overlay the clauses, we can start ‘behind’ the index of the first clause. Note that, in the original C++ solver code, we additionally use an array to not overlay pairs that have been overlaid once before (the array is named ‘Start[][]’). 5. We call a procedure that does the overlaying. 6. If the overlaid clause has too many literals (see point 2)) or the overlaying operation could not be done as there is not exactly one conflict existing between the two clauses to overlay (the overlaying procedure then could set the literal count to 1, for example), we continue with the next clause pair. 7. If we get the empty clause, the CNF is unsatisfiable. 8. If there’s any clause existing in the CNF that contains the current overlaid clause, that means if the overlaid clause is a sub set of any existing clause, throw away the overlaid clause (i.e. ignore it). 9. The overlaid clause is added to the CNF and may take part in another overlaying operation. 10. If the number of clauses in the CNF exceeds a special limit (ExpectedClauseCountMax was precalculated before) we can suppose the CNF is solvable. 11. Every time we added an overlaid clause to the CNF, we jump to the beginning, which mainly results in a reset of OverlaidClauseLiteralCountMax, so that we add clauses with as few literals as possible. 12. If we reach this code line, no new overlaid clause was created any more. 4. Complexity In the following paragraphs, clause S shall be called a ‚shorter clause‘ than a clause L, if S has less literals than L. A ‚longer‘ clause is one with more literals. An xliteral clause shall be one with x literals. These expressions were introduced by myself. Also we must define what ‚complexity‘ shall be. I defined the complexity as the total number of clauses (respectively ANDterms, synonymous) added to the working CNF until the solver successfully determined the solvability state. 4.1 WorstCase Complexity of the Algorithm I say, the total number of clauses ever added to the CNF will, if the CNF is unsolvable, never exceed the count of all possible clauses with 3 or less literals. ‚Possible clauses‘ means all clauses that are creatable at all (containing the mentioned number of literals). What is the count of all possible clauses with 3 or less literals? One could say, first the total number of clauses with 3 literals and highest possible literal index = IndexMax would be:
…as we have IndexMax possibilities to choose the first literal index, one less for the second one and two less for the third one. Furthermore we have 8 possible true/false combinations:
But, this calculation is not true as this would also contain viceversa (swapped) combinations, like and (as the order of literals is not important, these two ANDterms are seen as identical). So here’s the correct calculation of the count of clauses with 3 literals and highest literal index = IndexMax: For the total complexity you must add the count of clauses with 2 or 1 literal(s). Finally the exact worstcase complexity of the polynomial solver is: This means, our solver has a complexity of . The origin of these formulas are my own considerations. They are rather easy to understand and I could not falsify them in tests, so it is very probable that the formulas are correct. 4.2 Reason for WorstCase Complexity The provided solver is programmed to output ‘solvable’ as soon as the CNF clause count exceeds the sum of all possible clauses with 3 or less literals (see 4.1). This seems to be logical, as in the initial CNF there are only clauses with exactly 3 literals, and most of them are overlaid to clauses with 2 or less literals. But, there’s a problem: it does happen that there are clauses with more than 3 literals created through overlaying, sometimes even ones with 4, 5 or more literals. In my tests this happened for both satisfiable and unsatisfiable CNFs. The problem with more than 3 literals is the following: The calculated limit is the count of all clauses with 3 or less literals If we now add a clause with 4 literals, the maximal calculated clause count (see 4.1) could be exceeded, as the limit does not include the count of additional ANDterms with more than 3 literals The question is if we could count a CNF falsely as solvable as the expected clause count is reached, but we would get the empty clause if we kept on overlaying? I say, even if clauses with more than 3 literals appear in an unsatisfiable CNF, the calculated limit is not exceeded. Reason: when a clause with more than 3 literals is added, all included clauses must be missing. So if we add for example: then there must be missing:
The code line 8) in '3.1 […] Pseudo Code’ would avoid that (in the following called ‘surplus clause’) is added if there was at least one of the shorter clauses existing in the CNF. But, what if all those shorter clauses are added afterwards, i.e. first the clause with 4 literals and then all shorter ones? Then we would have one ANDterm more in the CNF than calculated, namely additional the 4literal clause. Even if this would happen, it is for sure that there are further ANDterms missing that were recognized by the limit calculated in 4.1. Because, imagine the surplus clause would have been created through the following overlaying operation: That means that the two source clauses would have been existing in the CNF and they created the surplus clause. But there must those clauses be missing that would have avoided the creation of the surplus one! So if e.g. would have existed in the initial CNF, the algorithm would have overlaid before (!) the surplus 4literal clause would have been created: So we would have gotten one of those clauses that are a superset of the surplus one. That means, as soon as one 4literal clause is added, there must at least one 3literal clause be missing. Actually there are much more 3literal clauses missing, namely all those that would create the 2literal clauses listed farer above. So we will never exceed the calculated limit. Note that I successfully determined the solvability state of millions of CNFs with the sample implementation (see 5.), this lets me suppose the just shown proof is correct. It is very probable that, if the algorithm would not be correct, there would have appeared at least one error. 5. Sample Implementation I recommend you to download and try out the sample implementation  a Windows console program that allows you to solve random CNFs with the polynomial solver and compare the results with those of an exponential, failsafe one. In the downloadable zipfile there’s also the source code, which might be compiled under Linux, too (with some little code changes). In the zipfile there’s also this document (v3.0) and an older version (v2.1) that explains how to use the sample implementation (see ‘Documentation’ directory). http://www.louiscoder.com/Polynomial_3SAT_Solver/Polynomial_3SAT_Solver.zip 6. Proofs 6.1. Overlaying leads to Empty Clause In the following paragraphs I want to prove that, if we can not (!) get the empty clause by overlaying, the CNF is solvable. The CNF is unsatisfiable if there are at least two clauses that cannot both be satisfied by a solution. For instance, and cannot both be satisfied at the same time.
Remember: overlaying is done with pairs of clauses. If these two clauses contain exactly one conflict literal (negated in the one clause and not negated in the other one) we can remove the conflict literal and write the resting literals into the overlaid clause, which has always fewer literals than those of the two source clauses (as the conflict literals have been removed). If we can not overlay two ANDterms, it is for sure that there is a solution existing that satisfies both of them. I want to prove this by just listing such cases where we cannot overlay and corresponding possible solutions. The first two lines are the two source clauses that are to be overlaid. Right after, there are possible solutions shown that satisfy both source clauses. 1.Solution: or
2. 3. Solution: or
4.
We see, if we cannot overlay at all there are only clauses left in the CNF that can all (!) be satisfied and thus the whole CNF has some solution! Note that the listed solutions do also satisfy ANDterms with more than 2 literals. The solutions for those clauses can be found similarly. If there are clauses left with 1 literal and we cannot overlay, then these ones can be satisfied, too. The only 1literal clauses that cannot be overlaid are: 5.Solution:
6.
6.2. Sub Set Check Is it valid to not add an overlaid clause if it is a sub set of any clause already existing in the CNF? Remember that we do not add e.g. if there's already existing. I say this is valid because: to make the whole CNF solvable, all ANDterms must be satisfied by a solution. That means and also must be satisfied. Imagine that is already existing in the CNF. I say, if this clause can be satisfied, there's no need to add as means the existing clause can be satisfied. If we insert this into the longer one, we get: …and this is always true. So if can not (!) be satisfied, the whole CNF is unsatisfiable as all (!) clauses must be satisfied. But if can be satisfied, is also true in any case and we don’t need to add it to the CNF. 7. Acknowledgements I thank the reader of this document for his/her interest and I would be even more happy if he/she would comment the algorithm, if possible. 8. References [1]:1) http://de.wikipedia.org/wiki/PNPProblem (all Wikipedia versions from 20080620) 2) lecture “Berechenbarkeit und formale Sprachen” by professor Wanka, winter term 2009/2010, University ErlangenNuremberg
[2]:
[3]: [4]: http://de.wikipedia.org/wiki/Resolution_%28Logik%29 (all Wikipedia versions from 20080620)
