Generalized probabilistic satisfiability through integer programming
 Glauber De Bona^{1},
 Fabio G. Cozman^{2} and
 Marcelo Finger^{1}
https://doi.org/10.1186/s13173-015-0028-x
© De Bona et al. 2015
Received: 2 May 2014
Accepted: 10 June 2015
Published: 15 July 2015
Abstract
Background
This paper studies the generalized probabilistic satisfiability (GPSAT) problem, where the probabilistic satisfiability (PSAT) problem is extended by allowing Boolean combinations of probabilistic assertions and nested probabilistic formulas.
Methods
We introduce a normal form for this problem and show that neither nesting of probabilities nor multiagent probabilities increases the expressivity of GPSAT. An algorithm to solve GPSAT instances in the normal form via mixed integer linear programming is proposed.
Results
An implementation of the algorithm is used to explore the complexity profile of GPSAT, and it shows evidence of phase-transition phenomena.
Conclusions
Even though GPSAT is considerably more expressive than PSAT, it can be handled using integer linear programming techniques.
Background
Propositional logic and probability theory stand as major knowledge representation tools in many fields, and notably in artificial intelligence. Useful combinations of propositional logic and probability theory were already pursued by Boole [3, Chapter XVIII], who was concerned with problems where propositional formulas are associated with probability assertions. Loosely speaking, we have propositional sentences \(\{\phi _{i}\}_{i=1}^{q}\), each containing a subset of atomic propositions \(\{A_{j}\}_{j=1}^{n}\). We may associate one or more of these sentences with probabilities, writing for instance P(ϕ _{ i })=α _{ i }. To establish semantics for these assessments, we consider a probability measure over the set of truth assignments. The probabilistic satisfiability (PSAT) problem is to determine whether it is possible to find a probability measure over truth assignments such that all assessments are satisfied [13]. PSAT problems have received attention in a variety of fields [4, 9, 14, 15, 17]; in artificial intelligence research, PSAT problems appear as a foundation for probabilistic rules [21] and first-order probabilistic logic [16, 20, 22].
In this paper, we consider an extended version of the PSAT problem. To illustrate the original PSAT problem, consider the following situation:
Problem 1.
Three friends, Alice, Bob, and Charlie, go to the pub every day, and they have two options: “Bar Phi” and “Bar Not Phi”. Each friend goes to exactly one bar per night. Talking about tonight, a fourth friend, David, says that Alice goes to “Bar Phi” with probability 6/7, that Bob goes to “Bar Phi” with probability 5/7, and that the probability of Charlie drinking at “Bar Phi” is 4/7. Furthermore, David states that exactly two of them will be at “Bar Phi” tonight.
Question: Is David being consistent?
Problem 2.
Three friends, Alice, Bob, and Charlie, go to the pub every day, and they have two options: “Bar Phi” and “Bar Not Phi”. Each friend goes to exactly one bar per night. Talking about tonight, a fourth friend, David, says that Alice goes to her favorite bar with probability 6/7, that Bob goes to the bar he likes most with probability 5/7, and that the probability of Charlie drinking at his favorite place is 4/7. Furthermore, David states that exactly two of them will be at “Bar Phi” tonight. Question: Is David being consistent?
The key difference between Problems 1 and 2 is that in the former we have probabilities for each friend to go to “Bar Phi” and we are asked about the consistency of these assignments together with the constraint that exactly two of them are going there tonight; in the latter, the probabilities are assigned to either “Bar Phi” or “Bar Not Phi”, and we want to check whether David is stating consistent probabilities with which Alice, Bob, and Charlie go to their favorite bars. This difference will become clearer in the next sections, where these problems are formalized.
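For these two toy problems, the feasibility question can be checked directly: given the certain constraint that exactly two friends are at “Bar Phi”, only the three worlds {Alice,Bob}, {Alice,Charlie}, and {Bob,Charlie} can receive positive probability, and solving the resulting 3×3 linear system by hand shows that the assessments are satisfiable exactly when the three probabilities sum to 2. A minimal Python sketch of this check (the function and variable names are ours, for illustration only):

```python
from fractions import Fraction as F
from itertools import product

# David's assessments for (Alice, Bob, Charlie).
probs = [F(6, 7), F(5, 7), F(4, 7)]

def consistent_bar_phi(ps):
    """ps[i] is the probability that friend i is at "Bar Phi", with
    exactly two friends there for certain.  Only the worlds {AB},
    {AC}, {BC} can have positive probability; solving the linear
    system gives p_BC = 1 - ps[0], p_AC = 1 - ps[1], p_AB = 1 - ps[2],
    which is a probability distribution iff sum(ps) == 2."""
    return all(0 <= p <= 1 for p in ps) and sum(ps) == 2

# Problem 1: the probabilities refer to "Bar Phi" directly.
print(consistent_bar_phi(probs))        # False: 6/7 + 5/7 + 4/7 = 15/7

# Problem 2: each probability refers to the friend's unknown favorite
# bar, so we try all 2^3 assignments of favorite bars.
scenarios = [[p if fav else 1 - p for p, fav in zip(probs, favs)]
             for favs in product([True, False], repeat=3)]
print(any(consistent_bar_phi(s) for s in scenarios))
# True: only Charlie prefers "Bar Not Phi", since 6/7 + 5/7 + 3/7 = 2.
```

This brute-force enumeration over favorite-bar assignments is exactly the exponential strategy criticized later in the text; it is viable only for this tiny example.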
We can move to an even more expressive language by allowing probabilistic formulas to be nested; that is, by allowing a subformula of ϕ in P(ϕ)=α to be P(φ)=α ^{′}. Nested probabilities may receive different interpretations, and ours can be exemplified with a problem:
Problem 3.
Alice says that, with probability 1/2, she is going to “Bar Phi” tonight with probability 5/6; and, with probability 1/2, she is going to “Bar Phi” tonight with probability 1/6. Question: Can these probabilistic statements be consistent?
A possible scenario consistent with what Alice says could be the following. Alice initially wanted to go to the bar, but as she likes some randomness in her life, she decided to throw a fair die tonight: she will go drink unless she gets a 6. In the middle of the day, she is not in the mood, so she changes her mind: she will go to the bar only if the die shows a 6. But she suddenly realizes she was too conservative, and gives a fair coin the responsibility of choosing her destiny: in case of heads, she is going out only with a 6 on the die; if the coin lands tails, she is going to the bar unless a 6 happens to be the case. Just for fun, she tosses the coin in the afternoon, but will only see the outcome in the evening—right before throwing the die. In the meantime, she is thinking about the probability of her going out tonight. With probability 1/2, she is in a world where the coin landed heads, and the probability of her getting a 6 and going to drink is 1/6. But with probability 1/2, the coin is actually showing tails, and she is going to the bar with probability 5/6—only a 6 can spoil her night.
In such a scenario, there is a probability measure over possible worlds (tails and heads), each of which has its own probability measure over other possible worlds (going to the bar or not). This is the case in the logic devised in ref. [10], which finds applications in stochastic systems, for instance. In this paper, however, we follow a different path, keeping only a single probability measure over all possible worlds. To give meaning to nested probabilities, we use the lifting assumption, that “one believes the probability of ϕ is α if, and only if, one believes that the probability of the probability of ϕ being α is 1” [23]. In Problem 3, one can argue that, if the probability of Alice going out is 5/6 or 1/6, with probability 1/2 each, then the probability of her going out is actually 1/2×5/6+1/2×1/6=1/2—with probability one. As Alice states that the probability of a probability being 5/6 is not one or zero, but 1/2, she is being inconsistent under the lifting assumption. Clearly, this assumption is restrictive, in the sense that it claims an agent may not have uncertainty about its own probabilistic beliefs. However, perhaps surprisingly, under this premise, we can prove that satisfiability of formulas with nested probabilities can be reduced to satisfiability without nesting.
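The total-probability computation above can be checked mechanically with exact rational arithmetic:

```python
from fractions import Fraction

# Under the lifting assumption, Alice's two candidate probabilities
# collapse into a single one by the law of total probability:
p = Fraction(1, 2) * Fraction(5, 6) + Fraction(1, 2) * Fraction(1, 6)
print(p)  # 1/2
```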
The problem of deciding satisfiability of these formulas is what we define as generalized probabilistic satisfiability (GPSAT). The resulting language can be viewed as the closure of probabilistic formulas with respect to Boolean and probabilistic operators.
As far as we know, there are no algorithms currently available to determine satisfiability of GPSAT problems; in this paper, we present the first such algorithm. The most direct way to solve a PSAT problem is through its linear programming formulation [14], using column generation methods to handle the exponential number of columns [19]. A recent alternative approach reduces PSAT to logical satisfiability [11]. Neither of these approaches is easily extended to deal with disjunctions of constraints, which are essential to solve GPSAT.
In this paper, we present an approach to generalized probabilistic satisfiability in which the original problem is written as a mixed integer linear program of size polynomial in the size of the original problem. This technique was first proposed to solve probabilistic satisfiability in ref. [6], and this paper builds on that work to deal with GPSAT problems.
The remainder of this section summarizes the necessary fundamentals of SAT and PSAT. Generalized probabilistic satisfiability is defined in the section “The problem”, where a normal form is introduced. In the same section, we show how a multiagent scenario is reduced to the single-agent case. An algorithm for GPSAT is described in the section “Methods”, building on a reduction from PSAT to mixed integer linear programming. Implementation and experiments, with a discussion of phase transitions, are presented in the section “Results and discussion”.
Consider the set \(\mathcal {L}_{\textit {PL}}\) of well-formed formulas built with a set A={A _{1},A _{2},A _{3},… } of atomic propositions using the usual Boolean connectives ¬, ∧, ∨ and →. A truth assignment (or valuation) ω is a function ω:A→{0,1} that takes atomic propositions to truth values, which we denote by 0 and 1. Through the classical semantics of propositional logic,^{1} ω can have its domain extended to \(\mathcal {L}_{\textit {PL}}\), assigning a truth value to any propositional formula. If ϕ is true under ω, we write ω⊧ϕ; if ϕ is false, we write ω⊮ϕ. Given a finite set of formulas \(\{\phi _{1},\dots,\phi _{q}\}\subset \mathcal {L}_{\textit {PL}}\), the satisfiability (SAT) problem is to determine whether or not there exists a truth assignment to all variables such that all sentences evaluate to true [5, 12]. We call such a set a SAT instance.
If every sentence ϕ _{ i } is a conjunction of clauses, then we have a SAT instance in conjunctive normal form (CNF). A SAT instance in CNF is a k-SAT instance when each clause has k literals. The 2-SAT problem has a polynomial-time solution, while k-SAT is NP-complete for k>2.
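As an illustration of the underlying decision problem, here is a brute-force SAT checker over clauses in a DIMACS-style encoding (literal k means A _{ k } true, −k means A _{ k } false); this is a sketch of ours for small instances only, as real SAT solvers are far more sophisticated:

```python
from itertools import product

def sat(clauses, n):
    """Brute-force SAT: clauses are lists of non-zero ints (DIMACS
    style), n is the number of atomic propositions.  Enumerates all
    2^n truth assignments, so it only scales to tiny instances."""
    for w in product([False, True], repeat=n):
        if all(any(w[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False

# A satisfiable 3-SAT instance, and the smallest unsatisfiable one.
print(sat([[1, 2, 3], [-1, -2, 3], [1, -3, 2]], 3))  # True
print(sat([[1], [-1]], 1))                           # False
```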
The probabilistic satisfiability (PSAT) problem is to determine whether a given set of probabilistic assessments (a PSAT instance) can be satisfied, in the sense that there is a probability measure over truth assignments such that all assessments are satisfied. PSAT is known to be in NP, as it has a small-model property, and since SAT is the subproblem in which all assigned probabilities are 1, PSAT is NP-complete [13]. A few polynomial special cases of PSAT are known [1].
Example 4 (Friends’ Favorite Bar).
A PSAT instance \(\{P(\phi _{i})=\alpha _{i}\}_{i=1}^{q}\) induces the linear system \(\sum _{j:\omega _{j}\models \phi _{i}}\pi _{j}=\alpha _{i}\) for 1≤i≤q, together with \(\sum _{j=1}^{2^{n}}\pi _{j}=1\) and π _{ j }≥0, where π _{ j } denotes the probability of truth assignment ω _{ j }, while truth assignments ω _{ j } are ordered from 1 to 2^{ n } (say by the n-bit binary number obtained by writing 0 for false and 1 for true as assigned to A _{1},…,A _{ n }). Probabilistic satisfiability then holds exactly when this set of linear constraints has a solution. The challenge is that we have 2^{ n } truth assignments, so the size of the linear constraints is exponential in the input.
The most efficient algorithms for PSAT apply linear programming techniques to this set of constraints. As a linear program, if there is a solution, there is one with no more than q+1 truth assignments with positive probability [13]. The two-phase simplex method can be used, with the addition of q+1 artificial variables, to find a feasible solution in the first phase. Starting with a basis of these q+1 artificial variables, at each iteration a new column (variable) enters the basis, keeping the solution feasible, until the basis has no artificial variables. If this point is reached, a solution is found; otherwise, the linear program has no feasible solution and the PSAT instance is unsatisfiable. As the number of columns is exponential, column generation techniques are used; a good survey of this approach is [17]. Combining inference rules with linear programming techniques leads to the currently most efficient algorithms, as shown in ref. [18].
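To make the linear programming formulation concrete, the following sketch (our own encoding, with formulas as Python predicates over tuples of truth values) builds the exponentially wide 0/1 constraint matrix for Problem 1; an LP solver would then check whether A x equals the assessed probabilities (6/7, 5/7, 4/7, 1) for some probability vector x:

```python
from itertools import product

def psat_matrix(formulas, n):
    """Column j is the 0/1 evaluation of each formula under the j-th
    truth assignment over n atoms; PSAT asks for x >= 0 with
    sum(x) = 1 and A x equal to the assessed probabilities."""
    worlds = list(product([0, 1], repeat=n))
    return [[1 if f(w) else 0 for w in worlds] for f in formulas]

# Problem 1: atoms A1, A2, A3 say who is at "Bar Phi"; the last row
# is the certain event "exactly two of them are there".
formulas = [lambda w: w[0] == 1,
            lambda w: w[1] == 1,
            lambda w: w[2] == 1,
            lambda w: sum(w) == 2]
A = psat_matrix(formulas, 3)
print(len(A), len(A[0]))  # 4 8 -- q + 1 = 4 rows, 2^3 = 8 columns
```

The 2^{ n } column count is exactly why column generation is needed in practice: the matrix is never built explicitly.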
The problem
To make sense of these sentences, we must establish an appropriate semantics. Note that in any PSAT instance, the truth value of an assessment such as P(ϕ _{ i })=α _{ i } is derived from a probability measure on the truth assignments \(\omega _{1},\dots,\omega _{2^{n}}\) (because the relation ω _{ j }⊧ϕ _{ i } is well defined). This is not the case when nested probabilities are allowed and ϕ _{ i } may contain probabilistic subformulas. The next section shows how to define a semantics that characterizes when a probabilistic assessment is true in a possible world ω _{ j }.
Syntax and semantics
We start with an infinite set of atomic (or primitive) propositions A={A _{1},A _{2},A _{3},… }. The language \(\mathcal {L}\) is the smallest set of formulas such that:

If ϕ∈A, then \(\phi \in \mathcal {L}\);

If \(\phi \in \mathcal {L}\), then \(\lnot \phi \in \mathcal {L}\);

If \(\phi \in \mathcal {L}\) and \(\theta \in \mathcal {L}\), then \((\phi \lor \theta) \in \mathcal {L}\);

If \(\phi \in \mathcal {L}\) and \(\theta \in \mathcal {L}\), then \((\phi \land \theta) \in \mathcal {L}\);

If \(\phi \in \mathcal {L}\), then \((P(\phi)\bowtie \alpha) \in \mathcal {L}\), for ⋈∈{≤,≥} and \(\alpha \in [0,1]\cap \mathbb {Q}\).
Parentheses are omitted whenever possible. As usual, P(ϕ)=α, P(ϕ)<α and P(ϕ)>α are abbreviations for (P(ϕ)≤α)∧(P(ϕ)≥α), ¬(P(ϕ)≥α) and ¬(P(ϕ)≤α), respectively.
Given a structure \(\mathcal {M}=(\Omega,P(.))\), where Ω is a set of truth assignments and P(.) is a probability measure over Ω, and a world ω∈Ω, satisfaction is defined inductively:
\((\mathcal {M},\omega) \models A_{i}\) if ω⊧A _{ i };

\((\mathcal {M},\omega)\models \lnot \phi \) if \((\mathcal {M},\omega)\not \models \phi \);

\((\mathcal {M},\omega)\models \phi _{1} \land \phi _{2}\) if \((\mathcal {M},\omega)\models \phi _{1}\) and \((\mathcal {M},\omega)\models \phi _{2}\);

\((\mathcal {M},\omega)\models \phi _{1} \lor \phi _{2}\) if \((\mathcal {M},\omega)\models \phi _{1}\) or \((\mathcal {M},\omega)\models \phi _{2}\);

\((\mathcal {M},\omega)\models P(\phi)\bowtie \alpha \) if \(\sum \{P(\omega _{i}) \mid (\mathcal {M},\omega _{i})\models \phi \}\bowtie \alpha \), for ⋈∈{≤,≥} and \(\alpha \in [0,1]\cap \mathbb {Q}\);
Given a formula \(\phi \in \mathcal {L}\), we say that it is satisfiable if there is a pair \((\mathcal {M},\omega)\), where \(\mathcal {M}=(\Omega,P(.))\) and ω∈Ω, such that \((\mathcal {M},\omega)\models \phi \); otherwise, we say ϕ is unsatisfiable.
Generalized probabilistic satisfiability (GPSAT) is then the problem of deciding whether or not a given \(\phi \in \mathcal {L}\) is satisfiable. In this way, any formula \(\phi \in \mathcal {L}\) is a GPSAT instance.
Example 5 (Friends Telling the Truth).
Now the question is whether there exists a probability measure that simultaneously satisfies the first four probability assignments together with at least one probability assignment from each clause above. More precisely, our GPSAT instance is a formula θ, whose satisfiability we want to check, built as the conjunction of the formulas in Expressions (3) to (9).
satisfying the clauses in Expressions (7), (8), and (9), respectively. Hence, \(\mathcal {M}\models \theta \) and θ is satisfiable.
We conclude that David is being consistent, a possible scenario being that only Charlie prefers “Bar Not Phi”. Indeed, every structure satisfying θ leads to this scenario: any other assignment of favorite bars for Alice, Bob, and Charlie makes the probabilities inconsistent. Therefore, given that David is consistent, it can be inferred that this scenario is the case. □
Note that this problem could be solved as a sequence of PSAT instances: we just guess the favorite bar of each friend and decide the corresponding PSAT instance. But covering all possible assignments of favorite bars requires a number of PSAT instances that is exponential in the number of friends, so this is a less efficient solution. As we shall see, GPSAT is in NP, so there must exist a polynomial reduction from it to PSAT, but there seems to be no simple way of performing such a reduction.
The logic presented by Fagin, Halpern, and Megiddo in ref. [9] had already dealt with Boolean combinations of probabilistic assessments, but they kept the probabilities applied only to purely propositional formulas, hence avoiding nesting. Their semantics assigns truth values to probability assessments only through a whole structure \(\mathcal {M}\), and not through a pair \((\mathcal {M},\omega)\). So their semantics is closer to PSAT’s, even though they use probability “inside” the language. Another difference is that Fagin et al. allow more general assessments of the form a _{1} P(ϕ _{1})+a _{2} P(ϕ _{2})+⋯+a _{ m } P(ϕ _{ m })≥α (this enabled them to axiomatize their logic). They show that the corresponding satisfiability problem is NP-complete, but do not propose algorithms to solve it. One of our goals here is to provide concrete algorithms.
In another work, Fagin and Halpern [10] investigated a more general logic to reason about knowledge and probability. Its probabilistic semantics is similar to ours because probabilistic formulas have truth values at specific states (worlds), but they introduce a probability distribution for each set of indistinguishable possible worlds (states). Additionally, their logic has an epistemic modal relation, so multiagent and linear combinations of probabilities are also allowed. Again, they axiomatize their logic and establish the complexity of the related decision problem, but do not provide algorithms. Variations of logics adding probabilities to the propositional language can be found in ref. [7].
The logic we propose here can be seen as a particular case of the logic in ref. [10], where there is only one agent and all possible worlds are indistinguishable (all with identical probability distributions). It follows that GPSAT inherits the NP upper bound proved by Fagin and Halpern (Theorem 4.6 in ref. [10]); and as PSAT is a subproblem of GPSAT, the latter is also NP-complete.
GPSAT normal form
The algorithm for GPSAT to be proposed decides satisfiability only for formulas without nested probabilities and with every propositional formula inside the scope of a probability assessment (as in PSAT). That is, the algorithm cannot handle a formula such as P(P(ϕ)≥0.2)≤0.1 or ϕ _{1}∧P(ϕ _{2})≤0.5. As we now show, any GPSAT instance can be reduced in polynomial time to a normal form that complies with these constraints and preserves satisfiability.
A GPSAT instance is in normal form if it is a conjunction Ψ∧Γ in which:
Ψ is a conjunction of 3-clauses over probabilistic assessments of the form P(A _{ i })≥α, in which each assessment is over a different atomic proposition A _{ i }, and

Γ is a probabilistic assignment P(γ)≥1, in which γ is a conjunction of propositional 3-clauses.
Note that probability values smaller than one can be assigned only to atomic propositions; additionally, each atomic proposition occurs in at most one probabilistic assessment in Ψ.
This normal form is based on the PSAT normal form introduced by Finger and De Bona [11], and, although it may seem quite restrictive, we can show that all formulas in \(\mathcal {L}\) can be brought to the normal form. Before we apply techniques from Finger and De Bona, a sequence of intermediate results is needed.
Call a formula of the form P(ψ)⋈α, where ψ is a purely propositional formula, a probabilistic atom. Let \(\mathcal {L}'\subset \mathcal {L}\) be the smallest set such that:
If ϕ is a probabilistic atom, then \(\phi \in \mathcal {L}'\);

If \(\phi \in \mathcal {L}'\), then \(\lnot \phi \in \mathcal {L}'\);

If \(\phi \in \mathcal {L}'\) and \(\theta \in \mathcal {L}'\), then \((\phi \lor \theta) \in \mathcal {L}'\);

If \(\phi \in \mathcal {L}'\) and \(\theta \in \mathcal {L}'\), then \((\phi \land \theta) \in \mathcal {L}'\).
Now we can state the result we need:
Lemma 6.
For every \(\phi \in \mathcal {L}\), there exists \(\theta \in \mathcal {L}'\), such that ϕ is satisfiable if, and only if, θ is; furthermore, θ is computed in polynomial time.
Proof Sketch.
Let I be the set of indexes of all atomic propositions A _{ i } occurring in ϕ outside the scope of a probability assignment. To build ϕ _{0}, for all i∈I, substitute P(B _{ i })≥1, where B _{ i } is a fresh atomic proposition, for all occurrences of A _{ i } outside a probability assessment; this is done in linear time in the size of ϕ.
Given ϕ _{0}, construct \(\phi _{0}'\) by substituting a new atomic proposition C _{ i } for each expression P(ψ)⋈α (⋈ is ≤ or ≥) inside the scope of a probability assessment. Define \(\phi _{1}=\phi _{0}'\land ((P(C_{i})\geq 1)\lor (P(C_{i})\leq 0))\land (\lnot (P(C_{i})\geq 1) \lor (P(\psi)\bowtie \alpha))\land ((P(C_{i})\geq 1) \lor \lnot (P(\psi)\bowtie \alpha))\). Note that ϕ _{1} can be computed in polynomial time. Construct ϕ _{ m+1} from ϕ _{ m } until there is no more nesting of probabilities, obtaining \(\theta \in \mathcal {L}'\). As the number of nested probabilities has a linear upper bound on the size of ϕ, the whole process of building θ takes polynomial time. To verify that θ is satisfiable iff ϕ is, see the complete proof (Theorem 1 in ref. [7]).
So, by using Lemma 6, we obtain a formula that is a Boolean combination of probability assignments over purely propositional formulas. If each probabilistic atom of a formula ϕ is replaced by a new atomic proposition B _{ i }, then we have a formula θ from classical propositional logic. Using standard techniques, by adding new atoms, we can build a 3-SAT instance θ ^{′} that is (Boolean) satisfiable iff θ is. Replace the atomic propositions B _{ i } by the corresponding probabilistic atoms, and the new atomic propositions by probabilistic atoms of the form P(C _{ i })≥1, where C _{ i } is a fresh atomic proposition. Now we have a GPSAT instance ϕ ^{′} that is satisfiable iff ϕ is. For our normal form transformation procedure, we start from a formula like ϕ ^{′}, which is a conjunction of 3-clauses, each formed from probabilistic atoms or their negations using only ≥. Observe that a probability assessment of the form P(ψ _{ i })≤α _{ i } is equivalent to the assignment P(¬ψ _{ i })≥1−α _{ i }.
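The nesting-elimination loop of Lemma 6 can be sketched as follows, in an illustrative tuple encoding of our own (the paper manipulates formulas abstractly): atoms are strings, and P(ϕ)≥α is the tuple `('P', phi, '>=', alpha)`. A nested assessment found inside the scope of a P(.) is replaced by a fresh atom C, and the clauses forcing P(C)≥1, or P(C)≤0, to track the truth of the nested assessment are conjoined at the top level:

```python
import itertools

fresh = (f"C{i}" for i in itertools.count(1))  # fresh atom names

def flatten(formula):
    """One application of the Lemma 6 construction over the whole
    formula: returns an equisatisfiable formula with no probability
    operator nested inside another."""
    defs = []
    def go(f, inside):
        if isinstance(f, str):                       # atomic proposition
            return f
        if f[0] == 'not':
            return ('not', go(f[1], inside))
        if f[0] in ('and', 'or'):
            return (f[0], go(f[1], inside), go(f[2], inside))
        # probabilistic atom ('P', phi, op, alpha)
        inner = ('P', go(f[1], True), f[2], f[3])
        if not inside:
            return inner
        c = next(fresh)                              # replace by fresh atom
        defs.append(('and',
            ('or', ('P', c, '>=', 1), ('P', c, '<=', 0)),
            ('and',
             ('or', ('not', ('P', c, '>=', 1)), inner),
             ('or', ('P', c, '>=', 1), ('not', inner)))))
        return c
    out = go(formula, False)
    for d in defs:                                   # conjoin definitions
        out = ('and', out, d)
    return out

# P(P(A1) >= 0.2) <= 0.1 becomes P(C1) <= 0.1 plus defining clauses.
theta = flatten(('P', ('P', 'A1', '>=', 0.2), '<=', 0.1))
print(theta[1])  # ('P', 'C1', '<=', 0.1)
```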
Theorem 7.
For all formulas \(\phi \in \mathcal {L}\), there is a formula \(\theta \in \mathcal {L}\) in normal form that is satisfiable iff ϕ is; θ can be computed in polynomial time.
Proof.
Given a formula ϕ with 3-clauses of probabilistic atoms, we construct Ψ from ϕ, and Γ from scratch. For each P(ψ _{ i })≥α _{ i } in ϕ, substitute a new atomic proposition B _{ i } for ψ _{ i } to construct Ψ. Let Γ be the assignment P(γ)≥1, where γ is the conjunction of the clauses corresponding to ψ _{ i }⇔B _{ i }, that is, (¬ψ _{ i }∨B _{ i })∧(ψ _{ i }∨¬B _{ i }), for all B _{ i } introduced. Using propositional techniques again, one can transform γ into a 3-SAT instance in polynomial time, possibly adding fresh atomic propositions. We have built θ=Ψ∧Γ in polynomial time; it remains to prove that θ is satisfiable iff ϕ is.
(←) Suppose \((\mathcal {M},\omega _{j*})\models \phi \), with \(\mathcal {M}=(\Omega,P(.))\). We can extend \(\mathcal {M}\) to satisfy θ. For each ω _{ j }∈Ω, make ω _{ j }⊧B _{ i } iff \((\mathcal {M},\omega _{j})\models \psi _{i}\) to form Ω ^{′}. Create a structure \(\mathcal {M'}=(\Omega ',P(.))\). Observe that \((\mathcal {M'},\omega _{j*})\models P(B_{i})\geq \alpha _{i}\) iff \((\mathcal {M'},\omega _{j*})\models P(\psi _{i})\geq \alpha _{i}\), for all i. It follows that \((\mathcal {M'},\omega _{j*})\models \Psi \). As ψ _{ i }⇔B _{ i } holds in all worlds ω _{ j }, for all i, \((\mathcal {M'},\omega _{j*})\models P(\gamma)\geq 1\). Thus \((\mathcal {M'},\omega _{j*})\models \theta \).
(→) Now suppose \((\mathcal {M},\omega _{j*})\models \theta \), with \(\mathcal {M}=(\Omega,P(.))\). For all ω _{ j }∈Ω with P(ω _{ j })>0, since \((\mathcal {M},\omega _{j*})\models P((\lnot \psi _{i}\lor B_{i})\land (\psi _{i}\lor \lnot B_{i}))\geq 1\), \((\mathcal {M},\omega _{j})\models B_{i}\) iff \((\mathcal {M},\omega _{j})\models \psi _{i}\), for all i. Hence \((\mathcal {M},\omega _{j*})\models P(B_{i})\geq \alpha _{i}\) iff \((\mathcal {M},\omega _{j*})\models P(\psi _{i})\geq \alpha _{i}\), for all i. Finally, \((\mathcal {M},\omega _{j*})\models \phi \).
Note that a GPSAT instance in normal form has no proposition outside the scope of a probabilistic assignment. This means that if \((\mathcal {M},\omega _{j*})\models \phi \), for some \(\mathcal {M}=(\Omega,P(.))\) and ϕ=Ψ∧Γ in normal form, then \((\mathcal {M},\omega _{j})\models \phi \) for all ω _{ j }∈Ω. We may simply write \(\mathcal {M}\models \phi \) in such a case.
The normal form allows us to see a GPSAT instance Ψ∧P(γ)=1 as an interaction between a probability problem (represented by Ψ) and a SAT instance γ. Solutions to the instance can be seen as solutions to Ψ constrained by the SAT instance γ. This is formalized as follows.
Lemma 8.
A normal form GPSAT instance ϕ=Ψ∧P(γ)=1, with q clauses in Ψ, is satisfiable iff there is a structure \(\mathcal {M}=(\Omega,P(.))\) such that |Ω|≤q+1, \(\mathcal {M}\models \Psi \), and ω⊧γ for all ω∈Ω.
Proof.
(←) Suppose first there is a structure \(\mathcal {M}=(\Omega,P(.))\) such that |Ω|≤q+1, \(\mathcal {M}\models \Psi \), and ω⊧γ for all ω∈Ω. Trivially, \(\mathcal {M}\models P(\gamma)=1\) and \(\mathcal {M}\models \phi \).
(→) Now suppose that ϕ=Ψ∧P(γ)=1 is satisfiable. Then there is a structure \(\mathcal {M}=(\Omega,P(.))\) such that (Ω,P(.))⊧Ψ∧P(γ)=1. Without loss of generality, we can suppose that P(ω)>0 for all ω∈Ω—one can just rule out any ω∈Ω with zero probability. If there were an ω∈Ω such that ω⊧¬γ, then (Ω,P(.))⊧P(¬γ)>0 and thus (Ω,P(.))⊮P(γ)=1, a contradiction. So, ω⊧γ for all ω∈Ω. As (Ω,P(.))⊧Ψ, (Ω,P(.)) satisfies at least one probabilistic atom per clause in Ψ. Let \(P(A_{i_{j}})\bowtie _{i_{j}}\alpha _{i_{j}}\) be a probabilistic assignment satisfied in the jth clause, with \(\bowtie _{i_{j}} \in \{\geq,<\}\), so \((\Omega,P(.))\models P(A_{i_{j}})\bowtie _{i_{j}} \alpha _{i_{j}}\), for 1≤j≤q. Suppose \(\mathcal {M}=(\Omega,P(.))\) is such that \((\Omega,P(.))\models P(A_{i_{j}})=\beta _{i_{j}}\), for 1≤j≤q. Using Carathéodory’s Lemma [8], there is an \(\mathcal {M}'=(\Omega ',P'(.))\) such that Ω ^{′}⊆Ω, |Ω ^{′}|≤q+1 and \((\Omega ',P'(.))\models P(A_{i_{j}})=\beta _{i_{j}}\) for all 1≤j≤q. Hence, \(\mathcal {M}'\models P(A_{i_{j}})\bowtie _{i_{j}}\alpha _{i_{j}}\) for 1≤j≤q and \(\mathcal {M}'\models \Psi \). Finally, as Ω ^{′}⊆Ω, ω⊧γ for all ω∈Ω ^{′}.
A multiagent setup
The language of our logic can be extended to deal with more than one probability measure. This may be useful when different agents assign different probability measures to the same set of events. For instance, we can modify Problem 1 to consider three probability measures, one for each friend.
Problem 9.
Three friends, Alice, Bob, and Charlie, go to the pub every day, and they have two options: “Bar Phi” and “Bar Not Phi”. Each friend goes to exactly one bar per night. Talking about tonight, each friend makes his or her own probability assessment: Alice says that she goes to “Bar Phi” with probability 6/7, Bob says he goes to “Bar Phi” with probability 5/7, and Charlie states that the probability of him drinking at “Bar Phi” is 4/7. Furthermore, the three friends agree that exactly two of them will be at “Bar Phi” tonight.
Question: Are Alice, Bob, and Charlie being consistent?
The language \(\mathcal {L}_{N}\) is defined as \(\mathcal {L}\), except that the probabilistic formation rule now introduces one probability operator per agent:
If \(\phi \in \mathcal {L}_{N}\), then \((P_{i}(\phi)\bowtie \alpha) \in \mathcal {L}_{N}\), for ⋈∈{≤,≥}, \(\alpha \in [0,1]\cap \mathbb {Q}\) and i∈{1,2,…,N}.
The semantic clause for probability is indexed accordingly, a structure now carrying one probability measure P _{ i } per agent:
\((\mathcal {M},\omega)\models P_{i}(\phi)\bowtie \alpha \) if \(\sum \{P_{i}(\omega _{j}) \mid (\mathcal {M},\omega _{j})\models \phi \}\bowtie \alpha \), for ⋈∈{≤,≥}, \(\alpha \in [0,1]\cap \mathbb {Q}\) and i∈{1,2,…,N}.
It is clear that, if N=1, the original GPSAT logic is recovered. Using the language \(\mathcal {L}_{3}\), we can model Problem 9:
Example 10 (Friends’ Favorite Bar Revisited).
Differently from Example 4, each of the three friends is individually consistent. However, they do not agree with each other’s probabilities; for instance, P _{1}(ϕ _{ A })=6/7≠2/7=P _{2}(ϕ _{ A }). □
In practice, Example 10 can be seen as three separate GPSAT instances. As the three probability measures are independent of each other, the consistency check can take place within the probability assignments of each single agent. Alice’s probability measure has to satisfy only P(ϕ _{ A })=6/7, P(¬ϕ _{ A }∨¬ϕ _{ B }∨¬ϕ _{ C })=1 and P(ϕ _{ i }∨ϕ _{ j })=1 for i≠j. Analogously, Bob’s probability measure must model P(ϕ _{ B })=5/7, and Charlie’s, P(ϕ _{ C })=4/7, each together with the constraint that exactly two of the friends are going to “Bar Phi” tonight. Indeed, this correspondence between satisfiability of formulas in \(\mathcal {L}_{N}\) and GPSAT can be formalized as a polynomial reduction. To prove that, we start from formulas \(\phi \in \mathcal {L}_{N}\) without nesting, as nesting can be eliminated with a slight modification of Lemma 6:
Lemma 11.
For every \(\phi \in \mathcal {L}_{N}\), for a fixed \(N\in \mathbb {N}\), there exists a \(\theta \in \mathcal {L}_{N}\) in which no probabilistic formula is a subformula of another probabilistic formula, such that ϕ is satisfiable if, and only if, θ is. Furthermore, θ is computed in polynomial time and \((\mathcal {M},\omega)\models \theta \) implies \((\mathcal {M},\omega)\models \phi \).
Proof.
The proof is by induction: we show how to decrease the number of nested probabilities, preserving satisfiability and the connection between the models. Given a formula ϕ with nested probabilities, construct ϕ ^{′} by substituting a new atomic proposition B for a basic probabilistic formula P _{ i }(ψ)⋈α that is a subformula of another basic probabilistic formula P _{ j }(ψ ^{′})⋈^{′} α ^{′}, with 1≤i,j≤N and ⋈,⋈^{′}∈{≤,≥}. Define \(\phi ''=\phi '\land ((P_{j}(B)\geq 1)\lor (P_{j}(B)\leq 0))\land (\lnot (P_{j}(B)\geq 1) \lor (P_{i}(\psi)\bowtie \alpha))\land ((P_{j}(B)\geq 1) \lor \lnot (P_{i}(\psi)\bowtie \alpha))\). Clearly, this can be done in polynomial time. Now we need to prove that ϕ ^{″} is satisfiable iff ϕ is.
(←) Suppose \((\mathcal {M},\omega ^{*})\models \phi \), with \(\mathcal {M}=(\Omega,\Pi)\). We can change \(\mathcal {M}\) to satisfy ϕ ^{″}. For each ω∈Ω, make ω⊧B iff \((\mathcal {M},\omega ^{*})\models P_{i}(\psi)\bowtie \alpha \) to form Ω ^{′}. Create a structure \(\mathcal {M'}=(\Omega ',\Pi)\). If \((\mathcal {M},\omega ^{*})\models P_{i}(\psi)\bowtie \alpha \), then \((\mathcal {M'},\omega)\models B\) for all ω∈Ω ^{′} and \((\mathcal {M'},\omega ^{*})\models P_{j}(B)\geq 1\). Else, \((\mathcal {M'},\omega)\models \lnot B\) for all ω∈Ω ^{′}, and \((\mathcal {M'},\omega ^{*})\models P_{j}(B)\leq 0\). In either case, \((\mathcal {M'},\omega ^{*})\models ((P_{j}(B)\geq 1)\lor (P_{j}(B)\leq 0))\land (\lnot (P_{j}(B)\geq 1) \lor (P_{i}(\psi)\bowtie \alpha))\land ((P_{j}(B)\geq 1) \lor \lnot (P_{i}(\psi)\bowtie \alpha))\). Furthermore, as \((\mathcal {M'},\omega)\models P_{i}(\psi)\bowtie \alpha \) iff \((\mathcal {M'},\omega)\models B\), for all ω∈Ω ^{′}, it is the case that \((\mathcal {M'},\omega ^{*})\models \phi '\). Hence, \((\mathcal {M'},\omega ^{*})\models \phi ''\).
(→) Suppose now \((\mathcal {M},\omega ^{*})\models \phi ''\). Note that the last two clauses in ϕ ^{″} state that P _{ j }(B)≥1⇔P _{ i }(ψ)⋈α. Therefore, \((\mathcal {M},\omega ^{*})\models P_{j}(B)\geq 1\) iff \((\mathcal {M},\omega ^{*})\models P_{i}(\psi)\bowtie \alpha \). But \((\mathcal {M},\omega ^{*})\models P_{j}(B)\geq 1\) iff \((\mathcal {M},\omega)\models P_{j}(B)\geq 1\) for every ω∈Ω; and \((\mathcal {M},\omega ^{*})\models P_{i}(\psi)\bowtie \alpha \) iff \((\mathcal {M},\omega)\models P_{i}(\psi)\bowtie \alpha \) for every ω∈Ω. Hence, for every ω∈Ω, \((\mathcal {M},\omega)\models P_{j}(B)\geq 1\) iff \((\mathcal {M},\omega)\models P_{i}(\psi)\bowtie \alpha \). Then, due to the clause ((P _{ j }(B)≥1)∨(P _{ j }(B)≤0)), for every ω∈Ω with P _{ j }(ω)>0, we have that \((\mathcal {M},\omega)\models B\) iff \((\mathcal {M},\omega)\models P_{i}(\psi)\bowtie \alpha \). Finally, as \((\mathcal {M},\omega ^{*})\models \phi '\), \((\mathcal {M},\omega ^{*})\models \phi \).
We now prove the desired reduction.
Theorem 12.
Given a formula \(\phi \in \mathcal {L}_{N}\), for a fixed \(N\in \mathbb {N}\), there is a formula \(\theta \in \mathcal {L}\) such that ϕ is satisfiable if, and only if, θ is; furthermore, θ is computed in polynomial time.
Proof.
Using Lemma 11, we first eliminate nested probabilities from ϕ, preserving its (un)satisfiability, and this is done in polynomial time. Suppose, without loss of generality, that ϕ is built using n atomic propositions A _{1},A _{2},…,A _{ n }. To construct θ from ϕ, for each i∈{1,2,…,N}, transform any probabilistic atom P _{ i }(φ)⋈α into P(φ ^{ i })⋈α, in which φ ^{ i } is constructed from φ by replacing each A _{ j } by A _{(i−1)n+j }. Note that \(\theta \in \mathcal {L}\) has N disjoint sets of n atomic propositions, and each set is used to build formulas whose probability was constrained in ϕ for a different probability measure. Clearly, θ is computed in polynomial time in the length of ϕ. It remains to prove that θ is satisfiable if, and only if, ϕ is.
(→) Suppose θ is satisfiable, so there is a structure \(\mathcal {M}=(\Omega,P(.))\) such that \((\mathcal {M},\omega ^{*})\models \theta \) for some ω ^{∗}∈Ω. Let \(\Omega '=\{\omega _{1},\omega _{2},\dots,\omega _{2^{n}}\}\) be the set of all possible valuations over A _{1},A _{2},…,A _{ n }—the atomic propositions in ϕ. For each i∈{1,2,…,N} and all 1≤j≤2^{ n }, make \(P_{i}(\omega _{j})=\sum \{P(\omega)\mid \omega \in \Omega \text { s.t. } \omega (A_{(i-1)n+k})=\omega _{j}(A_{k}) \text { for all } 1\leq k\leq n\}\). Make Π={P _{1},P _{2},…,P _{ N }} and \(\mathcal {M}'=(\Omega ',\Pi)\). It follows that, for any probabilistic atom P _{ i }(φ)⋈α in ϕ, \(\mathcal {M}'\models P_{i}(\varphi)\bowtie \alpha \) iff \(\mathcal {M}\models P(\varphi ^{i})\bowtie \alpha \), where P(φ ^{ i })⋈α is the corresponding probabilistic atom in θ. Let ω ^{′∗}∈Ω ^{′} be such that ω ^{′∗}(A _{ k })=ω ^{∗}(A _{ k }) for all 1≤k≤n. Note that every atomic proposition outside the scope of a probability assignment in θ is satisfied in \((\mathcal {M},\omega ^{*})\) iff it is satisfied in \((\mathcal {M}',\omega '^{*})\). We can conclude that \((\mathcal {M}',\omega '^{*}) \models \phi \).
(←) Now suppose ϕ is satisfiable, so there is a structure \(\mathcal {M}=(\Omega,\Pi)\), with Π={P _{1},P _{2},…,P _{ N }}, such that \((\mathcal {M},\omega ^{*})\models \phi \) for some ω ^{∗}∈Ω. Without loss of generality, we assume \(\Omega =\{\omega _{1},\omega _{2},\dots,\omega _{2^{n}}\}\), since if fewer valuations are needed to satisfy ϕ, others can be inserted with zero probability. Make \(\Omega '=\{\omega _{j_{1},j_{2},\dots,j_{N}}\mid 1\leq j_{k}\leq 2^{n}, 1\leq k\leq N\}\) such that \(\omega _{j_{1},j_{2},\dots,j_{N}}(A_{(i-1)n+k})=\omega _{j_{i}}(A_{k})\), for \(\omega _{j_{i}}\in \Omega \). That is, each valuation \(\omega _{j_{1},j_{2},\dots,j_{N}}\in \Omega '\) can be seen as the aggregation of partial valuations \(\omega _{j_{1}},\dots,\omega _{j_{N}}\) over the disjoint sets of atomic propositions A _{1},…,A _{ n }, A _{ n+1},…,A _{2n }, up to A _{(N−1)n+1},…,A _{ Nn }. Each partial valuation is a “translation” of \(\omega _{j_{k}}\in \Omega \) to the kth set of atomic propositions A _{(k−1)n+1},…,A _{ kn }. Let \(P(\omega _{j_{1},j_{2},\dots,j_{N}})\) be equal to the product \(P_{1}(\omega _{j_{1}})\times P_{2}(\omega _{j_{2}})\times \dots \times P_{N}(\omega _{j_{N}})\). Create a structure \(\mathcal {M}'=(\Omega ',P(.))\). As \(\sum _{j=1}^{2^{n}} P_{i}(\omega _{j})=1\) for all 1≤i≤N, it follows that (Ω,Π)⊧P _{ k }(φ)⋈α iff (Ω ^{′},P(.))⊧P(φ ^{ k })⋈α, in which φ ^{ k } is constructed from φ by replacing A _{ j } by A _{(k−1)n+j } for all 1≤j≤n. Finally, suppose ω ^{∗}=ω _{ q }∈Ω. There is an \(\omega '^{*}=\omega _{q,j_{2},\dots,j_{N}} \in \Omega '\) such that every atomic proposition outside the scope of a probability assignment in ϕ is satisfied in \((\mathcal {M},\omega ^{*})\) iff it is satisfied in \((\mathcal {M}',\omega '^{*})\); this yields \((\mathcal {M'},\omega '^{*})\models \theta \), finishing the proof.
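The product-measure construction used in this direction of the proof can be checked on a toy example. The sketch below uses our own encoding (each measure is a list of probabilities indexed by valuation) and verifies the key property: the joint measure over aggregated valuations recovers each P _{ i } as a marginal on that agent's block of atoms, which is what makes P(φ ^{ k })⋈α equivalent to P _{ k }(φ)⋈α.

```python
from fractions import Fraction
from itertools import product

def product_measure(measures):
    """Joint measure over aggregated valuations (w_{j_1},...,w_{j_N}):
    P(w_{j_1},...,w_{j_N}) = P_1(w_{j_1}) * ... * P_N(w_{j_N}).
    Each measure is a list of probabilities indexed by valuation."""
    joint = {}
    for idx in product(*(range(len(m)) for m in measures)):
        p = Fraction(1)
        for i, j in enumerate(idx):
            p *= measures[i][j]          # multiply the per-agent factors
        joint[idx] = p
    return joint

def marginal(joint, i, size):
    """Marginal of the joint measure on the i-th agent's valuation index."""
    out = [Fraction(0)] * size
    for idx, p in joint.items():
        out[idx[i]] += p
    return out
```

Since each P _{ i } sums to 1, the product over the other agents' indices telescopes away, so every marginal equals the original measure.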
This result enables us to focus on how to solve GPSAT, as any formula of \(\mathcal {L}_{N}\), with probability assignments for different probability measures, can be transformed into a GPSAT instance in polynomial time.
Methods
Assume our GPSAT instance is in normal form with q clauses, each with three (possibly negated) assessments {P(A _{ j })≥α _{ j }} and a sentence γ in CNF with m clauses, each clause with three literals. So our problem is parameterized by the number of atomic propositions, n, the number of clauses with probability assessments, q, and the number of 3clauses in γ, m. Such a parameterized normal form neatly separates the probabilistic and the propositional aspects of general probabilistic satisfiability.
Hence, we have 3q(q+1) optimization variables (values of ω _{ j }(A _{ i }), denoted by a _{ i,j }), all of them binary. Furthermore, we have a probability measure over q+1 truth assignments, represented by the real-valued variables p _{1},…,p _{ q+1}∈[0,1], which must sum to 1. Following the approach of [6], we find {a _{ i,j }} and {p _{ j }} by solving a mixed integer linear program.
Note that, for each k, (10) is a disjunction of constraints and may contain strict inequalities—neither of which is standard in integer linear programming formulations. Since probabilities are bounded, one can eliminate disjunctions of inequalities by adding fresh integer variables, for instance. A strict inequality like \(\sum _{j=1}^{q+1}\omega _{j}(A_{i})P(\omega _{j}) < \alpha _{i}\) can be replaced by the non-strict \(\sum _{j=1}^{q+1}\omega _{j}(A_{i})P(\omega _{j}) \leq \alpha _{i} - \epsilon \) for a suitably chosen ε>0. We leave these tasks for the integer programming solver to handle. Thus we do not formally reduce GPSAT to mixed integer linear programming, but rather use techniques of the latter (described for instance in ref. [24]) to solve the former.
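As an illustration of the first task, a disjunction such as (x≥α)∨(x≤β) over a probability variable x∈[0,1] can be linearized with one fresh binary variable y and a big-M constant. This is the standard textbook encoding, not code from the paper; the sketch checks feasibility by brute force over y, which is exactly what makes the encoding equivalent to the disjunction.

```python
def disjunction_feasible(x, alpha, beta, big_m=1.0):
    """True iff some y in {0, 1} satisfies the big-M linearization of
    (x >= alpha) OR (x <= beta):
        x >= alpha - big_m * (1 - y)   # binding when y = 1
        x <= beta  + big_m * y         # binding when y = 0
    For x in [0, 1], big_m = 1 is large enough to deactivate either side."""
    return any(x >= alpha - big_m * (1 - y) and x <= beta + big_m * y
               for y in (0, 1))
```

Sampling x over [0,1] confirms that the pair of linear constraints is feasible exactly when x≥α or x≤β holds.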
Consider the m inequalities generated this way (one per clause in γ). A vector a _{ j } that satisfies these m inequalities yields a truth assignment ω _{ j } for γ with ω _{ j }(A _{ i })=a _{ i,j }: A _{ i } is assigned true when a _{ i,j } is one, and false when a _{ i,j } is zero.
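The correspondence between a 3-clause and its linear inequality can be made concrete. In the sketch below (our own literal encoding as signed indices), a positive literal A _{ i } contributes a _{ i,j } and a negated one contributes 1−a _{ i,j }; the clause is satisfied exactly when the contributions sum to at least 1.

```python
def clause_holds(clause, a):
    """Evaluate the linear-inequality form of a CNF clause.
    clause: signed atom indices (negative = negated);
    a: dict mapping atom index -> 0/1 (the values a_{i,j} of one valuation).
    Returns True iff the 0/1 contributions sum to >= 1."""
    total = sum(a[abs(lit)] if lit > 0 else 1 - a[abs(lit)]
                for lit in clause)
    return total >= 1
```

For the clause A _{1}∨¬A _{2}∨A _{3}, the all-false assignment satisfies it through ¬A _{2}, while setting only A _{2} to true falsifies it.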
for each clause 1≤k≤q and with ⋈_{ i }∈{<,≥}.
The whole procedure is presented in Algorithm ??; it basically collects constraints from Expressions (11), (15), (16), and (17). Note that variables a _{ i,j }∈{0,1} are restricted to be integers, while b _{ i,j } and p _{ j } range within the real interval [0,1], leading to an instance of mixed integer linear programming. The algorithm produces a MILP instance that has a solution if and only if the original GPSAT instance is satisfiable.
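To make the overall flow concrete, the toy checker below brute-forces a miniature version of the problem: it enumerates the integer variables (the truth values of two candidate valuations) and then solves the remaining one-dimensional linear program over p _{1} exactly, with p _{2}=1−p _{1}. This is purely illustrative code of our own—the algorithm delegates this search to a MILP solver—and restricting to two valuations is sufficient only for very small instances; in general q+1 valuations are needed.

```python
from fractions import Fraction
from itertools import product

def toy_psat_two_valuations(assessments, n):
    """Brute-force satisfiability over TWO candidate valuations.
    assessments: list of (atom, op, alpha), op in ('>=', '<='), meaning
    P(A_atom) op alpha.  Enumerates the 0/1 entries of both valuations
    (the integer variables), then intersects the interval constraints
    each assessment induces on p1, with p2 = 1 - p1 (the LP part)."""
    for a1 in product((0, 1), repeat=n):        # valuation omega_1
        for a2 in product((0, 1), repeat=n):    # valuation omega_2
            lo, hi = Fraction(0), Fraction(1)   # feasible interval for p1
            ok = True
            for atom, op, alpha in assessments:
                c1, c2 = a1[atom - 1], a2[atom - 1]
                # P(A_atom) = c1*p1 + c2*(1 - p1) = (c1 - c2)*p1 + c2
                coeff, bound = Fraction(c1 - c2), alpha - c2
                if op == '<=':                  # rewrite as a >= constraint
                    coeff, bound = -coeff, -bound
                if coeff > 0:
                    lo = max(lo, bound / coeff)
                elif coeff < 0:
                    hi = min(hi, bound / coeff)
                elif bound > 0:                 # constant constraint fails
                    ok = False
                    break
            if ok and lo <= hi:
                return True
    return False
```

For instance, P(A _{1})≥0.6 together with P(A _{2})≥0.3 is satisfiable (put all mass on the all-true valuation), while P(A _{1})≥0.6 together with P(A _{1})≤0.4 is not.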

variables a _{ i,j }: q+1 valuations times n atomic propositions;
variables b _{ i,j }: q+1 valuations times 3q atomic propositions in Ψ;
variables p _{ j }: one per valuation, q+1.

loop at lines ??–??: (q+1)m inequalities with three variables each;
loop at lines ??–??: q disjunctions of three inequalities with q+1 variables each;
loop at lines ??–??: (q+1)3q pairs of inequalities on three variables each;
command at line ??: 1 equality with q+1 variables.
Summing up, Algorithm ?? generates (m+6q+1)(q+1)+2q=O((m+q)q) inequalities on (n+3q+1)(q+1)=O(nq) variables. Therefore, the resulting mixed integer linear program has polynomial size and is built in polynomial time in the length of the corresponding GPSAT instance. Note that the total running time of Algorithm ?? depends largely on the command at line ??, which calls a solver for the MILP instance; as the decision version of mixed integer linear programming is itself an NP-complete problem, it may take exponential time.
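The counts above can be tallied directly; the helper below is only a restatement of this arithmetic.

```python
def milp_size(n, q, m):
    """Inequality and variable counts for the MILP built from a
    normal-form GPSAT instance (n atoms, q probabilistic clauses,
    m 3-clauses in gamma), restating the tally above."""
    inequalities = (m + 6 * q + 1) * (q + 1) + 2 * q
    variables = (n + 3 * q + 1) * (q + 1)   # a_{i,j}, b_{i,j}, and p_j
    return inequalities, variables
```

Both quantities grow polynomially, which is what keeps the construction itself tractable even though solving the MILP may not be.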
Results and discussion
We have coded our GPSAT method in the Java language with calls to CPLEX version 12, extending the implementation from [6] for PSAT, and ran experiments on an iMac computer with 4 GBytes of memory.
We were particularly interested in investigating whether phase transition phenomena can be identified in our solutions to GPSAT. Until the recent work of Finger and De Bona [11], there was little evidence of phase transition for PSAT in the literature. Baioletti et al. [2] have shown empirical results for the 2CPA problem—which is equivalent to PSAT in normal form with only two literals per clause—in which a typical phase transition shape is seen. However, 2CPA is not exactly a normal form of PSAT, even though it is NP-complete. As our normal form extends Finger and De Bona’s, and GPSAT is also NP-complete, we expect to find evidence of phase transition for GPSAT as well. Consequently, we examine the behavior of GPSAT using instances Ψ∧Γ in normal form for various values of m/n with fixed q and n, looking for hard instances instead of randomly trying out large instances that may in the end be easy.
An easy-hard-easy pattern, typical of phase transition phenomena, is seen in Figs. 1, 2, 3, 4, and 5. The hardest instances were found close to the transition between regions of satisfiable and unsatisfiable instances; furthermore, unsatisfiable instances were found to be harder to solve than satisfiable ones. If q=0, the problem is Boolean satisfiability, and the point of approximately 50 % satisfiable instances is around m/n=4.3 [12]. Here we add probabilistic constraints, so the phase transition point has moved leftwards, as expected.
Compared to the results in ref. [6], solving GPSAT seems harder than PSAT. This is not surprising, since disjunctive linear programming is known to be considerably harder than linear programming.
Conclusions
In this paper, we have introduced the generalized probabilistic satisfiability problem, a normal form for it, and an algorithm to solve it. GPSAT is considerably more expressive than PSAT, allowing negation and disjunction of probabilistic assessments, as well as probabilistic nesting, but it is still NP-complete. Building on an integer linear programming solution to PSAT, we introduced an integer linear programming solution for GPSAT. Evidence for phase transition was found in our initial experiments, but an exhaustive investigation is needed to confirm and understand this phenomenon in GPSAT and PSAT.
As future work, it would be interesting to enlarge our language to include linear combinations of probabilistic assignments and conditional probabilities. Another path would be to generalize the coherence checking problem within de Finetti’s framework for probability, as we did for PSAT.
Endnotes
^{1} ω(¬ϕ)=1 iff ω(ϕ)=0; ω(ϕ _{1}∧ϕ _{2})=1 iff ω(ϕ _{1})=1 and ω(ϕ _{2})=1; ω(ϕ _{1}∨ϕ _{2})=1 iff ω(ϕ _{1})=1 or ω(ϕ _{2})=1; ω(ϕ _{1}→ϕ _{2})=1 iff ω(ϕ _{1})=0 or ω(ϕ _{2})=1.
^{2} For simplicity, we left the connective → outside the language, without loss of generality.
^{3} If a solution can be reached with fewer than q+1 valuations with positive probability, then some a _{ j } can be repeated with zero probability.
Declarations
Acknowledgements
GDB is supported by CAPES. FGC is partially supported by CNPq. MF is partially supported by CNPq grant PQ 302553/2010-0.
Authors’ Affiliations
References
 Andersen KA, Pretolani D (2001) Easy cases of probabilistic satisfiability. Ann Math Artif Intell 33(1): 69–91.
 Baioletti M, Capotorti A, Tiberi P, Tulipani S (2004) An empirical complexity for a 2CPA solver. 10th International Conference IPMU: 1857–1864.
 Boole G (1958) An Investigation of the Laws of Thought. Dover, New York.
 Bruno G, Gilio A (1980) Applicazione del metodo del simplesso al teorema fondamentale per le probabilità nella concezione soggettivistica. Statistica 40: 337–344.
 Chandru V, Hooker J (1999) Optimization Methods for Logical Inference. John Wiley & Sons Inc. ISBN: 9780471570356.
 Cozman FG, di Ianni LF (2013) Probabilistic satisfiability and coherence checking through integer programming. In: Symbolic and Quantitative Approaches to Reasoning with Uncertainty, 145–156, The Netherlands.
 De Bona G, Cozman FG, Finger M (2014) Towards classifying propositional probabilistic logics. J Appl Logic 12(3): 349–362.
 Eckhoff J (1993) Helly, Radon, and Carathéodory type theorems. In: Gruber P, Wills J (eds) Handbook of Convex Geometry, 389–448. North-Holland, Amsterdam.
 Fagin R, Halpern JY, Megiddo N (1990) A logic for reasoning about probabilities. Inform Comput 87: 78–128.
 Fagin R, Halpern JY (1994) Reasoning about knowledge and probability. J ACM 41(2): 340–367.
 Finger M, De Bona G (2011) Probabilistic satisfiability: logic-based algorithms and phase transition. In: IJCAI, 528–533. doi:10.5591/978-1-57735-516-8/IJCAI11-096.
 Gent IP, Walsh T (1994) The SAT phase transition. In: 11th European Conference on Artificial Intelligence, 105–109.
 Georgakopoulos G, Kavvadias D, Papadimitriou CH (1988) Probabilistic satisfiability. J Complexity 4: 1–11.
 Hailperin T (1965) Best possible inequalities for the probability of a logical function of events. Am Math Mon 72: 343–359.
 Hailperin T (1976) Boole’s Logic and Probability: A Critical Exposition from the Standpoint of Contemporary Algebra, Logic, and Probability Theory. North-Holland, Amsterdam.
 Halpern JY (2003) Reasoning about Uncertainty. MIT Press, Cambridge, Massachusetts.
 Hansen P, Jaumard B (1996) Probabilistic Satisfiability. Technical Report G-96-31, Les Cahiers du GERAD, École Polytechnique de Montréal.
 Hansen P, Perron S (2008) Merging the local and global approaches to probabilistic satisfiability. Int J Approximate Reasoning 47(2): 125–140.
 Jaumard B, Hansen P, de Aragão MP (1991) Column generation methods for probabilistic logic. ORSA J Comput 3(2): 135–148.
 Lukasiewicz T (2008) Expressive probabilistic description logics. Artif Intell 172(6–7): 852–883.
 Ng R, Subrahmanian VS (1992) Probabilistic logic programming. Inform Comput 101(2): 150–201.
 Nilsson NJ (1986) Probabilistic logic. Artif Intell 28: 71–87.
 Uchii S (1973) Higher order probabilities and coherence. Philos Sci 40(3): 373–381.
 Williams HP (2009) Logic and Integer Programming. Springer. ISBN 9780387922805.
Copyright
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.