Probabilistic Ontology: A Simple Example

Bayesian Networks

Bayesian networks have been successfully applied to create consistent probabilistic representations of uncertain knowledge in a wide range of applications.
Bayesian networks (BNs) provide a means of parsimoniously expressing joint probability distributions over many interrelated hypotheses. A Bayesian network consists of a directed acyclic graph (DAG) and a set of local distributions. Each node in the graph represents a random variable. A random variable denotes an attribute, feature, or set of hypotheses about which we may be uncertain. Each random variable has a set of mutually exclusive and collectively exhaustive possible values. That is, exactly one of the possible values is or will be the actual value, and we are uncertain about which one it is. The graph represents direct qualitative dependence relationships; the local distributions represent quantitative information about the strength of those dependencies. The graph and the local distributions together represent a joint probability distribution over the random variables denoted by the nodes of the graph.
Figure 1 shows an example of a BN representing part of a highly simplified ontology for wines and pizzas. In this toy example, inspired by the wine and pizza ontologies, we assume that domain knowledge about gastronomy was gathered from sources such as statistical data collected from restaurants and expert judgment of sommeliers and pizzaiolos. The resulting knowledge base expresses a probability distribution relating features of the pizzas ordered by customers (i.e., type of base and topping) and characteristics of the wines ordered to accompany the pizzas. Figure 1a shows prior probability distributions based on the background information in the knowledge base. Figure 1b represents a situation in which a customer requests a pizza with cheese topping and a thin and crispy base. Using the probability distribution stored in the BN of Figure 1, the waiter can apply Bayes rule to infer the best type of wine to offer the customer given his pizza preferences and the body of statistical and expert information previously linking features of pizza to wines.
A Bayesian network provides a parsimonious way to express the joint distribution and a computationally efficient way to implement Bayes rule. The result of Bayesian inference is shown in Figure 1b, where evidence of the customer's order points to Beaujolais as the most likely wine the customer would order, followed by Cabernet Sauvignon, and so on. We can see that knowledge of the customer's pizza choice has increased the likelihood of Beaujolais and decreased the likelihood of Chardonnay and Bordeaux.
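For reference, the two ideas in this paragraph can be written compactly. The notation is ours rather than the figure's: X_1, ..., X_n are the nodes of the graph, pa(X_i) denotes the parents of X_i, W is the wine variable being queried, e is the observed evidence (base and topping), and h ranges over the values of the remaining unobserved variables.

```latex
% Joint distribution as a product of the local distributions (one per node)
P(X_1, \dots, X_n) \;=\; \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{pa}(X_i)\bigr)

% Posterior for the query variable W given evidence e (Bayes rule),
% obtained by summing the joint over the unobserved variables h
P(W = w \mid e) \;=\; \frac{\sum_{h} P(W = w,\, h,\, e)}{\sum_{w'} \sum_{h} P(W = w',\, h,\, e)}
```

The first line is what makes the representation parsimonious: each factor involves only a node and its parents, so the full joint table never has to be stored.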
a: Prior Probabilities
b: Posterior Probabilities Given Base and Topping
Figure 1: Bayesian Network for Pizza and Wine
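
The inference step can be sketched in a few lines of Python. This is a minimal illustration by brute-force enumeration, not the network of Figure 1: the graph structure (Topping -> Color, Base -> Body, Color and Body -> Wine), the variable names, and every probability below are assumptions invented for the example.

```python
from itertools import product

# Hypothetical network; structure and numbers are illustrative assumptions only.
# Each node lists its parents and a CPT keyed by the tuple of parent values.
NETWORK = {
    "Topping": {"parents": [], "cpt": {(): {"cheese": 0.6, "meat": 0.4}}},
    "Base": {"parents": [], "cpt": {(): {"thin": 0.5, "deep": 0.5}}},
    "Color": {"parents": ["Topping"], "cpt": {
        ("cheese",): {"red": 0.9, "white": 0.1},
        ("meat",): {"red": 0.75, "white": 0.25}}},
    "Body": {"parents": ["Base"], "cpt": {
        ("thin",): {"light": 0.7, "full": 0.3},
        ("deep",): {"light": 0.2, "full": 0.8}}},
    "Wine": {"parents": ["Color", "Body"], "cpt": {
        ("red", "light"): {"Beaujolais": 0.8, "Cabernet": 0.15, "Chardonnay": 0.05},
        ("red", "full"): {"Beaujolais": 0.1, "Cabernet": 0.85, "Chardonnay": 0.05},
        ("white", "light"): {"Beaujolais": 0.05, "Cabernet": 0.05, "Chardonnay": 0.9},
        ("white", "full"): {"Beaujolais": 0.05, "Cabernet": 0.05, "Chardonnay": 0.9}}},
}


def domain(var):
    """Possible values of a variable, read from the first row of its CPT."""
    first_row = next(iter(NETWORK[var]["cpt"].values()))
    return list(first_row)


def joint(assignment):
    """Probability of a complete assignment: product of the local distributions."""
    p = 1.0
    for var, spec in NETWORK.items():
        parent_values = tuple(assignment[q] for q in spec["parents"])
        p *= spec["cpt"][parent_values][assignment[var]]
    return p


def posterior(query, evidence):
    """P(query | evidence), summing the joint over all unobserved variables."""
    hidden = [v for v in NETWORK if v != query and v not in evidence]
    scores = {}
    for value in domain(query):
        total = 0.0
        for combo in product(*(domain(h) for h in hidden)):
            assignment = {**evidence, query: value, **dict(zip(hidden, combo))}
            total += joint(assignment)
        scores[value] = total
    z = sum(scores.values())  # normalizing constant, P(evidence)
    return {value: s / z for value, s in scores.items()}


prior = posterior("Wine", {})
post = posterior("Wine", {"Topping": "cheese", "Base": "thin"})
for wine in sorted(post, key=post.get, reverse=True):
    print(f"{wine}: prior={prior[wine]:.3f} posterior={post[wine]:.3f}")
```

With these made-up numbers the evidence raises the probability of Beaujolais and lowers that of Chardonnay, mirroring the qualitative behavior described for Figure 1b. Enumeration is exponential in the number of unobserved variables; practical BN tools use more efficient algorithms that exploit the graph structure.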

Although this is just a toy example, it demonstrates how incomplete information about a domain can be used to improve our decisions. In an ontology without uncertainty, there would not be enough information for a logical reasoner to infer a good choice of wine to offer the customer, and the decision would have to be made without optimal use of all the information available.

As Bayesian networks have grown in popularity, their shortcomings in expressiveness for many real-world applications have become increasingly apparent. More specifically, Bayesian networks assume a simple attribute-value representation: each problem instance involves reasoning about the same fixed number of attributes, with only the evidence values changing from problem instance to problem instance. In the pizza and wine example, the PizzaTopping random variable conveys general information about the class of pizza toppings (i.e., types of toppings for a given pizza and how they are related to preferences over wine flavor and color), but the BN in Figure 1 is valid only for pizzas with one topping. To deal with more elaborate pizzas it is necessary to build a specific BN for each configuration, each one with a distinct probability distribution. For example, Figure 2 depicts a BN for a 3-topping pizza with a specific customer preference displayed. Also, the information conveyed by the BNs (i.e., for 1 topping, 2 toppings, etc.) relates to the class of pizza toppings, and not to specific instances of that class. Therefore, the BN in Figure 2 cannot be used for a situation in which the customer asks for two 3-topping pizzas. Similarly, these BNs cannot be used to reason about a situation in which a customer orders several bottles of wine that may be of different varieties. This type of representation is inadequate for many problems of practical importance. Many domains require reasoning about varying numbers of related entities of different types, where the numbers, types and relationships among entities usually cannot be specified in advance and may have uncertainty in their own definitions. For these types of problem, a more expressive probabilistic language is needed.

Figure 2: BN for 3-Topping Pizza

Multi-Entity Bayesian Networks

In recent years, languages have appeared that extend the expressiveness of probabilistic graphical models in various ways. This trend reflects the need for probabilistic tools with more representational power to meet the demands of real world problems. Here, we consider one such language, called Multi-Entity Bayesian Networks, or MEBN. MEBN represents probabilistic knowledge as a collection of Bayesian network fragments called MFrags. As an illustration of the expressiveness of a first-order probabilistic logic, Figure 3a presents a graphical depiction of MFrags for the wine and pizza toy example. It conveys both the structural relationships (implied by the arcs) among the nodes and the numerical probabilities (represented by the local distributions, and not depicted in the figure). The MFrags contain three kinds of random variables.
  1. Context random variables, shown in yellow, represent conditions that must be satisfied for the probability distributions encoded by the MFrags to apply. For example, the Pizza Base MFrag relates the flavor and body of a wine w to the base of a pizza p. The context random variables specify that w must be a wine, p must be a pizza, and w must be served with p.
  2. Resident random variables represent random variables whose distributions are defined by the MFrag. The parents of a resident random variable are the random variables with arcs pointing into it. The MFrag assigns a probability distribution to the resident random variable for each combination of values for its parents.
  3. Input random variables, shown in gray, represent random variables which influence the random variables in the MFrag (i.e., have arcs emanating from them into resident random variables) but whose distributions are defined in some other MFrag.
Filling in the placeholders w, p, and t with specific entities of type Wine, Pizza and Topping results in a situation-specific Bayesian network, or SSBN. Figure 3b shows a SSBN for a three-topping pizza. Notice that this is the same Bayesian network as Figure 2, except for a renaming of the nodes. This example shows that an expressive language like MEBN allows us to represent a wide variety of specific situations involving different numbers of wines, pizzas, and toppings.
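
The instantiation step can be sketched as follows. This is a structural illustration only, under stated assumptions: the MFrag contents, the relation names (servedWith, hasTopping), the node names, and the direction of the arcs are hypothetical stand-ins for what Figure 3a depicts, and local distributions are omitted entirely.

```python
from itertools import product

# Hypothetical situation: one wine, one pizza, three toppings.
ENTITIES = {"Wine": ["w1"], "Pizza": ["p1"], "Topping": ["t1", "t2", "t3"]}
FACTS = {
    ("servedWith", "w1", "p1"),
    ("hasTopping", "p1", "t1"),
    ("hasTopping", "p1", "t2"),
    ("hasTopping", "p1", "t3"),
}

# Each (assumed) MFrag declares typed placeholders, context conditions that
# must hold, and arcs between node templates written as "Name(placeholder)".
MFRAGS = [
    {"name": "PizzaBase",
     "vars": {"w": "Wine", "p": "Pizza"},
     "context": [("servedWith", "w", "p")],
     "edges": [("Base(p)", "WineFlavor(w)"), ("Base(p)", "WineBody(w)")]},
    {"name": "PizzaTopping",
     "vars": {"w": "Wine", "p": "Pizza", "t": "Topping"},
     "context": [("servedWith", "w", "p"), ("hasTopping", "p", "t")],
     "edges": [("ToppingType(t)", "WineFlavor(w)"),
               ("ToppingType(t)", "WineColor(w)")]},
]


def ground(template, binding):
    """Replace the placeholders in 'Name(x, y)' with entity identifiers."""
    name, args = template[:-1].split("(")
    return f"{name}({', '.join(binding[a.strip()] for a in args.split(','))})"


def build_ssbn(mfrags, entities, facts):
    """Return the arcs of the situation-specific network as a set of pairs."""
    edges = set()
    for frag in mfrags:
        names = list(frag["vars"])
        pools = [entities[frag["vars"][n]] for n in names]
        for combo in product(*pools):  # every type-correct binding
            binding = dict(zip(names, combo))
            satisfied = all((rel, binding[a], binding[b]) in facts
                            for rel, a, b in frag["context"])
            if satisfied:  # the MFrag applies only when its context holds
                edges.update((ground(parent, binding), ground(child, binding))
                             for parent, child in frag["edges"])
    return edges


ssbn_edges = build_ssbn(MFRAGS, ENTITIES, FACTS)
for parent, child in sorted(ssbn_edges):
    print(f"{parent} -> {child}")
```

The same two templates yield a different network for a one-topping pizza or for two pizzas simply by changing ENTITIES and FACTS, which is the point of the paragraph above. A full MEBN implementation would also have to define local distributions that cope with a varying number of parents, which this sketch leaves out.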

a: MFrags for Wine and Pizza Example
b: SSBN for 3-Topping Pizza
Figure 3: MEBN Representation of Wine and Pizza Example

Of course, this example is oversimplified, and does not represent many of the relationships we would want to consider in a more realistic problem. However, it suffices to illustrate how ontologies can be combined with Bayesian networks to represent and reason with uncertainty.

MEBN Semantics

There are three kinds of random variables in MEBN: logical random variables, non-logical random variables, and finding random variables. Logical random variables correspond to predicates in first-order logic, and non-logical random variables correspond to functions in first-order logic. The logical random variables have possible values in the set {T, F, ⊥}, and the non-logical random variables take on values in the set Ω∪{⊥}, where T is the value assigned to a logical statement that is true; F is the value assigned to a logical statement that is false; Ω is a countable set of distinct entity identifiers; and ⊥ is a special value denoting a meaningless statement. There are special logical random variables corresponding to the usual logical connectives and quantified statements. Finding random variables are used to represent evidence about particular situations, such as the toppings a specific customer has ordered.
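
The value spaces described above can be illustrated with a small sketch. It shows only which values each kind of random variable may take, not their probability distributions. The entity identifiers, the type table, and the rule that a type-incorrect argument yields ⊥ are assumptions made for the example.

```python
# Possible values of logical random variables: T, F, and the special value ⊥.
T, F, ABSURD = "T", "F", "⊥"

# Ω: entity identifiers, here with hypothetical types attached.
TYPES = {"w1": "Wine", "p1": "Pizza", "p2": "Pizza"}
SERVED_WITH = {"w1": "p1"}  # hypothetical facts: wine -> pizza it accompanies


def served_with(w, p):
    """Logical random variable (a predicate): value in {T, F, ⊥}."""
    if TYPES.get(w) != "Wine" or TYPES.get(p) != "Pizza":
        return ABSURD  # assumed rule: ill-typed statement is meaningless
    return T if SERVED_WITH.get(w) == p else F


def pizza_of(w):
    """Non-logical random variable (a function): value in Ω ∪ {⊥}."""
    if TYPES.get(w) != "Wine":
        return ABSURD
    return SERVED_WITH.get(w, ABSURD)


print(served_with("w1", "p1"), served_with("w1", "p2"), served_with("p1", "w1"))
print(pizza_of("w1"), pizza_of("p1"))
```

In an actual MEBN theory these values are uncertain and governed by the local distributions of the MFrags; the deterministic lookups here merely stand in for one possible interpretation.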

A MEBN Theory consists of a set of MFrags satisfying a set of consistency constraints ensuring the existence of a joint probability distribution on the possible values of its random variables. A MEBN theory represents a probability distribution on interpretations of an associated first-order logic theory in the set Ω of entity identifiers. We can construct a probability distribution on interpretations of the theory in any domain Δ by mapping the entity identifiers to elements of Δ.

PR-OWL

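To represent an MFrag, we need to specify its context, input, and resident random variables, the arcs among them, and the local distribution of each resident random variable.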
PR-OWL is an upper ontology for building probabilistic ontologies based on MEBN logic. The MFrags depicted in Figure 3a form a consistent set that can be used to reason probabilistically about the wine-and-pizza domain. These MFrags can be stored in an OWL file using the classes and properties defined in the PR-OWL upper ontology. The MFrags can then be used to instantiate situation-specific Bayesian networks to answer queries about the domain being modeled. In other words, a PR-OWL probabilistic ontology consists of both deterministic and probabilistic information about the domain of discussion (e.g., wines and pizzas). This information is stored in an OWL file and can be used to answer specific queries for any configuration of the instances given the evidence at hand.
_______________________________________
©2007 Paulo C. G. Costa and Kathryn B. Laskey