As explained in the previous post, Structural Semantics is the branch of Linguistics which studies the meaning of functional expressions, such as logical connectors, quantifiers, referring terms, and auxiliary verbs, as well as how these meanings combine according to the syntactic structure of a sentence to obtain the structural meaning of the text. That’s the literal meaning of the text, abstracted from specific concepts of non-functional terms. It does not include lexical semantics, pragmatics, or world knowledge.
So why is knowledge of Structural Semantics essential for NLU applications? Because these applications require the computer to calculate the exact truth conditions of a text, namely the conditions that must be satisfied in the world for the text to be true of it. This can only be achieved if the computer precisely understands the meaning of functional terms and how they determine the logical combination of the sentence’s meaning parts. We’ll look at a few cases here. This post is an adaptation of sections 1.1 and 1.2 of my dissertation.
Exact NLU Applications
In some NLU tasks, complete accuracy is all-important. They require exact calculation of NL meanings and consequences, because results that are “almost correct” (“only slightly wrong”) are not good enough. Many of these tasks have exactly one correct answer that everyone agrees on. The NL input texts were originally written with a back-end unambiguous formalization in mind, which is intended to be fed into a rigorously-defined mathematical procedure for solving some type of problem. Thus, although in the general case of NLU, there may be no guarantee that the NL input can be precisely formalized or that the task has a precise result that all humans could agree upon, these things are in fact guaranteed by definition in the context of exact NLU tasks.
I will now give a few examples of such exact NLU applications: Word problems, such as logic puzzles, math questions, and science problems, as well as NL interfaces to databases and knowledge bases. More broadly, knowledge of structural semantics is useful also for related applications, such as understanding technical manuals, regulation and legal texts, question answering, automatic knowledge acquisition from texts, and more.
As a first example, consider the following Logic Puzzle text (only the first two questions are shown):
Six sculptures -- C, D, E, F, G, and H -- are to be exhibited in rooms 1, 2, and 3 of an art gallery.
* Sculptures C and E may not be exhibited in the same room.
* Sculptures D and G must be exhibited in the same room.
* If sculptures E and F are exhibited in the same room, no other sculpture may be exhibited in that room.
* At least one sculpture must be exhibited in each room, and no more than three sculptures may be exhibited in any room.
1. If sculpture G is exhibited in room 2 and sculpture E is exhibited in room 3, which of the following MUST be true?
(A) Sculpture C is exhibited in room 1.
(B) No more than two sculptures are exhibited in room 3.
(C) Sculptures F and H are exhibited in the same room.
(D) Three sculptures are exhibited in room 2.
(E) Sculpture H is exhibited in room 3.
2. If sculptures C and G are exhibited in room 1, which of the following may NOT be a complete list of the sculpture(s) exhibited in room 2?
(A) Sculpture D
(B) Sculpture E
(C) Sculpture F
(D) Sculptures E and H
(E) Sculptures F and H
Source: Karl Weber, "The Unofficial Guide to the GRE Test", 2000 edition, ARCO Publishing.
A computer system can correctly understand and solve such textual logic puzzles only if it uses a very precise and comprehensive knowledge of functional expressions and structural semantics, including: logical operators (“if”, “and”), quantifiers (“no more than three”), modals (“must”, “may”), referring expressions (“the same”, “other”), and so on.
Of course, for a human, understanding such a text is the easy part, while solving the puzzle is the hard part, because we humans cannot quickly entertain many possibilities and constraints in our heads (that’s why such puzzles appear in aptitude tests such as GRE and LSAT). For a computer, it’s the reverse. Once the puzzle is formalized, solving it automatically is the easy part, as it can be done using very simple procedures (it’s a small constraint-satisfaction problem). The hard part is explaining to the computer how to understand the NL text, namely how to translate it to a precise logical formalization. For example, the third constraint in the puzzle above should be formalized using something like this first-order logic (FOL) formula:
∀x. [(Room(x) ∧ Exhibited_in(e,x) ∧ Exhibited_in(f,x)) → ¬∃y. [Sculpture(y) ∧ y≠e ∧ y≠f ∧ Exhibited_in(y,x)]]
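To illustrate why the solving step is the easy part once a formalization exists, here is a minimal brute-force sketch. It is not the system described in this post: the four constraints are hand-translated into Python (the names `satisfies_rules`, `worlds`, and `count_in` are made up for this example), and the modal phrasing of the questions is read as "true in every legal arrangement" for MUST and "true in no legal arrangement" for may NOT.

```python
from itertools import product

SCULPTURES = "CDEFGH"
ROOMS = (1, 2, 3)


def satisfies_rules(room_of):
    """Check the four puzzle constraints for one assignment (sculpture -> room)."""
    occupants = {r: {s for s in SCULPTURES if room_of[s] == r} for r in ROOMS}
    if room_of["C"] == room_of["E"]:          # C and E may not share a room
        return False
    if room_of["D"] != room_of["G"]:          # D and G must share a room
        return False
    # If E and F share a room, no other sculpture may be in that room.
    if room_of["E"] == room_of["F"] and occupants[room_of["E"]] != {"E", "F"}:
        return False
    # At least one and no more than three sculptures in each room.
    return all(1 <= len(occupants[r]) <= 3 for r in ROOMS)


def worlds(**fixed):
    """Yield every legal assignment that agrees with a question's premises."""
    for rooms in product(ROOMS, repeat=len(SCULPTURES)):
        room_of = dict(zip(SCULPTURES, rooms))
        if all(room_of[s] == r for s, r in fixed.items()) and satisfies_rules(room_of):
            yield room_of


def count_in(room_of, room):
    return sum(1 for r in room_of.values() if r == room)


# Question 1: which option holds in EVERY legal arrangement with G in 2 and E in 3?
q1_options = {
    "A": lambda w: w["C"] == 1,
    "B": lambda w: count_in(w, 3) <= 2,
    "C": lambda w: w["F"] == w["H"],
    "D": lambda w: count_in(w, 2) == 3,
    "E": lambda w: w["H"] == 3,
}
q1_worlds = list(worlds(G=2, E=3))
q1_answer = [k for k, holds in q1_options.items() if all(holds(w) for w in q1_worlds)]

# Question 2: which list is the content of room 2 in NO legal arrangement with C, G in 1?
q2_options = {"A": {"D"}, "B": {"E"}, "C": {"F"}, "D": {"E", "H"}, "E": {"F", "H"}}
possible_room2 = {
    frozenset(s for s in SCULPTURES if w[s] == 2) for w in worlds(C=1, G=1)
}
q2_answer = [k for k, listed in q2_options.items() if frozenset(listed) not in possible_room2]

print(q1_answer, q2_answer)  # prints ['B'] ['A']
```

With only 3^6 = 729 candidate arrangements, exhaustive enumeration is enough; all of the difficulty sits in the hand translation from English that this sketch takes for granted.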
The nice thing about this NLU task is that, unlike general NL texts, logic puzzle texts rely on very little lexical knowledge and world knowledge. Almost all the required knowledge is morphology, syntax, and structural semantics, and most of the required information for answering the questions is explicitly given in the text. Also, logic puzzles require a very accurate understanding and merging of all the logical constraints expressed throughout the text, while even a small misunderstanding of one sentence usually leads to completely wrong answers. Therefore, this task is an excellent vehicle for developing the body of structural semantics knowledge. This is the reason why I am working on a project whose aim is to develop a system that can automatically solve textual logic puzzles. For more on that, see “Automatic Understanding of Natural Language Logic Puzzles” as well as “Logic Puzzles: A New Test-Suite for Compositional Semantics and Reasoning”.
STEM Textual Problems
There are other types of STEM textual problems that can be understood and solved by a computer system only if it uses accurate structural semantics. One example is math problems such as these:
1. Ginger wanted to see how much she spent on lunch daily, over the course of an average work-week. On Monday and Thursday, she spent $5.43 total. On Tuesday and Wednesday, she spent $3.54 on each day. On Friday, she spent $7.89 on lunch. What was her average daily cost?
(A) $3.19 (B) $3.75 ...
2. During a 5-day festival, the number of visitors tripled each day. If the festival opened on a Thursday with 345 visitors, what was the number of visitors on that Sunday?
(A) 345 (B) 1,035 ...
Adapted from: http://www.testprepreview.com
Although solving such questions requires some domain knowledge of math, this knowledge is simple enough that it does not pose a large AI problem.
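As a small worked illustration, the arithmetic behind the two problems can be written out as below. The variable names are made up for this sketch. The reading assumed here is that "total" gives one amount covering Monday and Thursday together, that "on each day" gives an amount that applies to Tuesday and Wednesday separately, and that "tripled each day" means each day's count is three times the previous day's, with Thursday as the opening day.

```python
from decimal import Decimal

# Problem 1: the functional words "total" and "each" decide how the amounts combine.
monday_and_thursday = Decimal("5.43")         # "total": one amount for both days
tuesday_and_wednesday = 2 * Decimal("3.54")   # "on each day": the amount counts twice
friday = Decimal("7.89")

weekly_total = monday_and_thursday + tuesday_and_wednesday + friday
average_daily_cost = weekly_total / 5         # five work days

# Problem 2: "tripled each day", starting from the opening day.
festival_days = ["Thursday", "Friday", "Saturday", "Sunday", "Monday"]
opening_visitors = 345
days_after_opening = festival_days.index("Sunday")   # 3 triplings after Thursday
sunday_visitors = opening_visitors * 3 ** days_after_opening

print(average_daily_cost)  # 4.08
print(sunday_visitors)     # 9315
```

Misreading "total" as "each" in problem 1 would change the weekly sum from 20.40 to 25.83, which is exactly the kind of small structural misunderstanding that produces a confidently wrong answer.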
More complex textual problems come from other areas of science, such as physics, chemistry, and biology. These of course require not only an understanding of the language, but also precisely formalized knowledge of the relevant scientific domain. See for example Project HALO, which was able to get good scores on AP science tests. In that project, the original NL texts (both the domain knowledge and the questions) were formalized through a mostly manual process. The project’s focus was the knowledge representation and reasoning aspect of the system, but a natural next step would be to develop the computational ability to understand the NL texts directly and to translate them automatically into the knowledge representations.
NL Interfaces to Databases and Knowledge Bases
Instead of querying a database using a formal language like SQL, it would be very useful for users if the computer could understand NL questions and commands such as:
1. Which department has the largest number of employees?
2. How many employees were paid a salary higher than $70,000 over the last two years?
3. Did at least two supervisors attend every repair session?
4. Update: John Smith works in the personnel department.
Such queries and commands require understanding comparative and superlative constructions (“higher than”, “largest”). They also require awareness of ambiguities about which operator takes scope over another (this is called “scope ambiguity” in Structural Semantics). For example, does question 3 above ask whether there are at least two supervisors such that each of them attended every repair session? Or does it ask whether, for each repair session, at least two supervisors attended that session (not necessarily the same pair of supervisors for all sessions)? Both readings should be considered by the NL interface, and if the ambiguity cannot be resolved due to lack of data or knowledge, the interface should ask the user a clarifying question rather than guess what she meant (just as a compiler could flag an unresolvable ambiguity of a polymorphic operator in a particular context of use).
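The two readings of question 3 can be made concrete with a short sketch over a toy attendance table. The data, the names, and the function names are all invented for illustration; a real interface would generate two database queries rather than Python functions.

```python
# Hypothetical data: who attended which repair session.
supervisors = {"ann", "bob", "cat"}
attendees = {
    "session1": {"ann", "bob", "dave"},
    "session2": {"bob", "cat"},
    "session3": {"ann", "cat"},
}


def at_least_two_scopes_over_every(attendees, supervisors):
    """Reading 1: there are at least two supervisors who each attended every session."""
    always_present = [
        s for s in supervisors
        if all(s in people for people in attendees.values())
    ]
    return len(always_present) >= 2


def every_scopes_over_at_least_two(attendees, supervisors):
    """Reading 2: every session was attended by at least two supervisors."""
    return all(len(people & supervisors) >= 2 for people in attendees.values())


readings = {
    "at least two > every": at_least_two_scopes_over_every(attendees, supervisors),
    "every > at least two": every_scopes_over_at_least_two(attendees, supervisors),
}
# If the readings disagree on the current data, the interface should ask the user.
needs_clarification = len(set(readings.values())) > 1

print(readings)             # {'at least two > every': False, 'every > at least two': True}
print(needs_clarification)  # True
```

On this data no supervisor attended all three sessions, yet every session had two supervisors present, so the two readings give opposite answers and guessing would be wrong half the time.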
A similar but more complex application is NL interfaces to Knowledge Bases (KBs). There are many KBs that were designed to model a particular domain such as a medical domain or intelligence analysis. Existing interfaces to such KBs usually involve rigid forms, which can only address simple queries, or complex formal languages, which require expertise on the part of the user. It would be helpful to connect the formalized knowledge with an NL front-end. This is more complex than NL interfaces to databases (NLIDBs) because chains of inference may be involved.
Technical Texts and Manuals
Technical texts and manuals that contain specifications for the operation of technical systems need to be written clearly and unambiguously so that they can be understood precisely by human readers. Therefore, they are much less vague than general NL texts, and automatically understanding them is a good example of an exact NLU application. Structural Semantics plays an important role here.
In fact, some places have taken an approach where instead of trying to write accurately and unambiguously using general NL, the authors of the texts first defined a restricted subset of the natural language which is simpler and much less ambiguous or vague compared to general NL. Such a restricted subset is called a Controlled Natural Language.
A notable example is ASD Simplified Technical English. This controlled language was developed to facilitate writing technical manuals for the aerospace and defense industries in such a way that they would be unambiguous. This helps foreign speakers of English to better understand the technical texts, and the precise definition can support tools for verifying that the texts are written in the controlled language, such as Boeing’s Simplified English Checker.
When taking this approach to the extreme, the controlled language can be delineated so precisely that it effectively becomes a formal language which nevertheless looks (almost) like NL. The motivation is that such a language maintains the complete precision and predictability of a formal language while making it easier and more natural for humans to learn and read it (compared to a formal language). Here is an example text written in Attempto Controlled English (ACE), specifying the operation of an ATM:
Every customer has at least 2 cards and their associated codes. If a customer C approaches an automatic teller and she inserts her own card that is valid carefully into the slot and types the correct code of the card then the automatic teller accepts the card and displays a message "Card accepted" and C is happy. No card that does not have a correct code is accepted. It is not the case that a customer’s card is valid, and is expired or is cancelled.
Source: http://attempto.ifi.uzh.ch/site/talks/
This text is quite complex and sounds almost completely natural. The ACE engine can automatically translate such texts to First-Order Logic (FOL) formulas that faithfully capture the meaning (truth conditions) of the texts. How is that possible? First, the author of such a text has a precise FOL formalization in mind. Second, the allowed NL input is restricted to avoid imprecision. Potential ambiguities are handled by a defined rule: When the computer encounters a NL construction which can be ambiguous in general, a specific one of its possible interpretations is always selected, and the definition of the controlled language declares in advance which one it is. For example, in a sentence such as “I saw the girl with the telescope”, where there is a syntactic ambiguity (does the girl hold the telescope, or did I use the telescope to see her?), the language specification dictates that the attachment is done to the preceding noun phrase (“the girl”) rather than the verb (“saw”). A user who wants to convey the alternative choice would need to use a paraphrase with a different syntactic structure, or to use some artificial means such as a comma. This practice makes the NL input at times a little less than completely natural, but it is an effective compromise.
Regulations and Legal Texts
Another similar real-world application is understanding regulatory texts, such as a website that describes which collection of courses a college student must take in order to fulfill the requirements of a study program. Here is an example:
A candidate is required to complete a program of 45 units. At least 36 of these must be graded units, passed with an average 3.0 (B) grade point average (GPA) or higher. The 45 units may include no more than 21 units of courses from those listed below in Requirements 1 and 2.
Source: http://cs.stanford.edu/Degrees/mscs/ [Jan 2007]
As with logic puzzles, regulatory texts describe general rules and conditions that must be met. The computer should be able to answer questions about these rules as well as questions based on an additional description of a particular real or hypothetical situation (e.g. check whether the set of courses a student has taken conforms to the regulations). Also similarly to logic puzzles, answers to such questions rarely if ever appear explicitly in the text, and must be inferred from it.
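To give a feel for the "check whether a set of courses conforms" task, here is a minimal sketch of the quoted rules as a Python predicate. It is hand-written, not derived automatically from the text, and it rests on several assumptions that the text itself leaves open: the `Course` record and its fields are hypothetical, the submitted list is taken to be the whole program, "a program of 45 units" is read as "at least 45", and the GPA is taken to be the unit-weighted average over the graded courses.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Course:
    units: int
    grade_points: Optional[float]   # None means the course was not taken for a grade
    in_requirement_1_or_2: bool     # listed under Requirements 1 and 2


def meets_unit_rules(courses: List[Course]) -> bool:
    """Check the three quoted sentences under the assumptions stated above."""
    total_units = sum(c.units for c in courses)
    graded = [c for c in courses if c.grade_points is not None]
    graded_units = sum(c.units for c in graded)
    listed_units = sum(c.units for c in courses if c.in_requirement_1_or_2)
    if graded_units == 0:
        return False
    gpa = sum(c.units * c.grade_points for c in graded) / graded_units
    return (
        total_units >= 45          # "a program of 45 units"
        and graded_units >= 36     # "at least 36 ... graded units"
        and gpa >= 3.0             # "3.0 (B) ... or higher"
        and listed_units <= 21     # "no more than 21 units"
    )


plan = (
    [Course(3, 3.0, True)] * 7       # 21 graded units from Requirements 1 and 2
    + [Course(3, 4.0, False)] * 5    # 15 graded units from other courses
    + [Course(3, None, False)] * 3   # 9 ungraded units
)
# Swap one ungraded course for one more course from Requirements 1 and 2.
overloaded_plan = plan[:-1] + [Course(3, 3.0, True)]

print(meets_unit_rules(plan))             # True
print(meets_unit_rules(overloaded_plan))  # False: 24 listed units exceed the cap of 21
```

Each assumption listed above corresponds to a place where the English could be formalized differently, which is why the translation step, not the checking step, is where the structural semantics matters.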
While many legal texts cannot be formalized precisely because they are written to be deliberately vague, understanding their literal meaning is an important first step towards automatically formalizing them so that computers can reason about their content. This is important in the field of Computational Law.
Question Answering
Most existing Question Answering systems focus on answering quite simple who-did-what-to-whom questions whose answers appear explicitly in the text. But in order to answer questions that are more linguistically complex, or that require combining various pieces of information that appear throughout the text and employing inference on them, precise semantic representations based on structural semantics are required. Consider this question:
The third president of the United States who got re-elected - was he a Democrat or a Republican?
It is unlikely that the answer appears explicitly in some text as “The third president of the United States who got re-elected was a Democrat” or some variation of it. Rather, the computer must understand what third means and how it combines with the meaning of the relative clause “who got re-elected” to constrain the choice of president, as well as understand the meaning of the connective “or”. It also has to calculate the answer by combining various pieces of information conveyed throughout different texts.
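The way “third” combines with the relative clause can be sketched over a tiny hand-entered table. The table below covers only the first five presidents and records only whether each won a second election; the helper `nth` and the variable names are invented for this illustration.

```python
# (name, was re-elected), in order of first taking office. Hand-entered sample data.
PRESIDENTS = [
    ("George Washington", True),
    ("John Adams", False),
    ("Thomas Jefferson", True),
    ("James Madison", True),
    ("James Monroe", True),
]


def nth(items, n):
    """Return the n-th item, counting from 1 as ordinals do."""
    return items[n - 1]


# Restrictive reading, as in the question: the relative clause first narrows the
# set of presidents, and the ordinal then picks from the narrowed set.
re_elected = [name for name, was_re_elected in PRESIDENTS if was_re_elected]
third_re_elected_president = nth(re_elected, 3)

# Contrast: "the third president, who got re-elected" applies the ordinal first
# and treats the relative clause as a side comment about that person.
third_president = nth(PRESIDENTS, 3)[0]

print(third_re_elected_president)  # James Madison
print(third_president)             # Thomas Jefferson
```

The two compositions pick out different people, so a system that ignores how the ordinal and the relative clause combine would look up the party of the wrong president.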
Recognizing Textual Entailment
Another task which is related to Question Answering, and which has additional practical applications such as Information Extraction and Summarization, is called Recognizing Textual Entailment (RTE). Here, instead of a text and a question about it, the computer needs to determine whether a “hypothesis” text follows from (is entailed by) another given “background” text. Entailment here is more relaxed than pure logical inference: “B entails H” if, typically, a human reading B would infer that H is most likely true. Knowledge of structural semantics is required for many RTE-like questions, as the following examples demonstrate.
In pair 1 below, the computer needs to know the meaning of “more than” (H would not follow from B if “80 kilometers” were replaced with “120 kilometers”). In pair 2, it needs to know how to instantiate a statement quantified by “each” (here, with the specific year 2005), as well as to know about implications between numeric quantifiers. Answering 3 correctly requires understanding hypothetical conditionals and modality.
1. B: In any case, the fact that this week Michael Melvill, a 63-year-old civilian pilot, guided a tiny rocket-ship more than 100 kilometers above the Earth and then glided it safely back to Earth, is a cause for celebration.
   H: A tiny rocket-ship was guided more than 80 kilometers above the Earth.
2. B: Each year, 26,000 people are killed or mutilated by landmines of which 8,000 are children.
   H: In 2005, at least two thousand children were injured by landmines.
3. B: Things would be different if Microsoft was located in Georgia.
   H: Microsoft's corporate headquarters are not located in Georgia.
Adapted from the first RTE test-suite.
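The numeric part of pairs 1 and 2 comes down to implications between lower-bound quantifiers, which can be sketched as follows. The `LowerBound` class and the `entails` function are made up for this illustration, and the sketch covers only the comparison of the bounds; it says nothing about the rest of each sentence (for instance, whether "killed or mutilated" supports "injured").

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LowerBound:
    """Meaning of "more than N" (strict=True) or "at least N" (strict=False)."""
    value: float
    strict: bool


def entails(background: LowerBound, hypothesis: LowerBound) -> bool:
    """True if every quantity satisfying `background` also satisfies `hypothesis`."""
    if background.value > hypothesis.value:
        return True
    if background.value == hypothesis.value:
        # "more than N" entails "at least N", but not the other way around.
        return background.strict or not hypothesis.strict
    return False


more_than_100 = LowerBound(100, strict=True)

# Pair 1: "more than 100 kilometers" supports "more than 80 kilometers" ...
pair1 = entails(more_than_100, LowerBound(80, strict=True))
# ... but would not support "more than 120 kilometers".
pair1_with_120 = entails(more_than_100, LowerBound(120, strict=True))
# Pair 2: a count of 8,000 satisfies "at least 8,000", which supports "at least 2,000".
pair2 = entails(LowerBound(8000, strict=False), LowerBound(2000, strict=False))

print(pair1, pair1_with_120, pair2)  # True False True
```

The direction of the implication flips for upper-bound quantifiers such as "fewer than" or "no more than", which is one reason the meaning of each quantifier has to be represented explicitly rather than matched as a surface pattern.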
Automatic Knowledge Acquisition from Texts
The more structural semantic knowledge a computer has, the better able it is to automatically acquire knowledge from crawling texts on the internet. Possessing only knowledge of syntax allows only rudimentary acquisition of simple patterns of facts that appear explicitly in the text. Having more sophisticated semantic representations and inference could allow the computer to combine separate pieces of information that appear throughout the texts. This is precisely the utility of semantic representations, that they capture the content of a text, and make it possible to relate the pieces of information to each other, merge them, compute entailments between them, etc. These are not possible if one relies only on the form, i.e. syntax, of the texts.
In this post, I surveyed various kinds of NLU applications that require knowledge of Structural Semantics, or could at least greatly benefit from it. In particular, I mentioned several applications of Exact NLU, where not compromising precision for breadth of coverage is essential for system reliability and usability. Here is a relevant quote from Popescu et al., 2003:
[NL interfaces to databases] are only usable if they map NL questions to SQL queries correctly. … People are unwilling to trade reliable and predictable user interfaces for intelligent but unreliable ones. …To satisfy users, NLIs can only misinterpret their questions very rarely if at all. Imagine a mouse that appropriately responds to a ‘click’ most of the time, but periodically whisks the user to an apparently random location. We posit that users would have an even worse reaction if they were told that a restaurant was open on a Sunday, but it turned out to be closed. If the NLI does not understand a user, it can indicate so and attempt to engage in a clarification dialog, but to actively misunderstand the user, form an inappropriate SQL query, and provide the user with an incorrect answer, would erode the user’s trust and render the NLI unusable.
Exact NLU tasks are a special subset of the space of all NLU applications, but if we can figure out how to do them, we will have learned valuable lessons and developed useful tools that could help us improve the accuracy of other NLP applications as well, in which less-than-perfect levels of accuracy are acceptable.
To sum up, structural semantic knowledge is largely domain-independent, and so it possesses a level of generality higher than other kinds of knowledge. Thus, once it is developed, it could help improve the accuracy of many NLU applications. It could be used for different purposes with little customization (the main customization might be adding frequencies of various phenomena, which may differ across domains). Of all the kinds of semantic knowledge needed in a sophisticated NL understanding system (including lexical knowledge and world-knowledge), the body of structural semantic knowledge has the smallest size. Therefore, we have a good chance to capture all or almost all of it through a concentrated collaborative effort in a reasonable amount of time. For such a research project, see “Automatic Understanding of Natural Language Logic Puzzles”.