You've got to admit it: graphs have an allure that terms do not. From silly graphs to matters-of-life-and-death graphs, people simply love the stuff. Terms, even lambda-calculus ones, do not have the same appeal. So it makes some sense to see whether we can capitalize on graphs' affordances for Natural Language semantics of the style we like.
This is a guest blog post by Dick Crouch. Nah, I lie. It's the mathematics of his stuff that I am trying to understand. There is also a collection of slides in the archive of the Delph-in meeting at Stanford, summer 2016 (Standardizing Interface Representations for Downstream Tasks like Entailment or Reasoning), but the notes are older, from May 2015.

These are notes towards a proposal for a graphical semantic representation for natural language. Its main feature is that it layers a number of sub-graphs over a basic sub-graph representing the predicate-argument structure of the sentence. These sub-graphs include:

- A context / scope sub-graph. This represents the structure of propositional contexts (approximately possible worlds) against which predicates and arguments are to be interpreted. This layer is used to handle boolean connectives like negation and disjunction, propositional attitude and other clausal contexts (belief, knowledge, imperatives, questions, conditionals), and quantifier scope (under development). The predicate-argument and context graphs go hand in hand, and one cannot properly interpret a predicate-argument graph without its associated context graph.
- A property sub-graph. This associates terms in the predicate-argument graph with lexical, morphological, and syntactic features (e.g. cardinality, tense and aspect morphology, specifiers).
- A lexical sub-graph. This associates terms in the predicate-argument graph with lexical entries. There can be more than one lexical sub-graph for each word, and it is populated with the concepts and semantic information obtainable from a knowledge base such as Princeton WordNet.
- A link sub-graph. This contains co-reference and discourse links between terms in the pred-arg graph. (It has also been used in entailment and contradiction detection to record term matches between premise and conclusion graphs.)
- Other sub-graphs are possible. A separate temporal sub-graph for spelling out the semantics of tense and aspect is under consideration. One way of laying these layers out as a data structure is sketched just below.
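To make the layering concrete, here is a minimal sketch of how such a structure could be laid out in code. This is my own illustration, not the actual implementation: the class name, the layer names, and the edge labels are all assumptions on my part.

```python
from collections import defaultdict

class LayeredSemanticGraph:
    """One shared node set, with edges partitioned into the sub-graph
    layers described above (hypothetical names, for illustration only)."""

    LAYERS = ("pred_arg", "context", "property", "lexical", "link")

    def __init__(self):
        self.nodes = {}                  # node id -> feature dictionary
        self.edges = defaultdict(list)   # layer -> [(source, relation, target)]

    def add_node(self, node_id, **features):
        self.nodes[node_id] = features

    def add_edge(self, layer, source, relation, target):
        if layer not in self.LAYERS:
            raise ValueError(f"unknown layer: {layer}")
        self.edges[layer].append((source, relation, target))

    def layer(self, layer):
        """All edges belonging to a single sub-graph layer."""
        return list(self.edges[layer])


# Tiny usage example: two predicate-argument nodes, one context node,
# and one edge in each of three layers.
g = LayeredSemanticGraph()
g.add_node("sleep_3", pos="VB", stem="sleep")
g.add_node("John_0", pos="NNP", stem="John")
g.add_node("t")                                         # top-level (true) context
g.add_edge("pred_arg", "sleep_3", "subject", "John_0")
g.add_edge("context", "t", "ctx_hd", "sleep_3")
g.add_edge("property", "sleep_3", "tense", "past")
print(g.layer("context"))
```

The point is only that the predicate-argument nodes are shared, while each layer keeps its own edges over them.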
This proposal has been partially implemented, and appears to have some practical utility. But theoretically it has not been fully fleshed out. These notes do not perform this fleshing-out task; they just aim to describe some of the motivations and issues.

To give an initial idea of what these graphs are like, here are some examples showing the basic predicate-argument and context structures for some simple sentences. The predicate-argument nodes are shown in blue, and the contexts in grey.

1. John did not sleep.

This sentence produces the graph above. All sentences are initially embedded under the true context (t) -- on the top right. However, the negation induces a new context embedded under t. In this negated context, an instance of the concept "sleeping by John" can be instantiated. But the effect of the "not" link between t and the embedded context means that this concept is held to be uninstantiable in t.
Every context will have a context-head (ctx_hd) link to a node in the predicate-argument graph. The node in the predicate-argument graph represents a lexical concept (possibly further restricted by syntactic arguments). The context-head concept is always held to be instantiable in its corresponding context. But whether it continues to be instantiable in sub- or super-ordinate contexts depends on the kind of link between the contexts.
Not explicitly shown in this graph, but present in the actual graphs, are further non-head links from each predicate-argument term to its introducing context.
If you want to relate this to Discourse Representation Structures, you can see the context labels as being the names of DRS boxes.
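As a toy illustration of the instantiability story for this first example, here is a small sketch of how the status of a context-head concept might be projected up to t, depending on the link between contexts. The link names and the propagation rules are my own guesses for illustration; the actual system may well do this differently.

```python
# Context graph for "John did not sleep.": t --not--> ctx_neg,
# and ctx_neg has the head concept "sleeping by John".
context_links = {"ctx_neg": ("t", "not")}          # child -> (parent, link type)
context_heads = {"ctx_neg": "sleep(John)"}

def instantiability_in_t(ctx):
    """Project the instantiability of ctx's head concept up to t."""
    status = "instantiable"          # a head is instantiable in its own context
    while ctx != "t":
        parent, link = context_links[ctx]
        if link == "not":            # a "not" link flips instantiability
            status = ("uninstantiable" if status == "instantiable"
                      else "instantiable")
        else:                        # e.g. attitude contexts: no projection
            return "unknown"
        ctx = parent
    return status

print(context_heads["ctx_neg"], "is", instantiability_in_t("ctx_neg"), "in t")
# -> sleep(John) is uninstantiable in t
```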
2. John believes that Mary does not like him.

This is a slightly more complex example, where we can see that the word "believe" introduces an additional context for the complement clause "Mary does not like him". In the t context, there is a believing by John of something. What that something is is spelled out in the clausal context (ctx_5x1), which is a negation of the clausal context "Mary likes him". The example also shows a co-reference link between the subject of believe and the object of like.
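Spelled out as edge triples, the graph just described might look roughly like this. The edge labels (apart from ctx_hd and the not link) are my own invention, purely to make the prose concrete:

```python
# Rough reconstruction of the layers for "John believes that Mary does not
# like him." -- illustrative only; the real graphs use their own labels.
edges = [
    # context sub-graph
    ("t",         "ctx_hd",  "believe_1"),
    ("believe_1", "comp",    "ctx_5x1"),    # the believed content lives here
    ("ctx_5x1",   "not",     "ctx_like"),   # ctx_5x1 negates "Mary likes him"
    ("ctx_like",  "ctx_hd",  "like_5"),
    # predicate-argument sub-graph
    ("believe_1", "subject", "John_0"),
    ("like_5",    "subject", "Mary_3"),
    ("like_5",    "object",  "him_6"),
    # link sub-graph: "him" co-refers with the subject of believe
    ("him_6",     "coref",   "John_0"),
]
for source, relation, target in edges:
    print(f"{source} --{relation}--> {target}")
```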
3. John or Mary slept.
This illustrates the treatment of disjunction. Like negation,
disjunction is viewed as a context introducer (i.e. natural language
disjunction is inherently modal / intensional, unlike disjunction in
classical propositional or first-order logic). The way to read the graph
is that there is some group object that is the subject of sleep. Both
the group object and the sleeping by the group object are asserted to be
instantiable in the top level context. The group object is further
restricted by its membership properties: in one context John is an
element of the object, and in another Mary is an element of the group
object.
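The same caveat applies here: what follows is only my sketch, with invented names, of how the group object and its per-context membership could be written down, but it does show how each membership context yields one disjunct.

```python
# "John or Mary slept.": the group object and the sleeping by the group are
# instantiable in t; the group's membership is spelled out per context.
asserted_in_t = ["group_1", "sleep(group_1)"]
members_by_context = {
    "ctx_or_a": ["John_0"],   # in one context John is an element of the group
    "ctx_or_b": ["Mary_2"],   # in the other, Mary is
}

print("instantiable in t:", asserted_in_t)

# One way to read the graph: each membership context contributes a disjunct.
disjuncts = [f"sleep({member})"
             for members in members_by_context.values()
             for member in members]
print(" OR ".join(disjuncts))   # -> sleep(John_0) OR sleep(Mary_2)
```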
4. John loves Mary.
Ok, I bet this one caught you by surprise!
Just for the hell of it, this time here is a fuller graph for a simpler sentence, showing the other lexical and property sub-graphs. The "lex" arcs point to possible word senses for the predicate-argument terms. Not shown in the diagram is that the labels on the sense nodes encode information about the taxonomic concepts associated with the word senses. Likewise not illustrated in any of these graphs is the fact that the predicate-argument node labels encode things like part of speech, stem and surface form, position in the sentence, etc.
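For the lexical sub-graph, one obvious way to get the "lex" arcs and the taxonomic information is to query Princeton WordNet. The sketch below uses NLTK's WordNet interface purely as an illustration; nothing in these notes says this is how the actual system populates its lexical layer.

```python
import nltk
from nltk.corpus import wordnet as wn

nltk.download("wordnet", quiet=True)   # fetch the WordNet data on first run

def lexical_arcs(term_id, stem, pos):
    """Return "lex" arcs from a predicate-argument term to its candidate
    senses, plus "isa" arcs carrying the taxonomic (hypernym) information."""
    arcs = []
    for synset in wn.synsets(stem, pos=pos):
        arcs.append((term_id, "lex", synset.name()))
        for hypernym in synset.hypernyms():
            arcs.append((synset.name(), "isa", hypernym.name()))
    return arcs

for arc in lexical_arcs("love_1", "love", wn.VERB):
    print(arc)
```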
The way these graphs are obtained is completely separable from, and less important than, an abstract definition of semantic graph structures that allows one to specify how to process the semantics in various ways (e.g. direct inference on graphs, conversion of graphs to representations suitable for theorem proving, etc.).
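As one concrete instance of that second kind of processing, here is a minimal sketch of turning a context graph into a formula a theorem prover could consume. The encoding of the graph and the fact that only the "not" link is handled are simplifications of mine; a real conversion would have to cover attitudes, disjunction, quantifier scope and the rest.

```python
# Context graph for "John did not sleep." in a simplified encoding:
# each context has an optional head concept and typed links to child contexts.
context_children = {"t": [("not", "ctx_neg")], "ctx_neg": []}
context_heads = {"t": None, "ctx_neg": "sleep(John)"}

def to_formula(ctx):
    """Recursively flatten a context into a prover-friendly formula string."""
    parts = []
    if context_heads[ctx]:
        parts.append(context_heads[ctx])
    for link, child in context_children[ctx]:
        sub = to_formula(child)
        parts.append(f"-({sub})" if link == "not" else sub)
    return " & ".join(parts) if parts else "true"

print(to_formula("t"))   # -> -(sleep(John))
```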
Maybe you think that the use of transfer semantics as above seems like overkill, at least for the purpose of providing inputs for natural language inference. The transfer semantics pipeline was originally set up to ease the conversion of linguistic semantic representations into more canonical knowledge representations. As such, there is considerable emphasis on normalizing different semantic representations so that, wherever possible, the same content is represented in the same way: this simplifies the conversion to KR.
But maybe there is no particular reason to do all this normalization on the inputs if all you wanted to do was inference. It might be better to figure out a lighter-weight process for adding extra layers of semantic information directly to the dependency structure produced by the parser, much like many others are doing nowadays.
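In the spirit of that lighter-weight alternative, here is a sketch of what "adding a layer directly to the parser output" might mean: take dependency triples (hard-coded below, in roughly Stanford-dependency style) and overlay a context edge for negation without any further normalization. This is my own toy, not a description of anyone's pipeline.

```python
# Dependency triples for "John did not sleep." as a parser might emit them.
dependencies = [
    ("sleep", "nsubj", "John"),
    ("sleep", "aux", "did"),
    ("sleep", "neg", "not"),
]

# Overlay a context layer directly on top of the dependencies.
contexts = {"t": []}
for head, relation, dependent in dependencies:
    if relation == "neg":                    # negation introduces a new context
        ctx = f"ctx_not_{head}"
        contexts[ctx] = [("ctx_hd", head)]
        contexts["t"].append(("not", ctx))

print(contexts)
# -> {'t': [('not', 'ctx_not_sleep')], 'ctx_not_sleep': [('ctx_hd', 'sleep')]}
```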
But which kinds of representations make sense for inference is indeed something worth thinking hard about.