Tuesday, November 29, 2016

Entailment and Contradiction Detection by any other name

More Dick Crouch on semantics and names, joyously mangled by me.

The Entailment and Contradiction Detector (ECD) -- see the old patent http://www.google.com/patents/EP1852811A2?cl=en, implements a version of what has come to be known as "Natural Logic" (see http://nlp.stanford.edu/~wcmac/papers/natlog-wtep07.pdf). 

This performs inference by determining specificity relations between words, phrases, and sentences. For example, "dog" is more specific than "mammal"; "_ saw a dog" is consequently more specific than "_ saw a mammal"; and so "John saw a dog" is more specific than "John saw a mammal". 

In the normal course of events, more specific versions of sentences imply more general versions. So "John saw a dog" entails that "John saw a mammal", but not vice versa. However, the construction of sentences can affect specificity relations over and above the contribution made by individual words. For example, "dog" is a more specific term than "mammal", but "John likes mammals" is less  specific claim than "John likes dogs" (the second carves out a more restricted set of possibilities).  Similarly, "No person saw a dog" is more specific and entails "No man saw a dog", and is contradicted by "A man saw a dog."

Most approaches to using natural logic for textual inference either rely on very close alignments of semantic representations, or on heuristics operating on bags of words and word order. We pursue an intermediate path where we primarily match words at the individual level, but update those matches according to the syntactic/semantic structures in which those words occur.

In the previous post we saw that we can construct  graph objects as semantic representations.  Semantic graphs comprise four sub-graphs:

Role Graph: This represents the basic predicate argument structure of the sentence. Nodes in the graph correspond to skolem and context terms in the (non-graphical) semantic representation, and the arcs between them are semantic roles relations.
Within role graphs, two distinct kinds of node are distinguished: skolem nodes and context nodes. Skolem nodes are terms that are for the most part directly introduced by lexical items (nouns, verbs, adjectives, adverbs). Context nodes can sometimes additionally be introduced by words
and/or constructions. For example, the verb "believe" introduces both a skolem for the act of believing, and a hypothetical context in which the thing believed is situated. All sentences have a top level / true context, and other contexts are introduced by arcs linking them to skolem or contexts nodes in pre-existing contexts. All contexts should have a context_head arc that links the context to the main/head skolem defining the context. This relation is important for the proper treatment of negation. 

Property Graph: Some words (e.g. determiners) as well as morphological features (e.g. singular/plural, past/present tense) do not lead to the introduction of new skolem nodes. Instead they mark properties on existing skolems. The property relations are held in the property graph. Values of the properties are represented as value nodes and are typically atomic (string valued).

Lexical Graph: The lexical graph is used to records other lexical information associated with the skolems, typically derived from the semantic lexicon. Skolems in the role graph will have lexical edges linking them to zero or more sense nodes. The contents of the sense nodes include the sense identifier, and a list of concepts that the sense is attached to. (In the PARC incarnation of ECD the semantic lexicon was called the Unified Lexicon and it was homegrown. Other instantiations are possible.)

Link Graph: The link graph indicates identities between nodes, as induced by such things as (a) anaphora resolution, (b) copula, appositive and other constructions (Obama is the president; Obama, the president), and (c) lexical coreference links identifying lexical skolems with those introduced explicitly in the sentence (e.g. part of the lexical entry for "pilot" says that "pilots fly aircraft" -- the pilot mentioned in some sort of  "naive semantics" is lexically coreferent with the word that introduced it)

Theoretical Waffling for Semanticists

(This is a continuation of Graph Semantics and Examples, by Dick Crouch)

What is the difference between a graph-based semantic representation and a logical one? At one level, any logical formula can be represented in graphical form, so it may look as though the difference is just one of visualization preferences. But there is perhaps a deeper distinction. One of the startling innovations of Frege's rebooting of logic was to provide a notation where scope could be separated out from argument structure. Prior to this, the representation of "John loves Mary" was obvious:


but not so much so for "every man loves a woman"

love(every(man), a(woman))

Even if you can treat "every man" along the same lines as a name like John, how could you even begin to account for the scope ambiguity. Frege's trick was to separate out the introduction of the term from the point at which that term becomes an argument:

∀x. man(x) → ∃y. woman(y) & love(x,y)

However, it is not transparently obvious to untutored eyes which occurrences of the variables x and y serve to introduce and scope [man(x), woman(y)] and which serve to indicate argument structure [love(x,y)]. Both are woven into the same notation. The tripartite structures proposed by Barbara Partee make the distinction a little clearer, but at the cost of more complex and expressive notations.

The layered graphical representation is an attempt at a much cleaner separation of these concerns. Predicate-argument structure is represented as unadulterated predicate-argument structure, without recourse to variables. Scope and context is represented by a mapping onto a separate structure. Thus, amongst other things, it becomes trivially easy to extract just the predicate-argument structure from the representation. But there is still something of a promissory note here: to really make good on this claim, a way of representing quantifier scope by likening it to contextual scope is required. In practical terms this is perhaps not of the utmost priority, but theoretically it is.

Modality, Intensionality and Quantifiers
Most computer scientists coming to modal logic nowadays see it as a special case of first order logic (quantify over possible worlds). I am inclined to explore the opposite view, and see first order logic as a special case of modal logic, where scope-inducing quantifiers are both rare and act like modal operators. Here is an initial sketch of how this might work.

First, noun phrases refer to collections of objects. (Well, actually, following the Preventing Existence and Counting Contexts papers, noun phrases refer to concepts that have cardinality restrictions). For example, "three men" refers to a group object of men, with cardinality three; "every man" refers to a group object of men that includes all men in the domain of discourse; and "three boys ate five pizzas" means, under its most natural reading that there were three boys and there were five pizzas, and the pizzas got eaten by the boys. Under certain circumstances, though, the terms "three boys" or "five pizzas" can be interpreted distributively; e.g.. there was a collection of five boys, and each member of the collection ate three pizzas.

Distributive quantification is not yet implemented in the current system. But here are some hand-constructed examples of how this might work for "Every man loves a woman"

Distribution over "every man" introduces a new context, D(man), and the "a woman" can scope wide or narrow with respect to the distributive context.

 Now you may be excited or apprehensive about this double role played by variables in first-order logic. If you grew up with Martin-Loef's dependent quantifiers, you may find hard to think of the meaning where all the men in the domain love the same woman: the dependent quantifier "for each man, there's a woman" might seem far more reasonable. But of course you can still express both.

You can also think about the vast grey area between propositional logic and first-order logic, where you're allowed variables, but no quantifiers. Where you can have dependencies between these variables. or not.
Or the situations where you have variables and they are only existentially quantified; or universally so.
Where skolemization makes  sense or not. so many possibilities that only come to you when teaching "baby logic".

Monday, November 28, 2016

Semantic Graphs and Examples

You've got to admit it, graphs have an allure that terms do not have. From silly graphs like
to matters of life and death graphs like

people simply love the stuff. Terms, even the lambda-calculus ones, do not have such an appeal.

So it makes some sense to see if we can capitalize on graphs' affordances for Natural Language semantics, of the style we like.

This is a guest blog post by Dick Crouch. Nah, I lie. It's the mathematics of his stuff that I am trying to understand. There is a collection of slides in the archive of the Delphi-in meeting in Stanford, summer 2016 too (Standardizing Interface Representations for Downstream Tasks like Entailment or Reasoning), but the notes are older, from May 2015.

These are notes towards a proposal for a graphical semantic representation for natural language. Its main feature is that it layers a number sub-graphs over a basic sub-graph representing the predicate-argument structure of the representation. These sub-graphs include:

  • A context / scope sub-graph. This represents the structure of propositional contexts (approximately possible worlds) against which predicates and arguments are to be interpreted. This layer is used to handle boolean connectives like negation and disjunction, propositional attitude  and other clausal contexts (belief, knowledge, imperatives, questions, conditionals), and quantifier scope (under development). The predicate-argument and context graphs go hand in hand, and one cannot properly interpret a predicate-argument graph without its associated context graph.
  • A property sub-graph. This associates terms in the predicate-argument graph with lexical, morphological, and syntactic features (e.g. cardinality, tense and aspect morphology, specifiers)
  • A lexical sub-graph. This associates terms in the predicate-argument graph with lexical entries. There will can be more than one lexical sub-graph for each word, and it is populated by the  concepts, and semantic information obtainable from a knowledge base such as Princeton WordNet, for example.
  • A link sub-graph. This contains co-reference and discourse links between terms in the pred-arg graph. (It has also been used in entailment and contradiction detection to record term matches between premise and conclusion graphs)
  • Other sub-graphs are possible. A separate temporal sub-graph for spelling out the semantics of tense and aspect is under consideration.

This proposal has been partially implemented, and appears to have some practical utility. But theoretically it has not been fully fleshed out. These notes do not perform this fleshing out task, but just aim to describe some of the motivations and issues.

To give an initial idea of what these graphs are like, here are some  examples showing the basic predicate-argument and context structures for some simple sentences. The predicate-argument nodes are shown in blue, and the contexts in grey.

1. John did not sleep.

produces the graph above. All sentences are initially embedded under the true context (t) -- on the top right. However, the negation induces a new context embedded under t. In this negated context, an instance of the concept "sleeping by John" can be instantiated. But the effect of the "not" link between t and the embedded context means that this concept is held to be uninstantiable in t.

Every context will have a context-head (ctx_hd) link to a node in the predicate argument graph. The node in the predicate argument graph represents a lexical concept (possibly further restricted by syntactic arguments). The context head concept is always held to be instantiable in its corresponding context. But whether it continues to be instantiable in sub- or super-ordinate context depends on the kind of link between the contexts.

Not explicitly shown in this graph, but present in the actual graphs are further non-head links from each predicate-argument term to their introducing contexts.

If you want to relate this to Discourse Representation Structures, you can see the context labels as being the names of DRS boxes.

2. John believes that Mary does not like him.

This is a slightly more complex example where we can see that the word "believe" introduces an additional context for the complement clause "Mary does not like him". In the t context, there is a believing by John of something. What that something is is spelled out in the clausal context (ctx_5x1), which is a negation of the clausal context "Mary likes him". The example also show a co-reference link between the subject of believe and the object of like.

3. John or Mary slept.

This illustrates the treatment of disjunction. Like negation, disjunction is viewed as a context introducer (i.e. natural language disjunction is inherently modal / intensional, unlike disjunction in classical propositional or first-order logic). The way to read the graph is that there is some group object that is the subject of sleep. Both the group object and the sleeping by the group object are asserted to be instantiable in the top level context. The group object is further restricted by its membership properties: in one context John is an element of the object, and in another Mary is an element of the group object.

4. John loves Mary.

ok, I bet this one got you by surprise!

Just for the hell of it this time here is a fuller graph for a simpler sentence, showing the other lexical and property sub-graphs. The "lex" arcs point to possible word senses for the predicate-argument terms. Not shown in the diagram is that the labels on the sense nodes encode information about the  taxonomic concepts associated with the word senses. Likewise not illustrated in any of these graphs is the fact the predicate-argument node labels encode things like part of speech, stem and surface form, position in sentence, etc.
The way these graphs are  obtained is completely separable from, and less important than an abstract definition of semantic graph structures that allows one to specify how to process semantics in various ways (e.g. direct inference on graphs, conversion of graphs to representations suitable for theorem proving, etc.).

Maybe you think that the use of transfer semantics as above seems like  an overkill, at least for the purposes of providing inputs for natural language inference. The transfer semantic pipeline was originally set up to ease the conversion of linguistic semantic representations onto more canonical knowledge representations. As such, there is considerable emphasis on normalizing different semantic representations so that wherever possible the same content is represented in the same way: this simplifies conversion to KR. 

But  maybe there is no particular reason to do all this normalization on the inputs if all you wanted to do was inference. It might  be better to figure out a lighter-weight process for adding extra layers of semantic information direct to dependency structure produced by the parser. Much like many others are doing nowadays.

But the kinds of representations that make sense for inference, this is indeed something that it is worth thinking hard about.

Sunday, November 27, 2016

Goods and Services

This chart explains a lot. Toys really got much cheaper. And the absurd price of college fees and medical care is really true.
Shame it does not calculate (and I have never seen numbers for)  externalities. but then, how could we calculate them?

Tuesday, November 22, 2016

Why are Google and others making their AI tools open?

Roger K. Moore (husband's ex-boss) tweeted in May 2016:

Of course they're happy to release their tools, they're sitting on all the data needed to use them effectively

but of course this explains why they can do it, without bad consequences for themselves. 
It doesn't explain why they do it.
The obvious explanation is that perhaps some of the clever clogs around  will find newer means of making more money.

 "open source tools are arriving that can run on affordable hardware and allow individuals and small organizations to perform prodigious data crunching and predictive tasks" 

and separating fact from hype is every time harder.

Meanwhile it makes sense to pay attention to

How You Speak To Siri & Alexa Matters More Than You Think — Here's Why



Wednesday, November 9, 2016

Papers about Constructive Modal Logics

I am trying to organize my papers  by subject, so that I can have a clear view of stuff I still want to do. I got hold of all Google Scholar knows about my publications and made it into a bib file and was pleasantly surprised to  find 126 "things". I could be much worse, I say. 

For work in constructive modal logic, it all started when Gavin and I had a paper that he presented in Amsterdam, in a conference called "Logic at Work", I think. I believe this was in 1992. Because publications sometimes get completely out of whack, this only appeared in Studia Logica in 2000. For the time being, I  simply grab the list from Scholar, to mull it over.

1. Gavin M Bierman and V C V De Paiva. Intuitionistic necessity revisited. School of Computer Science Research Reports, University of Birmingham CSR, 1996. 
2. Neil Ghani, Valeria de Paiva, and Eike Ritter. Explicit substitutions for constructive necessity. In ICALP International Conference on Automata, Languages and Programming, 1998. 
3. P Nick Benton, Gavin M. Bierman, and Valeria CV de Paiva. Computational types from a logical perspective. Journal of Functional Programming, 8(02):177–193, 1998. 
4. Natasha Alechina, Valeria de Paiva, and Eike Ritter. Relating categorical and Kripke semantics for intuitionistic modal logics. In Proceedings of the Conference on Advances in Modal Logic (AIML’98), Uppsala, Sweden. CSLI, 1998.  (buggy, corrected in #7 below)
5. Gavin M. Bierman and Valeria C.V. de Paiva. On an intuitionistic modal logic. Studia Logica, 65(3):383–416, 2000. 
6. Gianluigi Bellin, Valeria de Paiva, and Eike Ritter. Extended Curry-Howard correspondence for a basic constructive modal logic. In Proceedings of Methods for Modalities, 2001. 
7. Natasha Alechina, Michael Mendler, Valeria de Paiva, and Eike Ritter. Categorical and Kripke semantics for constructive S4 modal logic. In  Computer Science Logic, pages 292–307. Springer Berlin Heidelberg, 2001. 
8. Valeria de Paiva. Natural deduction and context as (constructive) modality. In International and Interdisciplinary Conference on Modeling and Using Context, pages 116–129. Springer Berlin Heidelberg, 2003. 
9. Valeria de Paiva, Rajeev Gore, and Michael Mendler. Modalities in constructive logics and type theories. Journal of Logic and Computation, Special Issue, 2004. 
10. Michael Mendler and Valeria de Paiva. Constructive CK for contexts. In Proceedings of the first Workshop on Context Representation and Reasoning-CONTEXT05, Stanford, 2005. 
11. Valeria de Paiva. Constructive description logics: what, why and how. Context Representation and Reasoning, Riva del Garda, 2006. 
12. Valeria de Paiva and Eike Ritter. Basic constructive modality. Logic without frontiers: Festschrift for Walter Alexandre Carnielli on the occasion of his 60th birthday. College Publications, pages 411–428, 2011. 
13. Valeria de Paiva and Brigitte Pientka. Intuitionistic Modal Logic and Applications (IMLA 2008). Information and Computation, 209(12):1435–1436, 2011. 
14. Valeria de Paiva and Natasha Alechina. Contextual constructive description logics. ARCOE-11, page 16, 2011. 
15. Valeria de Paiva. Contexts for quantification. Proceedings of CommonSense, 2013. 
16. Valeria de Paiva, Mario Benevides, Vivek Nigam, and Elaine Pimentel. Proceedings of the 6th Workshop on Intuitionistic Modal Logic and Applications (IMLA 2013) in association with UNILOG 2013. Electronic Notes in Theoretical Computer Science, 300:1–104, 2014. 
17. Charles Stewart, Valeria de Paiva, and Natasha Alechina. Intuitionistic modal logic: A 15-year retrospective. Journal of Logic and Computation,  exv042, 2015. 

Remembering Grisha Mints

This is the text of a short eulogy that I delivered on the 2nd Sept 2014, in honour of Professor Grigori Mints. This was over skype for a session part of WOLLIC 2014 in Valparaiso, Chile.

Ruy de Queiroz, Grisha Mints and myself organized the volume from the WOLLIC meeting in 2006 at Stanford. We had many other projects that I would've liked to have  completed with Grisha, but that was not to be. I miss him lots!

(a somber mood today for many reasons).

Dear Colleagues,
It's with a heavy heart that I'm joining you over Skype to share one or two quick memories of Grisha Mints.
Grisha was a wonderful teacher and mentor, as I'm sure most of you know.

 Speaking in his seminar (the Logic Seminar in Stanford  has  always been his and Sol's seminar, for me at least), it was always kind of a bit nerve-wrecking as you had this feeling that things would always be checked, because what we do requires this kind of attention to detail in proofs, so you do need to know the details of what you were saying. so you're always (or at least I was) always  afraid you'd be found wanting in rigour or wanting on the quality of your responses or of your explanations,  but you knew you wouldn't be ignored... whether you're talking about baby logic or cutting edge stuff, rigor was and is necessary and expected.

Grisha's love was `tough love', as many of the tributes to him have shown. He was wonderful at asking people what they were doing and why they were doing it, but you had to be prepared to hear that it wasn't worth doing  and why so...

I always knew about this side of his personality, the strict Russian professor, who knew much and did not tolerate fools, as far the mathematics of what we do is concerned.  What I did not know much about, to begin with,  was this other social side of him as the friend, colleague and mentor always willing to give people a helping hand when he knew they needed it.. This is also coming up in all the tributes to Grisha, and I'm one of ones who can  say that when I needed help, (as the start-up I was working on disappeared as part of the big economic crises of 2008), he was there for me, offering me to lecture at Stanford, a sure way to make me more noticed and more employable.  Just the chat we had about it, was already a big boost for my morale, which was in a bad state.. Knowing that he thought I could teach one of his  classes in Stanford did help, as did the classes themselves as they were lots of  fun. For this, as well as everything else, I will always be grateful...
I am only sad that (when looking over all the gmail we exchanged in the last fifteen years) I saw lots of plans and good ideas, some purely mathematical ones, some more about getting other people interested in the subjects of Logic, Language, Computation that you guys are discussing in Valparaiso, that we didn't have the time to do together. I thought we had plenty of time, so many of the things that I would've done with someone else, I didn't do, because I thought they weren't good enough for Grisha. I hope to be able to do  some of this stuff in his honour, but I know that they will have to be  good, very good, as the master did not approve of mathematical frivolities. But this is another, a third important lesson I learned from him directly: do let your ideas fight with other ideas, have the courage of expressing yourself in the seminars and in the discussions. Our field is a difficult one, it's very mixed, you can always have the impression that you know much less than others in the room, but at least we can ask questions and all mathematical questions can be asked. if they are easy we answer quickly and move on, if they're too hard, we acknowledge it and move on to whatever we want to discuss today. Just ask the questions, he said.
Rest in peace Professor Mints, you are very much missed here!