Tuesday, July 14, 2015

A very old post from May 2013: Learning to love stats

Last Friday I went to the Symposium described below at CSLI, Stanford. It was very interesting, but I haven't had a chance to digest the contents properly. Since there was no webpage and there are no slides/papers from the talks, I'm posting the program here as an 'aide-mémoire'.

The main reason why I'm interested is obvious: probabilistic vector space semantics makes a lot of sense as a substitute for what semanticists call the Prime Semantics of natural language (origin of the infamous joke: What's the meaning of life? life′, 'life prime'), but the probabilistic approach doesn't seem to scale so well as far as logical phenomena are concerned: antonyms seem to appear in similar contexts, not in opposite ones; a tiny word like "not" seems to have a huge effect on meaning; concepts can crisply imply others, whether or not their probabilities are similar; etc. One of the ideas I had during the meeting was that maybe what one needs is some sort of Glue Semantics logical system with two tiers: one where we do composition of meanings using, say, implicational linear logic, and one where we use vector spaces for the meanings of the constituents themselves, i.e. noun phrases, verb phrases and prepositional phrases.
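To make the two-tier idea slightly more concrete, here is a toy Python sketch of what I have in mind (entirely my own illustration, with made-up names and a placeholder composition operation, nothing presented at the symposium): a logical tier of implicational types decides which combinations are licensed, while a vector tier supplies the meanings that actually get combined.

```python
import numpy as np

# Toy two-tier sketch: a logical tier of implicational (linear) types that
# licenses composition, and a vector tier that carries distributional meanings.
# Names and the composition operation are illustrative placeholders.

class Type:
    """Atomic type (e.g. NP, S) or implication A -o B."""
    def __init__(self, name, arg=None, result=None):
        self.name, self.arg, self.result = name, arg, result

    @staticmethod
    def atom(name):
        return Type(name)

    @staticmethod
    def impl(arg, result):
        return Type(f"({arg.name} -o {result.name})", arg, result)

class Meaning:
    """A word or phrase: a logical type paired with a vector-space meaning."""
    def __init__(self, form, typ, vec):
        self.form, self.typ, self.vec = form, typ, vec

def apply(fun, arg):
    """Compose only when the logical tier licenses it: fun : A -o B, arg : A."""
    if fun.typ.arg is None or fun.typ.arg.name != arg.typ.name:
        raise TypeError(f"{fun.form} : {fun.typ.name} cannot take {arg.form} : {arg.typ.name}")
    # Vector tier: elementwise product plus renormalisation (just a placeholder).
    vec = fun.vec * arg.vec
    vec = vec / (np.linalg.norm(vec) + 1e-12)
    return Meaning(f"({fun.form} {arg.form})", fun.typ.result, vec)

# Tiny example: "dogs bark", with dogs : NP and bark : NP -o S.
NP, S = Type.atom("NP"), Type.atom("S")
dogs = Meaning("dogs", NP, np.array([0.7, 0.1, 0.2]))
bark = Meaning("bark", Type.impl(NP, S), np.array([0.3, 0.5, 0.2]))

sentence = apply(bark, dogs)
print(sentence.form, sentence.typ.name, sentence.vec)
```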


The 2012-2013 Cognition & Language Workshop is pleased to announce a
Symposium on Compositional Vector Space Semantics
featuring
   Stephen Clark, Chung-chieh Shan and Richard Socher

  The emerging field of compositional probabilistic vector space semantics for natural languages and other symbolic systems is being approached from multiple perspectives: language, cognition, and engineering. This symposium aims to promote fruitful discussions of interactions between approaches, with the goal of increasing collaboration and integration.

Schedule of Events:
   9:00 -  9:30    Light breakfast

   9:30 - 11:00    Chung-chieh Shan (Indiana)
                   From Language Models to Distributional Semantics
                   Discussant: Noah Goodman

  11:15 - 12:45    Richard Socher (Stanford)
                   Recursive Deep Learning for Modeling Semantic Compositionality
                   Discussant: Thomas Icard

  12:45 -  2:00    Lunch

   2:00 -  3:30    Stephen Clark (Cambridge)
                   A Mathematical Framework for a Compositional Distributional Model of Meaning
                   Discussant: Stanley Peters

   3:45 -  5:00    Breakout Groups and Discussion

   5:00 -          Snacks & Beverages


Chung-chieh Shan, Indiana University
Title: From Language Models to Distributional Semantics

Abstract: Distributional semantics represents what an expression means as a vector that summarizes the contexts where it occurs.  This approach has successfully extracted semantic relations such as similarity and entailment from large corpora.  However, it remains unclear how to take advantage of syntactic structure, pragmatic context, and multiple information sources to overcome data sparsity.  These issues also confront language models used for statistical parsing, machine translation, and text compression. 
Thus, we seek guidance by converting language models into distributional semantics.  We propose to convert any probability distribution over expressions into a denotational semantics in which each phrase denotes a distribution over contexts.  Exploratory data analysis led us to hypothesize that the more accurate the expression distribution is, the more accurate the distributional semantics tends to be.  We tested this hypothesis on two expression distributions that can be estimated using a tiny corpus: a bag-of-words model, and a lexicalized probabilistic context-free grammar a la Collins.
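Just to fix ideas about "each phrase denotes a distribution over contexts", here is a very schematic toy sketch (my own reading of the slogan, not the authors' actual construction): from a tiny corpus, the 'meaning' of a word is taken to be the conditional distribution over the other words that co-occur with it.

```python
from collections import Counter, defaultdict

# A very schematic reading (my assumption, not the authors' construction) of
# "each phrase denotes a distribution over contexts": estimate, from a tiny
# corpus, the conditional distribution over co-occurring words given a target.

corpus = [
    "dogs bark loudly",
    "cats meow loudly",
    "dogs chase cats",
]

cooc = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i, w in enumerate(words):
        for j, c in enumerate(words):
            if i != j:
                cooc[w][c] += 1  # every other word in the sentence counts as context

def context_distribution(word):
    """P(context | word): a toy 'meaning' as a distribution over contexts."""
    counts = cooc[word]
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}

print(context_distribution("dogs"))
print(context_distribution("cats"))
```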
  
Richard Socher, Stanford University
Title: Recursive Deep Learning for Modeling Semantic Compositionality
 
Abstract: Compositional and recursive structure is commonly found in different modalities, including natural language sentences and scene images. I will introduce several recursive deep learning models that, unlike standard deep learning methods, can learn compositional meaning vector representations for phrases, sentences and images. These recursive neural network-based models obtain state-of-the-art performance on a variety of syntactic and semantic language tasks such as parsing, paraphrase detection, relation classification and sentiment analysis.
   Besides the good performance, the models capture interesting phenomena in language such as compositionality. For instance, the models learn different types of high-level negation and how it can change the meaning of longer phrases with many positive words. They can learn that the sentiment following a "but" usually dominates that of phrases preceding the "but." Furthermore, unlike many other machine learning approaches that rely on human-designed feature sets, features are learned as part of the model.
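For readers unfamiliar with these models, the basic compositional step of a recursive neural network over a binary parse tree is the standard p = tanh(W[c1; c2] + b). The sketch below uses random, untrained weights purely to show the mechanics; it is not any of the specific models from the talk.

```python
import numpy as np

# Minimal sketch of recursive neural network composition over a binary parse
# tree (standard formulation; weights are random, not trained).

d = 4                                        # dimensionality of word/phrase vectors
rng = np.random.default_rng(0)
W = rng.standard_normal((d, 2 * d)) * 0.1    # composition matrix
b = np.zeros(d)

def compose(left, right):
    """Parent vector p = tanh(W [left; right] + b)."""
    return np.tanh(W @ np.concatenate([left, right]) + b)

# Toy word vectors for "not very good".
vecs = {w: rng.standard_normal(d) for w in ["not", "very", "good"]}

# Compose following the tree (not (very good)).
very_good = compose(vecs["very"], vecs["good"])
not_very_good = compose(vecs["not"], very_good)
print(not_very_good)
```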
 
Stephen Clark, University of Cambridge
Title: A Mathematical Framework for a Compositional Distributional Model of Meaning
  
Abstract: In this talk I will describe a mathematical framework for a unification of the distributional theory of meaning in terms of vector space models, and a compositional theory for grammatical types (based on categorial grammar). A key idea is that the meanings of functional words, such as verbs and adjectives, will be represented using tensors of various types. This mathematical framework enables us to compute the distributional meaning of a well-typed sentence from the distributional meanings of its constituents. Importantly, meanings of whole sentences live in a single space, independent of the grammatical structure of the sentence. Hence the inner-product can be used to compare meanings of arbitrary sentences, as it is for comparing the meanings of words in the distributional model.  
There are two key questions that the framework leaves open: 1) what are the basis vectors of the sentence space? and 2) how can the values in the tensors be acquired? I will sketch some of the ideas we have for how to answer these questions.
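A tiny numerical sketch of the kind of composition the framework uses (my own toy illustration with made-up numbers, not the actual learned tensors or spaces): nouns are vectors, an adjective is a matrix acting on noun vectors, a transitive verb is an order-3 tensor contracted with subject and object, and the resulting sentence vectors can be compared with an inner product.

```python
import numpy as np

# Toy illustration of tensor-based compositional distributional semantics:
# nouns are vectors, an adjective is a matrix, a transitive verb is an
# order-3 tensor.  Numbers are made up; only the shapes/operations matter.

n, s = 3, 2                          # noun-space and sentence-space dimensions
rng = np.random.default_rng(1)

dogs = rng.random(n)                 # noun vectors
cats = rng.random(n)

fluffy = rng.random((n, n))          # adjective: a map from noun vectors to noun vectors
chase = rng.random((s, n, n))        # transitive verb: subject x object -> sentence

fluffy_cats = fluffy @ cats          # adjective-noun composition

# "dogs chase fluffy cats": contract the verb tensor with subject and object.
sentence1 = np.einsum("sij,i,j->s", chase, dogs, fluffy_cats)
sentence2 = np.einsum("sij,i,j->s", chase, cats, dogs)   # "cats chase dogs"

# Both sentences live in the same sentence space, so an inner product
# (here, cosine similarity) compares them directly.
cos = sentence1 @ sentence2 / (np.linalg.norm(sentence1) * np.linalg.norm(sentence2))
print(cos)
```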

  

3 comments:

  1. Have you seen the work of Mehrnoosh Sadrzadeh? Roughly, her idea is to start with Lambek grammar, and then to use vector spaces as a concrete model for them. This gives a compositional way of extending bag-of-words semantics to include support for grammar and hence semantics of sentences.

    I'm only a bystander in the area, but I thought this was super clever!

    1. Oh, never mind -- looking more closely at the abstracts I see you must obviously already know about this.

    2. Thanks for the comment, Neel; indeed I do know about Mehrnoosh Sadrzadeh's work, but as usual I don't know enough about it. I need to try to make better bridges and to see if I can improve it in the direction I want, i.e. not compact closed, but simply *-autonomous.
