Tuesday, October 19, 2021

Conversations with Sophie

One of the pleasures of going to Topos on Mondays is the conversations with Sophie during the car journey. Some have been quite revealing to me, as I do believe I think better when talking than when simply thinking. When talking I need to finish thoughts, complete sentences, all these boring trifles that actually help you make sense. I know, it sounds strange, but it rings true to me.

So I thought I'd record some of our conversations--or perhaps a cleaned-up version of them. This first one could be called "Lies Math Teachers Tell Us", as it's a list of things that are wrong, but that many people in maths departments (or coming out of them) believe to be true. The ones I'm about to discuss are mostly about Natural Language, which contrary to widespread belief (cf. Emily Bender's Rule) is *not* a synonym for English.

The list goes as follows, for the time being:

1. Natural language is not TOO HARD to be modelled mathematically. We have plenty of models, some better than others. Natural language semantics is all about this, and many researchers have spent decades producing a body of literature that shows the difficulties, but also the progress that has been made.

Of course there's no point in all mathematicians trying to do it. As with any other application of mathematics, some people like it, some don't. But saying that it cannot be done is plainly wrong.

2. Just because I say it's possible to do it doesn't mean that I think it's easy.

Ambiguity is not a bug of Natural Language, it's a feature. A feature that evolution has been working on for thousands of years. This is one of the features that makes the subject difficult to model, but it's also why it's so fascinating: we can do wonderful things with words. And all so easily that we take it for granted. Like breathing, we only notice we're doing it when the mechanism somehow has problems.

Finesse and sophistication are necessary. The exercises we give in logic books about the formalization of sentences won't cut it in real life. We don't communicate in sentences used to teach 5-year-old kids how to read (that is, unless this is what we're doing). We understand these sentences too, mostly, but the effort to create a formal model of the language needs to be in the direction of all kinds of sentences: from abstruse legal-contract constructions to slang on Twitter.
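To make the contrast concrete, here is the kind of textbook exercise I have in mind (my own toy example, not taken from any particular book). Even a short sentence like "Every student read a book" already admits two formalizations, depending on how the quantifiers scope over each other:

$$\forall x\,\big(\mathrm{student}(x) \rightarrow \exists y\,(\mathrm{book}(y) \wedge \mathrm{read}(x,y))\big)$$
$$\exists y\,\big(\mathrm{book}(y) \wedge \forall x\,(\mathrm{student}(x) \rightarrow \mathrm{read}(x,y))\big)$$

The first reading lets each student pick a possibly different book; the second says there is a single book they all read. The exercises stop roughly here; a legal contract or a Twitter thread does not.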

This has two different implications.

3. We need to deal with intensional phenomena. It's no good saying "I can ditch, say, attitude predicates, and only add them as a bonus feature later on". As we argued as a collective (rdc+) in "Entailment, Intensionality and Text Understanding", intensionality, which is widespread in natural language, raises a number of meaning-detection issues that cannot be brushed aside. I will not repeat the arguments here, as I think they are clearly expounded in the paper.
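For a flavour of the problem, here is the standard kind of example (mine, and not necessarily the one used in the paper). In a purely extensional logic, equals can be substituted for equals, so from

$$\mathrm{believes}(\mathit{lois}, \mathrm{flies}(\mathit{superman})) \quad\text{and}\quad \mathit{superman} = \mathit{clark}$$

we would be licensed to conclude

$$\mathrm{believes}(\mathit{lois}, \mathrm{flies}(\mathit{clark}))$$

which is exactly the inference we do not want to validate: Lois may well not believe that Clark Kent flies. Attitude predicates like "believes", "wants" or "doubts" block substitution inside their scope, and that is why they cannot simply be bolted on to an extensional model later.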


4. Coverage of different types of text is essential. Anyone can build a model that deals with only their ten favorite sentences. That is not what we're talking about when we talk about a model. Models need, in principle, to be able to deal with any sentence we throw at them.

Now, a controversial one.

5. Models need to be compositional. This will not bother my kind of mathematicians, as category theorists do believe that compositionality is key to all the modelling we do, but it will be controversial to some. So I will postpone this side of the conversation for a little while.
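For the record, the slogan I have in mind is the usual one (stated here in my own notation): the interpretation function is a homomorphism from syntax to semantics, so the meaning of a complex expression depends only on the meanings of its parts and on the way they are combined,

$$[\![\,\sigma(e_1,\ldots,e_n)\,]\!] = \sigma^{\mathcal{M}}\big([\![e_1]\!],\ldots,[\![e_n]\!]\big)$$

for each syntactic construction $\sigma$ and its semantic counterpart $\sigma^{\mathcal{M}}$. Category theorists like to read this as saying that interpretation is structure-preserving, a functor of sorts; whether natural language really obeys it, and in which form, is the controversial part I am postponing.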

 
